Open bexxnaz opened 3 months ago
Good question! Actually, both ways work. We include the ###Human marker because, in an early version of Vicuna, we found that it uses ###Human. However, we have also tried removing it, and that works too.
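For concreteness, here is a minimal sketch of the two prompt variants being discussed. The separator strings are assumptions based on the Vicuna v0-style template, not the exact code in this repo:

```python
# Hypothetical sketch of the two prompt variants (not the actual repo code).
# The "###Human:" / "###Assistant:" separators follow the Vicuna v0-style
# template mentioned above; whether to keep the ###Human marker is the
# choice the reply says works either way.
def make_prompt(question: str, with_human_marker: bool = True) -> str:
    if with_human_marker:
        return f"###Human: {question}\n###Assistant: "
    # Variant with the marker removed, which reportedly also trains fine.
    return f"{question}\n###Assistant: "

print(make_prompt("What is in the video?"))
print(make_prompt("What is in the video?", with_human_marker=False))
```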
Thanks a lot for your reply! I am working on a project where I intend to replace the decoder part of the model with the mT0-xl (multilingual) model. However, I have some concerns about the usage of the (###Human, ###Assistant) prefixes and their compatibility with this change. If you can help, I would greatly appreciate it.
I do not have much experience with this, which is why I follow the common design of LLM training. So I suggest you conduct some ablations, or check how other MLLMs handle it ~
Hello! First of all, thank you for your great work on the videochat2 model.
I have a question about the training part in stage 3, particularly line 274 of the
videochat2_it.py
file. In that line, it seems that the final target includes the "###Human: " prefix. I'm wondering why this prefix is included in the final labels (targets); I assumed that only the sentences following "###Assistant: " should be supervised. Currently, the labels sequence looks like this: [-100, ..., -100, 835, 29950, 7889, 29901, -100, ..., -100, 835, 7900, ...]
Could you please provide some clarification on this matter? I would greatly appreciate it.
Thank you!
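To illustrate the masking pattern described in the question above, here is a hypothetical sketch (not the actual videochat2_it.py code): positions set to -100 are ignored by the cross-entropy loss, the "###Human" marker tokens stay unmasked in the labels, the question tokens are masked, and everything from "###Assistant" onward is supervised. The token ids and span boundaries are toy values chosen to mirror the sequence shown above:

```python
# Hypothetical sketch of the label-masking scheme under discussion.
# -100 is the conventional ignore index for PyTorch's CrossEntropyLoss.
IGNORE_INDEX = -100

def build_labels(input_ids, spans):
    """spans: list of (start, end, keep) over input_ids.
    Positions inside a span with keep=False are set to IGNORE_INDEX."""
    labels = list(input_ids)
    for start, end, keep in spans:
        if not keep:
            for i in range(start, end):
                labels[i] = IGNORE_INDEX
    return labels

# Toy ids: [system][###Human: marker][question][###Assistant: marker + answer]
input_ids = [1, 2, 835, 29950, 7889, 29901, 101, 102, 835, 7900, 201, 202]
spans = [
    (0, 2, False),   # system/visual prefix: masked
    (2, 6, True),    # "###Human:" marker tokens: kept in the labels
    (6, 8, False),   # the question itself: masked
    (8, 12, True),   # "###Assistant:" marker and answer: supervised
]
print(build_labels(input_ids, spans))
# [-100, -100, 835, 29950, 7889, 29901, -100, -100, 835, 7900, 201, 202]
```

This reproduces the shape of the labels sequence quoted in the question, where the 835, 29950, 7889, 29901 ("###Human:") tokens are unmasked before the "###Assistant" span.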