Closed TJ-Ouyang closed 1 month ago
Hi @TJ-Ouyang, that is strange: the number of output tokens is much smaller than the max_tokens you set. Could you please share the content of your test.py so we can help with debugging?
We solved the problem by setting `kwargs_default = dict(do_sample=False, max_new_tokens=512, top_p=None, num_beams=1)` before the call to self.model.chat at line 359 of internvl_chat.py. Using ipdb, we found that the process never entered the chat_inner function, where max_new_tokens was previously set.
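A minimal sketch of the workaround described above. The `kwargs_default` dict is taken verbatim from the comment; the `chat` function below is a hypothetical stand-in for `self.model.chat` (the real InternVL2-40B model is not loaded here), used only to show how the overridden kwargs cap the generation length instead of the model's small default of 20 new tokens:

```python
# Override the generation defaults before the chat call, as described above.
kwargs_default = dict(do_sample=False, max_new_tokens=512, top_p=None, num_beams=1)

# Hypothetical stand-in for self.model.chat: a real model would generate up to
# generation_config["max_new_tokens"] tokens; here we just report which cap applies.
def chat(prompt, generation_config):
    cap = generation_config.get("max_new_tokens", 20)  # 20 was the tiny default
    return f"(would generate up to {cap} new tokens)"

print(chat("Describe the image.", kwargs_default))
```

With the override in place, generation is capped at 512 new tokens rather than 20, which is why the truncation disappears.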
@TJ-Ouyang, I have figured out the cause of this problem: the default max_new_tokens for InternVL2-40B is just 20. I have changed the default kwargs value so that other users will not run into this problem again.
When I use InternVL2-40B for single-image inference on my own data, the output is truncated.