I have noticed that when running inference with mPLUG-Owl at a batch size greater than 1, the quality of the generated text is significantly worse than with a batch size of 1. I have reviewed the code thoroughly but could not find anything that would explain this behavior.
Could you shed some light on why generation quality differs between batch sizes in mPLUG-Owl's inference? Any suggestions or explanations would be greatly appreciated.
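For anyone hitting the same symptom: a frequent cause of quality degradation that appears only at batch size > 1 with decoder-only language models is right-padding. When variable-length prompts are padded on the right, the shorter prompts end in pad tokens, and generation then conditions on (and appends after) those pads. The sketch below is a hypothetical illustration of the padding issue in general, not mPLUG-Owl's actual code; `pad_batch`, `PAD`, and the example token IDs are all made up for demonstration.

```python
# Hypothetical sketch of why right-padding breaks batched generation for
# decoder-only LMs. A model predicts the next token from the final position
# of each sequence, so that position must hold a real token, not padding.
PAD = 0

def pad_batch(seqs, side):
    """Pad variable-length token-ID sequences to equal length on one side."""
    max_len = max(len(s) for s in seqs)
    batch, mask = [], []
    for s in seqs:
        pads = [PAD] * (max_len - len(s))
        if side == "right":
            batch.append(s + pads)
            mask.append([1] * len(s) + [0] * len(pads))
        else:  # left padding keeps the real tokens at the end
            batch.append(pads + s)
            mask.append([0] * len(pads) + [1] * len(s))
    return batch, mask

seqs = [[5, 6, 7, 8], [9, 10]]  # two prompts of different lengths

right, _ = pad_batch(seqs, "right")
left, _ = pad_batch(seqs, "left")

# With right padding, the shorter prompt's final position is a PAD token,
# which is what next-token prediction would condition on.
assert right[1][-1] == PAD
# With left padding, every sequence ends in a real token.
assert all(row[-1] != PAD for row in left)
```

If mPLUG-Owl's tokenizer is configured with `padding_side="right"` (the usual default for training), switching it to `"left"` for generation, and making sure the attention mask is passed through, is worth trying. A batch size of 1 needs no padding at all, which would explain why the problem disappears there.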