Open HuuHuy227 opened 4 months ago
Are you using the Transformers engine?
Yes
I can reproduce this, but I don't know what went wrong with the Transformers engine. For now, other formats should work well.
The default generate() function of transformers worked well.
This issue is stale because it has been open for 7 days with no activity.
Describe the bug
After launching the model, the response repeats until the max token limit is reached. Example: when I ask 'hello', it responds 'HelloHowToToToToToToToToToToToToToToToToToToToToToToToToToToToToToToToToToToToToToTo...' until the max token limit.
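For anyone trying to confirm the same symptom on their own outputs, here is a minimal sketch of a check for this kind of trailing repetition. It is plain Python and does not touch the inference engine or transformers. The helper names (`trailing_repeat_count`, `looks_degenerate`) and the thresholds are hypothetical, chosen only for illustration.

```python
from typing import Tuple


def trailing_repeat_count(text: str, max_unit_len: int = 8) -> Tuple[str, int]:
    """Return (unit, count) for the short unit repeated most often at the end of text.

    Hypothetical helper: tries every unit length from 1 to max_unit_len and
    counts how many times that unit repeats back-to-back at the tail.
    """
    best_unit, best_count = "", 0
    for size in range(1, min(max_unit_len, len(text)) + 1):
        unit = text[-size:]
        count = 0
        pos = len(text)
        # Walk backwards one unit at a time while the tail keeps matching.
        while pos >= size and text[pos - size:pos] == unit:
            count += 1
            pos -= size
        if count > best_count:
            best_unit, best_count = unit, count
    return best_unit, best_count


def looks_degenerate(text: str, min_repeats: int = 5) -> bool:
    """Flag output whose tail is one short unit repeated min_repeats times or more."""
    _, count = trailing_repeat_count(text)
    return count >= min_repeats


if __name__ == "__main__":
    sample = "HelloHow" + "To" * 11  # shortened version of the reported output
    print(trailing_repeat_count(sample))
    print(looks_degenerate(sample))
```

This only helps flag affected responses when comparing engines or model formats; it does not explain or fix the underlying problem.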