Open xd2333 opened 4 months ago
You're correct! It seems like `max_seq_length`'s default of 4096 is auto scaling TinyLlama, causing bad outputs - I'll fix this asap - thanks for the report!
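The mechanism described above can be sketched as follows. This is not unsloth's actual code, just an illustration of the assumption that when the requested `max_seq_length` exceeds a model's trained context window, a linear RoPE scaling factor greater than 1.0 gets applied, which a model not trained with scaling (like TinyLlama) handles poorly:

```python
def rope_scaling_factor(requested_seq_len: int, trained_seq_len: int) -> float:
    """Return the linear RoPE scaling factor implied by the requested
    context length; 1.0 means positions are left unscaled."""
    if requested_seq_len <= trained_seq_len:
        return 1.0
    return requested_seq_len / trained_seq_len

# TinyLlama's trained context window is 2048 tokens (per its model
# card), so a default max_seq_length of 4096 implies a 2x linear
# scale of the rotary position embeddings:
print(rope_scaling_factor(4096, 2048))  # -> 2.0
# Requesting a length within the trained window applies no scaling:
print(rope_scaling_factor(2048, 2048))  # -> 1.0
```

This is why explicitly passing a `max_seq_length` at or below the model's native context length (rather than relying on the 4096 default) avoids the degraded outputs.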
Hi unslothai, thanks for fixing that! tinyllama-chat seems better now, but I found Qwen1.5-7B-Chat still isn't working well.
Here is the case too: https://colab.research.google.com/drive/1dxGKB-c3U8BYX-m2rQie8R12--0-JQMs?usp=sharing#scrollTo=47OE5BgPB6Wm
Hi unslothai, I got different inference results when using unsloth. I've tested qwen1.5-chat and tinyllama-chat and hit the same issue: generation with unsloth always gives worse results compared with transformers, and I don't know why.
Here is my case: https://colab.research.google.com/drive/1dxGKB-c3U8BYX-m2rQie8R12--0-JQMs?usp=sharing