THU-ESIS / Chinese-Mistral

Chinese-Mistral: An Efficient and Effective Chinese Large Language Model
Apache License 2.0

Output length is shorter than the configured max_new_tokens #1

Open coderlxn opened 7 months ago

coderlxn commented 7 months ago

I tested with the example code and set max_new_tokens to 4096, but the output is far shorter than that: only 565 tokens.

THUchenzhou commented 6 months ago

> I tested with the example code and set max_new_tokens to 4096, but the output is far shorter than that: only 565 tokens.

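Please adjust the generation hyperparameters. The output is short because a stop (end-of-sequence) token is generated during inference, which ends generation before max_new_tokens is reached; max_new_tokens is only an upper bound. You can remove do_sample=True and change the call to: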

outputs_id = model.generate(inputs, max_new_tokens=4096)

Alternatively, experiment with other generation hyperparameters to find settings that suit your scenario.

Because the model's training corpus contains less Chinese text than English text, it is advisable to avoid setting a large max_new_tokens value in Chinese scenarios.
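
To confirm that a short output ended on a stop token rather than on the max_new_tokens limit, a small check along the following lines may help. This is a minimal sketch, not code from the repository: `summarize_generation` is a hypothetical helper, and it assumes a single, unpadded sequence from a decoder-only model whose output contains the prompt followed by the newly generated tokens.

```python
from typing import Sequence


def summarize_generation(
    output_ids: Sequence[int],
    prompt_len: int,
    eos_token_id: int,
    max_new_tokens: int,
) -> dict:
    """Report how many new tokens were produced and why generation stopped.

    Hypothetical helper for illustration; assumes one unpadded sequence.
    """
    # Drop the prompt so that only newly generated tokens are counted.
    new_tokens = list(output_ids[prompt_len:])
    hit_eos = len(new_tokens) > 0 and new_tokens[-1] == eos_token_id
    hit_limit = len(new_tokens) >= max_new_tokens
    if hit_eos:
        reason = "eos"
    elif hit_limit:
        reason = "max_new_tokens"
    else:
        reason = "other"
    return {"new_tokens": len(new_tokens), "stop_reason": reason}


# Toy example with made-up token ids; 2 stands in for the EOS id here.
print(summarize_generation([5, 6, 7, 8, 9, 2], prompt_len=3,
                           eos_token_id=2, max_new_tokens=4096))
```

With the call above, and assuming `inputs` is the input-id tensor passed to `model.generate` and `tokenizer` is the matching tokenizer, the check would be `summarize_generation(outputs_id[0].tolist(), inputs.shape[1], tokenizer.eos_token_id, 4096)`. A stop reason of "eos" means the model chose to end the answer itself, so raising max_new_tokens further would not lengthen the output.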