Support for long-text input after training

Reminder

[X] I have read the README and searched the existing issues.

System Info

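After fine-tuning qwen2-72b-instruct and quantizing the result to GPTQ 4-bit, deploying with the following command works without problems, and question answering is fine:

    CUDA_VISIBLE_DEVICES=0,1 API_PORT=7864 llamafactory-cli api \
        --model_name_or_path /workspace/chat-1.1 \
        --template qwen \
        --infer_backend vllm \
        --vllm_enforce_eager true

However, I would like to support long-text input, so I added the following configuration, following the official Qwen2 instructions: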