THUDM / ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM | 开源双语对话语言模型

[BUG/Help] 6b-chat base model output bug #647

Open HaomingX opened 7 months ago

HaomingX commented 7 months ago

Is there an existing issue for this?

Current Behavior

When running inference with the base model, it emits training data after answering any question (screenshot). At low temperature, after emitting that data it also keeps repeating certain sentences over and over (screenshot).
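The repetition at low temperature is consistent with how temperature scaling works: dividing the logits by a small temperature before the softmax collapses the distribution onto the top token, so sampling becomes nearly greedy, and greedy decoding on a base (non-chat-tuned) model is prone to loops. A minimal sketch of temperature-scaled sampling probabilities (plain Python, not ChatGLM code):

```python
import math

def temperature_softmax(logits, temperature):
    """Scale logits by 1/temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                      # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
p_warm = temperature_softmax(logits, 1.0)   # fairly spread across tokens
p_cold = temperature_softmax(logits, 0.1)   # nearly one-hot on the top token
```

With `temperature=0.1` as in the reproduction above, almost all probability mass lands on the argmax token, so `do_sample=True` behaves like greedy decoding.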

Expected Behavior

No response

Steps To Reproduce

  1. Follow the configuration from the Hugging Face model page.
  2. Imports and parameter setup:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True, device='cuda')

response, _history = model.chat(
    tokenizer, request, history=history,
    max_length=512, num_beams=1, do_sample=True, top_p=0.8, temperature=0.1,
)
```
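To confirm the repeating-sentence symptom programmatically rather than by eyeballing screenshots, a small helper (hypothetical, not part of the ChatGLM2-6B repo) can flag a response whose tail is the same sentence looped:

```python
import re

def repeated_tail_sentence(text, min_repeats=3):
    """Return the sentence that repeats at least `min_repeats` times in a row
    at the end of `text`, or None if the tail does not loop."""
    # Split on Chinese and English sentence enders, dropping empty pieces.
    sentences = [s.strip() for s in re.split(r"[。！？.!?]", text) if s.strip()]
    if len(sentences) < min_repeats:
        return None
    tail = sentences[-min_repeats:]
    return tail[0] if all(s == tail[0] for s in tail) else None

# Example: a looping response is flagged, a normal one is not.
looping = "答案是42。我不知道。我不知道。我不知道。"
normal = "答案是42。就这样。"
```

Running the helper over `model.chat` outputs at different temperatures would make the low-temperature repetition easy to measure.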

Environment

- OS: Ubuntu 20.04
- Python: 3.10
- Transformers: 4.33.2
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`): True

Anything else?

No response