Traceback (most recent call last):
  File "/home/lili/project/chatglm_cpp/examples/yhy_test.py", line 81, in <module>
    main()
  File "/home/lili/project/chatglm_cpp/examples/yhy_test.py", line 62, in main
    for chunk in pipeline.chat(messages, **generation_kwargs):
  File "/root/anaconda3/envs/glm4-test/lib/python3.10/site-packages/chatglm_cpp/__init__.py", line 127, in _stream_chat
    for next_token_id in self._stream_generate_ids(input_ids=input_ids, gen_config=gen_config):
  File "/root/anaconda3/envs/glm4-test/lib/python3.10/site-packages/chatglm_cpp/__init__.py", line 115, in _stream_generate_ids
    next_token_id = self.model.generate_next_token(input_ids, gen_config, n_past, n_ctx)
RuntimeError: /tmp/pip-install-088ypyu1/chatglm-cpp_5f184e002ea14195945da430b48f59b5/chatglm.cpp:767 check failed (std::isfinite(next_token_logits[i])) nan/inf encountered at lm_logits[0]
OS: Ubuntu 20.04; CUDA: 11.8 + torch 2.3; transformers: 4.41.0; model: glm4-9b-chat-1m, int8 quantized
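I have a very long piece of text and found that the error is raised with inputs of 1,000 characters and of 12,000 characters. I expect many other inputs will fail the same way. Why does this happen?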
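To help narrow this down, here is a minimal sketch for checking which input lengths trigger the error. It is not part of chatglm_cpp: `failing_lengths` and the `run` callable are hypothetical names, and `run` is assumed to be any function that sends a prompt to the model and raises `RuntimeError` on failure, as in the traceback. It scans every length rather than bisecting, because nothing in the report says the failures are monotonic in length.

```python
from typing import Callable, Iterable, List


def failing_lengths(
    run: Callable[[str], object],
    text: str,
    lengths: Iterable[int],
) -> List[int]:
    """Return the prefix lengths of `text` for which `run` raises RuntimeError.

    `run` is a hypothetical wrapper around the model call; it is an
    assumption of this sketch, not a chatglm_cpp function.
    """
    failed = []
    for n in lengths:
        try:
            # Feed only the first n characters of the long text.
            run(text[:n])
        except RuntimeError:
            # Record the length and keep going, so all failing sizes are found.
            failed.append(n)
    return failed


if __name__ == "__main__":
    # Stand-in for the real model call, used only to show the helper running.
    def fake_run(prompt: str) -> None:
        if len(prompt) >= 1000:
            raise RuntimeError("nan/inf encountered at lm_logits[0]")

    print(failing_lengths(fake_run, "x" * 2000, [250, 500, 1000, 2000]))
```

In the real script, `run` could wrap the `pipeline.chat(messages, **generation_kwargs)` loop from `yhy_test.py` and consume the whole stream, so the resulting list shows whether the failure depends on length or on specific content.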