triton-inference-server / tensorrtllm_backend

The Triton TensorRT-LLM Backend
Apache License 2.0

ChatGLM2-6B tokenizer can't load? #197

Open THU-mjx opened 9 months ago

THU-mjx commented 9 months ago

When I built a ChatGLM2-6B engine and launched the Triton server, an error occurred: Tokenizer class ChatGLMTokenizer does not exist.


byshiue commented 9 months ago

Hope this issue is helpful https://github.com/THUDM/ChatGLM2-6B/issues/243.
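For context, the linked issue boils down to passing trust_remote_code=True wherever the tokenizer is loaded. A minimal sketch (the model path below is a placeholder for a local ChatGLM2-6B checkout):

```python
# Minimal sketch; the path is a placeholder for your local checkout.
# Without trust_remote_code=True, transformers refuses to load the
# custom ChatGLMTokenizer class bundled inside the model repository,
# which is exactly the "Tokenizer class ChatGLMTokenizer does not
# exist" error above.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "/models/chatglm2-6b",   # hypothetical tokenizer_dir
    trust_remote_code=True,  # allow the repo's custom tokenizer code to run
)
```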

THU-mjx commented 9 months ago

When I added trust_remote_code=True in preprocessing/1/model.py and postprocessing/1/model.py, another error occurred: Failed to initialize Python stub: AttributeError: can't set attribute 'pad_token'. I tried to fix it with the following method:

[screenshot of the attempted fix]

The server could be built, but it can't be used!


byshiue commented 9 months ago

Which branch do you use? If you are not on the latest main branch, please try it. Please remember to update tensorrt_llm too. Also, you need to rebuild tensorrt_llm and the engine after updating the code.

THU-mjx commented 9 months ago

Using the latest tensorrt_llm and tensorrtllm_backend, I still failed to launch the server with the error above.

THU-mjx commented 9 months ago

I fixed this issue: the latest version of tensorrtllm_backend requires setting max_batch_size manually. I have another question: how should I set these values?

byshiue commented 9 months ago

You can set it to 1 first.
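For reference, max_batch_size is a standard field in each Triton model's config.pbtxt (it appears in the preprocessing, tensorrt_llm, and postprocessing configs). A minimal sketch of the relevant line, using the value suggested above:

```
# Hypothetical excerpt from e.g. preprocessing/config.pbtxt.
# 1 is a conservative starting value; raise it once the server is up.
max_batch_size: 1
```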

THU-mjx commented 9 months ago

Another error occurred in preprocessing/postprocessing model.py (I tested several models; the errors are the same). In model.py, I only added trust_remote_code=True.

byshiue commented 9 months ago

Please refer to https://github.com/THUDM/ChatGLM3/issues/152.

A possible workaround is to avoid assigning the pad_token, and to replace all uses of pad_token with eos_token.
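A minimal sketch of that workaround, assuming the tokenizer directory is a placeholder and the surrounding Triton Python-stub code is unchanged:

```python
# Sketch of the workaround; ChatGLM's custom tokenizer exposes
# pad_token as a read-only property, so assigning to it raises
# "AttributeError: can't set attribute 'pad_token'".
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "/models/chatglm2-6b",   # hypothetical tokenizer_dir
    trust_remote_code=True,
)

# Instead of the failing assignment:
#   tokenizer.pad_token = tokenizer.eos_token
# use the EOS token id wherever model.py needs a pad id.
pad_id = tokenizer.eos_token_id
end_id = tokenizer.eos_token_id
```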