[Bug] Deploying GLM-4 models trained with LLaMA-Factory (e.g. GLM-4 LoRA-RM, GLM-4 freeze-KTO) fails with ValueError: Input None is not valid. Should be a string, a list/tuple of strings or a list/tuple of integers. #2369
Checklist
[x] 1. I have searched related issues but cannot get the expected help.
[x] 2. The bug has not been fixed in the latest version.
[x] 3. Please note that if the submitted issue lacks environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve it, reducing the likelihood of feedback.
Describe the bug
```
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/starlette/responses.py", line 261, in wrap
    await func()
  File "/opt/conda/lib/python3.10/site-packages/starlette/responses.py", line 250, in stream_response
    async for chunk in self.body_iterator:
  File "/opt/conda/lib/python3.10/site-packages/lmdeploy/serve/openai/api_server.py", line 503, in completion_stream_generator
    async for res in result_generator:
  File "/opt/conda/lib/python3.10/site-packages/lmdeploy/serve/async_engine.py", line 576, in generate
    prompt_input = await self._get_prompt_input(prompt,
  File "/opt/conda/lib/python3.10/site-packages/lmdeploy/serve/async_engine.py", line 525, in _get_prompt_input
    input_ids = self.tokenizer.encode(prompt, add_bos=sequence_start)
  File "/opt/conda/lib/python3.10/site-packages/lmdeploy/tokenizer.py", line 600, in encode
    return self.model.encode(s, add_bos, add_special_tokens, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/lmdeploy/tokenizer.py", line 531, in encode
    return super(ChatGLM4Tokenizer, self).encode(s,
  File "/opt/conda/lib/python3.10/site-packages/lmdeploy/tokenizer.py", line 366, in encode
    encoded = self.model.encode(s,
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2629, in encode
    encoded_inputs = self.encode_plus(
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3037, in encode_plus
    return self._encode_plus(
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 719, in _encode_plus
    first_ids = get_input_ids(text)
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 705, in get_input_ids
    raise ValueError(
ValueError: Input None is not valid. Should be a string, a list/tuple of strings or a list/tuple of integers.
```
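For context on where the exception comes from: `tokenizer.encode` eventually forwards the prompt to transformers' `get_input_ids`, which rejects anything that is not a string or a list/tuple of strings/ints. The sketch below is a simplified stand-in for that check (the function name `get_input_ids_check` is mine, not the library's); it shows that a `None` prompt — i.e. lmdeploy's template step handing `encode` nothing — raises exactly the error in the traceback above:

```python
def get_input_ids_check(text):
    # Simplified version of the type check performed in
    # transformers/tokenization_utils.py (get_input_ids); the real
    # function tokenizes valid inputs instead of returning them.
    if isinstance(text, str):
        return text
    if isinstance(text, (list, tuple)) and all(
        isinstance(t, (str, int)) for t in text
    ):
        return list(text)
    raise ValueError(
        f"Input {text} is not valid. Should be a string, a list/tuple "
        "of strings or a list/tuple of integers."
    )


# A None prompt triggers the same ValueError seen when serving the model:
try:
    get_input_ids_check(None)
except ValueError as err:
    print(err)
```

This suggests the prompt is already `None` before tokenization, i.e. the failure happens upstream in lmdeploy's prompt/chat-template handling for this exported model rather than in the tokenizer itself.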
Reproduction
```shell
CUDA_VISIBLE_DEVICES=1 lmdeploy serve api_server \
    saves/glm4/lora/train_2024-06-27-13-34-50-rm22/export_model/ \
    --server-port 23333 --max-batch-size 128 --tp 1 --model-name xxxxx
```
Environment
Error traceback
No response