$ python -m fastchat.serve.cli --model-path /opt/project/fast-chat/Yuan2-2B-Janus-hf
2024-03-01 23:28:08,304 - modelscope - INFO - PyTorch version 2.2.1 Found.
2024-03-01 23:28:08,305 - modelscope - INFO - Loading ast index from /home/maxoyed/.cache/modelscope/ast_indexer
2024-03-01 23:28:08,305 - modelscope - INFO - No valid ast index found from /home/maxoyed/.cache/modelscope/ast_indexer, generating ast index from prebuilt!
2024-03-01 23:28:08,339 - modelscope - INFO - Loading done! Current index file version is 1.12.0, with md5 30f9d6887bb264aa0df846abe2df639b and a total number of 964 components indexed
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
user: hello
assistant: /opt/project/fast-chat/.venv/lib/python3.9/site-packages/transformers/generation/configuration_utils.py:410: UserWarning: `do_sample` is set to `False`. However, `temperature` is set to `0.7` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.
warnings.warn(
/opt/project/fast-chat/.venv/lib/python3.9/site-packages/transformers/generation/configuration_utils.py:415: UserWarning: `do_sample` is set to `False`. However, `top_p` is set to `0.0` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.
warnings.warn(
/opt/project/fast-chat/.venv/lib/python3.9/site-packages/transformers/generation/configuration_utils.py:427: UserWarning: `do_sample` is set to `False`. However, `top_k` is set to `1` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_k`. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.
warnings.warn(
/opt/project/fast-chat/.venv/lib/python3.9/site-packages/transformers/generation/configuration_utils.py:410: UserWarning: `do_sample` is set to `False`. However, `temperature` is set to `0.7` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`.
warnings.warn(
/opt/project/fast-chat/.venv/lib/python3.9/site-packages/transformers/generation/configuration_utils.py:415: UserWarning: `do_sample` is set to `False`. However, `top_p` is set to `0.0` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`.
warnings.warn(
/opt/project/fast-chat/.venv/lib/python3.9/site-packages/transformers/generation/configuration_utils.py:427: UserWarning: `do_sample` is set to `False`. However, `top_k` is set to `1` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_k`.
warnings.warn(
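The UserWarnings above suggest the model directory's generation_config.json pairs `do_sample=False` with sampling-only parameters (`temperature=0.7`, `top_p=0.0`, `top_k=1`). As a hedged sketch (the config values here are taken from the warnings, not from the actual file), one way to make such a config self-consistent for greedy decoding is to drop the sampling knobs:

```python
import json

# Hypothetical config mirroring the values reported in the warnings above.
cfg = {"do_sample": False, "temperature": 0.7, "top_p": 0.0, "top_k": 1}

# With do_sample=False these sampling parameters are ignored anyway;
# removing them silences the UserWarnings without changing the
# greedy-decoding behaviour.
if not cfg.get("do_sample", False):
    for key in ("temperature", "top_p", "top_k"):
        cfg.pop(key, None)

print(json.dumps(cfg))  # {"do_sample": false}
```

These warnings are cosmetic; the crash below is the actual failure.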
Exception in thread Thread-1:
Traceback (most recent call last):
File "/home/maxoyed/.pyenv/versions/3.9.18/lib/python3.9/threading.py", line 980, in _bootstrap_inner
self.run()
File "/home/maxoyed/.pyenv/versions/3.9.18/lib/python3.9/threading.py", line 917, in run
self._target(*self._args, **self._kwargs)
File "/opt/project/fast-chat/.venv/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/opt/project/fast-chat/.venv/lib/python3.9/site-packages/transformers/generation/utils.py", line 1544, in generate
return self.greedy_search(
File "/opt/project/fast-chat/.venv/lib/python3.9/site-packages/transformers/generation/utils.py", line 2404, in greedy_search
outputs = self(
File "/opt/project/fast-chat/.venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/opt/project/fast-chat/.venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/maxoyed/.cache/huggingface/modules/transformers_modules/Yuan2-2B-Janus-hf/yuan_hf_model.py", line 936, in forward
outputs = self.model(
File "/opt/project/fast-chat/.venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/opt/project/fast-chat/.venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/maxoyed/.cache/huggingface/modules/transformers_modules/Yuan2-2B-Janus-hf/yuan_hf_model.py", line 766, in forward
layer_outputs = decoder_layer(
File "/opt/project/fast-chat/.venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/opt/project/fast-chat/.venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/maxoyed/.cache/huggingface/modules/transformers_modules/Yuan2-2B-Janus-hf/yuan_hf_model.py", line 427, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
File "/opt/project/fast-chat/.venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/opt/project/fast-chat/.venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/maxoyed/.cache/huggingface/modules/transformers_modules/Yuan2-2B-Janus-hf/yuan_hf_model.py", line 310, in forward
cos, sin = self.rotary_emb(value_states, seq_len=kv_seq_len)
File "/opt/project/fast-chat/.venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/opt/project/fast-chat/.venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/project/fast-chat/.venv/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
TypeError: forward() missing 1 required positional argument: 'position_ids'
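The traceback shows the call site in yuan_hf_model.py (line 310) invoking `self.rotary_emb(value_states, seq_len=kv_seq_len)` while the rotary embedding's `forward()` also requires a `position_ids` argument, i.e. the call site and the method signature are out of sync. A minimal, purely illustrative reproduction of this class of mismatch (not Yuan's actual code):

```python
# Illustrative only: a method whose signature gained a required
# `position_ids` parameter while an older call site still omits it.
class RotaryEmbedding:
    def forward(self, x, seq_len, position_ids):
        return seq_len, position_ids

emb = RotaryEmbedding()
try:
    emb.forward(object(), seq_len=8)  # old-style call: no position_ids
except TypeError as e:
    print(e)  # forward() missing 1 required positional argument: 'position_ids'
```

This points at a version mismatch between the model's custom modeling code and the callers (or between two revisions of yuan_hf_model.py), rather than at FastChat itself.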
Reproduction
Following the tutorial published on the official WeChat account, I deployed the model with FastChat; the model was downloaded from ModelScope via git: Yuan2.0-2B-Janus-hf
Official WeChat account article: 源2.0适配FastChat框架!企业快速本地化部署大模型对话平台 (Yuan 2.0 now supports the FastChat framework: rapid on-premises deployment of an LLM chat platform for enterprises)
Steps to reproduce
Error message
Version information