The dependencies have been updated, but the same error is still reported.
bitsandbytes 0.42.0
peft 0.7.1
sentencepiece 0.1.99
torch 2.1.2
torchvision 0.16.2
transformers 4.36.2
chinese-alpaca-2-7b-64k has to be loaded through the modeling_llama_yarn.py file in its folder. That file is custom-written model code, so vLLM probably does not support loading it.
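A minimal sketch of loading the checkpoint with plain transformers instead of vLLM, assuming the checkpoint's config.json registers the custom class via auto_map so that trust_remote_code=True picks up modeling_llama_yarn.py (this sketch is an illustration, not part of the original reply):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "model/chinese-alpaca-2-7b-64k"  # path used elsewhere in this issue
tokenizer = AutoTokenizer.from_pretrained(model_path)
# trust_remote_code=True lets transformers import the custom
# modeling_llama_yarn.py that ships next to the weights.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    trust_remote_code=True,
    torch_dtype="auto",
)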
Which folder do you mean? I could not find modeling_llama_yarn.py in the project files. Thanks a lot!
It is in the same directory as the model weights, not in the project files.
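For orientation, a hypothetical layout of such a weights directory (every file name except modeling_llama_yarn.py is an assumption):

model/chinese-alpaca-2-7b-64k/
    config.json             # declares the "yarn" rope_scaling seen in the error
    modeling_llama_yarn.py  # the custom model code mentioned above
    tokenizer.model
    model weight shards (*.bin or *.safetensors)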
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your consideration.
Closing the issue, since no updates observed. Feel free to re-open if you need any further assistance.
The following items must be checked before submitting
Issue type
Model quantization and deployment
Base model
Others
Operating system
Linux
Detailed description of the problem
python scripts/inference/inference_hf.py --base_model model/chinese-alpaca-2-7b-64k --with_prompt --interactive --use_vllm
Dependencies (required for code-related issues)
bitsandbytes 0.41.1
peft 0.3.0
sentencepiece 0.1.99
torch 2.1.2
torchvision 0.16.2
transformers 4.36.2
Runtime logs or screenshots
USE_XFORMERS_ATTENTION: True
STORE_KV_BEFORE_ROPE: False
Traceback (most recent call last):
  File "/hy-tmp/Aplaca2/Chinese-LLaMA-Alpaca-2-main/scripts/inference/inference_hf.py", line 129, in <module>
    model = LLM(model=args.base_model,
  File "/usr/local/miniconda3/lib/python3.10/site-packages/vllm/entrypoints/llm.py", line 105, in __init__
    self.llm_engine = LLMEngine.from_engine_args(engine_args)
  File "/usr/local/miniconda3/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 304, in from_engine_args
    engine_configs = engine_args.create_engine_configs()
  File "/usr/local/miniconda3/lib/python3.10/site-packages/vllm/engine/arg_utils.py", line 218, in create_engine_configs
    model_config = ModelConfig(self.model, self.tokenizer,
  File "/usr/local/miniconda3/lib/python3.10/site-packages/vllm/config.py", line 101, in __init__
    self.hf_config = get_config(self.model, trust_remote_code, revision)
  File "/usr/local/miniconda3/lib/python3.10/site-packages/vllm/transformers_utils/config.py", line 35, in get_config
    raise e
  File "/usr/local/miniconda3/lib/python3.10/site-packages/vllm/transformers_utils/config.py", line 23, in get_config
    config = AutoConfig.from_pretrained(
  File "/usr/local/miniconda3/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1099, in from_pretrained
    return config_class.from_dict(config_dict, **unused_kwargs)
  File "/usr/local/miniconda3/lib/python3.10/site-packages/transformers/configuration_utils.py", line 774, in from_dict
    config = cls(**config_dict)
  File "/usr/local/miniconda3/lib/python3.10/site-packages/transformers/models/llama/configuration_llama.py", line 160, in __init__
    self._rope_scaling_validation()
  File "/usr/local/miniconda3/lib/python3.10/site-packages/transformers/models/llama/configuration_llama.py", line 180, in _rope_scaling_validation
    raise ValueError(
ValueError: `rope_scaling` must be a dictionary with with two fields, `type` and `factor`, got {'factor': 16.0, 'finetuned': True, 'original_max_position_embeddings': 4096, 'type': 'yarn'}
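The failure can be reproduced without vLLM at all; a minimal sketch, assuming transformers 4.36.x and using the rope_scaling dict copied from the error message:

from transformers import LlamaConfig

# LlamaConfig's _rope_scaling_validation only accepts a two-field dict
# {"type": ..., "factor": ...}; the extra YaRN fields fail the length check.
LlamaConfig(rope_scaling={
    "factor": 16.0,
    "finetuned": True,
    "original_max_position_embeddings": 4096,
    "type": "yarn",
})  # raises the ValueError shown in the log above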