X-D-Lab / Sunsimiao

🌿 Sunsimiao Chinese medical LLM: a safe, reliable, and accessible Chinese medical large language model
Apache License 2.0

Inference error with model X-D-Lab/Sunsimiao-Qwen2-7B #19

Closed yaleimeng closed 2 months ago

yaleimeng commented 2 months ago

The project's example code calls AI-ModelScope/Sunsimiao, which should be different from the Qwen2 fine-tuned model X-D-Lab/Sunsimiao-Qwen2-7B, right? The weights in the two repos are completely different. The example code does run correctly after downgrading transformers. I don't know how you are using the new model; after changing the model name in the example code to the one above, the full output is:

2024-09-03 17:03:59,652 - modelscope - WARNING - Authentication has expired, please re-login with modelscope login --token "YOUR_SDK_TOKEN" if you need to access private models or datasets.
[download progress bars for the config, tokenizer files, and model shards omitted; all downloads completed at 100%]
Traceback (most recent call last):
  File "/home/it/2024--09/sunsimiao.py", line 4, in <module>
    pipe = pipeline(task=Tasks.text_generation, model='X-D-Lab/Sunsimiao-Qwen2-7B')
  File "/home/it/.local/lib/python3.10/site-packages/modelscope/pipelines/builder.py", line 142, in pipeline
    check_config(cfg)
  File "/home/it/.local/lib/python3.10/site-packages/modelscope/utils/config.py", line 671, in check_config
    check_attr(ConfigFields.pipeline)
  File "/home/it/.local/lib/python3.10/site-packages/modelscope/utils/config.py", line 666, in check_attr
    assert hasattr(cfg, attr_name), f'Attribute {attr_name} is missing from ' \
AssertionError: Attribute pipeline is missing from configuration.json.
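The assertion comes from modelscope's configuration check: pipeline() requires a "pipeline" field in the model repo's configuration.json, which the Qwen2-based checkpoint apparently does not ship. A minimal sketch of that check (Config and check_attr below are simplified stand-ins mirroring modelscope/utils/config.py, not the library's actual classes):

```python
class Config:
    """Hypothetical stand-in for modelscope's parsed configuration.json."""
    def __init__(self, **fields):
        self.__dict__.update(fields)

def check_attr(cfg, attr_name):
    # Simplified version of the assertion in the traceback above.
    assert hasattr(cfg, attr_name), (
        f'Attribute {attr_name} is missing from configuration.json.')

# A config with a "pipeline" section passes the check:
check_attr(Config(task='text-generation',
                  pipeline={'type': 'text-generation-pipeline'}), 'pipeline')

# A bare Qwen2-style config (no "pipeline" field) fails with the
# AssertionError reported in this issue:
try:
    check_attr(Config(task='text-generation'), 'pipeline')
except AssertionError as e:
    print(e)  # Attribute pipeline is missing from configuration.json.
```

This is why loading the model through modelscope's AutoModelForCausalLM (as in the code below) works while the task pipeline does not: the auto classes read config.json directly and never run this check.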

jingnant commented 2 months ago

The project's example code calls AI-ModelScope/Sunsimiao, not the latest Sunsimiao-7B built on the Qwen2 base. If you want to try it, please download the weights from the link provided in the README; an online demo of Sunsimiao-7B will also be made available soon.

thomas-yanxin commented 2 months ago

You can use this code for X-D-Lab/Sunsimiao-Qwen2-7B:

from modelscope import AutoModelForCausalLM, AutoTokenizer
device = "cuda" # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained(
    "X-D-Lab/Sunsimiao-Qwen2-7B",
    torch_dtype="auto",
    device_map="auto"
)
# load the tokenizer from a base Qwen2 checkpoint (Qwen2 models share the same tokenizer)
tokenizer = AutoTokenizer.from_pretrained("qwen/Qwen2-1.5B-Instruct")

prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512
)
# strip the echoed prompt tokens, keeping only the newly generated ids
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
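The slicing step above is needed because generate() returns the prompt tokens followed by the new tokens; dropping the first len(input_ids) entries keeps only the model's reply. The same idea on plain lists (illustrative values, not real token ids):

```python
# generate() echoes the prompt ids before the generated ids;
# slicing by the prompt length isolates the reply.
input_ids = [101, 2054, 2003]              # illustrative prompt token ids
output_ids = [101, 2054, 2003, 7592, 999]  # prompt echo + generated ids

new_ids = output_ids[len(input_ids):]
print(new_ids)  # [7592, 999]
```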

yaleimeng commented 2 months ago

Thanks. This code runs correctly.