OrionStarAI / Orion

Orion-14B is a family of models comprising a 14B-parameter multilingual foundation LLM and a series of derived models: a chat model, a long-context model, a quantized model, a RAG fine-tuned model, and an agent fine-tuned model.
Apache License 2.0

Orion-14B-Chat-Int4 chat error #29

Open ifromeast opened 7 months ago

ifromeast commented 7 months ago

Hi, when I run Orion-14B-Chat-Int4 with the following code on an A800-80G:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

model_name = "OrionStarAI/Orion-14B-Chat-Int4"
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype=torch.bfloat16,
                                             trust_remote_code=True, use_flash_attention_2=True)

model.generation_config = GenerationConfig.from_pretrained(model_name)

import time

query = '世界第二高峰是哪个'
messages = [{"role": "user", "content": query}]
response = model.chat(tokenizer, messages, streaming=False)
print(response)

I hit the following error:

RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
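For context on what this message means: it is raised by `torch.multinomial` during sampling when the probability tensor it is given contains `inf`, `nan`, or a negative entry — usually the downstream symptom of a dtype or weight-loading problem, not a bug in the sampling call itself. A minimal stdlib sketch of the condition it enforces (`probs_are_valid` is a hypothetical helper name, not part of torch):

```python
import math

def probs_are_valid(probs):
    # Hypothetical helper mirroring the validation torch.multinomial
    # performs on its input: every probability must be finite and
    # non-negative, otherwise it raises the RuntimeError quoted above.
    return all(math.isfinite(p) and p >= 0 for p in probs)

print(probs_are_valid([0.1, 0.7, 0.2]))      # True  - well-formed distribution
print(probs_are_valid([float("nan"), 0.5]))  # False - nan entry
print(probs_are_valid([float("inf"), 0.5]))  # False - inf entry
print(probs_are_valid([-0.1, 1.1]))          # False - negative entry
```

So the question is why the model's softmax output went non-finite in the first place.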

and the following is the environment

transformers              4.36.2                   pypi_0    pypi
torch                     2.1.2+cu118              pypi_0    pypi
flash-attn                2.5.0                    pypi_0    pypi
accelerate                0.26.1                   pypi_0    pypi

Is there anything wrong with my setup?

YIZXIY commented 7 months ago

I suspect that, like me, you noticed the Int4 version has no pytorch_model.bin.index.json and tried to hack in the pytorch_model.bin.index.json from Orion-14B-Chat, without success. I get the same error as you, but normally this should instead raise a "model not found" error.
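If the index file really was copied over from the full-precision repo, it will reference shard files that do not exist in the Int4 snapshot. A quick stdlib check of a downloaded snapshot directory can confirm this (the `find_weight_layout` helper is hypothetical, written for this thread):

```python
import json
import pathlib

def find_weight_layout(model_dir):
    # Hypothetical helper: report which weight layout a locally downloaded
    # checkpoint uses, and whether its index file matches the files on disk.
    d = pathlib.Path(model_dir)
    index = d / "pytorch_model.bin.index.json"
    if index.exists():
        # A sharded repo's index maps each parameter to a shard file; any
        # shard named there but absent on disk means the index is bogus.
        shards = set(json.loads(index.read_text())["weight_map"].values())
        missing = sorted(s for s in shards if not (d / s).exists())
        return {"layout": "sharded", "missing_shards": missing}
    # No index file: the repo ships standalone weight files (common for
    # quantized checkpoints) rather than shards.
    files = sorted(p.name for p in d.iterdir()
                   if p.suffix in {".bin", ".safetensors"})
    return {"layout": "single-file" if files else "unknown", "files": files}
```

A non-empty `missing_shards` list on the Int4 snapshot would confirm that a foreign index was dropped into the directory.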