NasonZ opened this issue 12 months ago
Downgrading transformers fixed this for me: pip install --upgrade transformers==4.33
Just for any future folks stumbling upon this issue (as I did):
pip install --upgrade transformers==4.33
works as expected, but it obviously pins you to an outdated transformers release.
I ended up upgrading transformers to the latest version and using the ctransformers package for the AutoModelForCausalLM, while keeping the transformers package for the AutoTokenizer.
Although this works, the following error pops up when I run it:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper_CUDA_gather)
Here is my code:
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    model_file="tinyllama-1.1b-chat-v1.0.Q5_K_M.gguf",
    model_type="llama",
    hf=True,
)
tokenizer = AutoTokenizer.from_pretrained(model)
terminators = [tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|assistant|>")]

text_generation_pipeline = pipeline(
    task="text-generation",
    model=model,
    tokenizer=tokenizer,
    device="cuda",
    temperature=0.2,
    do_sample=True,
    repetition_penalty=1.1,
    return_full_text=False,
    max_new_tokens=256,
    eos_token_id=terminators,
)

llm = HuggingFacePipeline(pipeline=text_generation_pipeline)
llm_chain = LLMChain(llm=llm, prompt=chat_prompt, llm_kwargs={"device": "cuda"})
llm_chain.invoke(
    input={"user_input": user_input, "history": memory.chat_memory.messages},
    stop=["<|user|>", "Human:"],
)["text"]
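One hedged guess at the cause: with hf=True, ctransformers runs the GGUF weights on the CPU unless layers are explicitly offloaded, while device="cuda" tells the transformers pipeline to move the input tensors to the GPU, so the gather op sees tensors on both cpu and cuda:0. A minimal sketch of two possible workarounds, assuming the same model file as above (the repo id and gpu_layers value are placeholders I picked, not from the original snippet):

```python
# Sketch only, untested here: two ways to keep all tensors on one device.
from ctransformers import AutoModelForCausalLM
from transformers import AutoTokenizer, pipeline

model_name = "TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF"  # placeholder repo id

# Option 1: keep everything on CPU by dropping device="cuda" from the pipeline,
# so the inputs stay on the same (CPU) device as the ctransformers weights.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    model_file="tinyllama-1.1b-chat-v1.0.Q5_K_M.gguf",
    model_type="llama",
    hf=True,
)
tokenizer = AutoTokenizer.from_pretrained(model)
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer)  # no device arg

# Option 2: offload the GGUF layers to the GPU via ctransformers' gpu_layers,
# so the model side runs on cuda as well. gpu_layers=50 is a placeholder;
# pick a value that fits your VRAM.
model_gpu = AutoModelForCausalLM.from_pretrained(
    model_name,
    model_file="tinyllama-1.1b-chat-v1.0.Q5_K_M.gguf",
    model_type="llama",
    gpu_layers=50,
    hf=True,
)
```

Option 1 is slower but avoids the mixed-device path entirely; option 2 may or may not clear the error depending on how the hf=True wrapper handles partially offloaded models, so treat both as experiments rather than a confirmed fix.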
env:
transformers==4.35.2
ctransformers==0.2.27+cu121

which gives the output shown above.
Seems to be a similar issue to #154.
Any suggestions on how to resolve this would be appreciated.