OpenLMLab / MOSS-RLHF


Inference with SFT and Policy EN models #36

Open · henrypapadatos opened this issue 7 months ago

henrypapadatos commented 7 months ago

Hello, I am trying to do some basic inference with your SFT and policy models. However, when I instantiate the models directly with LlamaForCausalLM, generation works well for the base pretrained LLaMA, but the SFT model outputs nothing and the policy model outputs random tokens.

Could you help me with that? :) Thanks in advance!

from transformers import AutoTokenizer, LlamaForCausalLM

# Base pretrained LLaMA-7B, plus the tokenizer shipped with the MOSS-RLHF policy model
model_name_or_path1 = 'baffo32/decapoda-research-llama-7B-hf'
tokenizer_name_or_path = '/nas/ucb/henrypapadatos/MOSS-RLHF/models/moss-rlhf-policy-model-7B-en'
model1 = LlamaForCausalLM.from_pretrained(model_name_or_path1, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(tokenizer_name_or_path, padding_side='left')

prompt = "Hey, are you conscious? Can you talk to me?"
inputs = tokenizer(prompt, return_tensors="pt").to(model1.device)  # move input tensors to the model's device

generate_ids = model1.generate(inputs.input_ids, max_length=50)
output = tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
print(output)

Output: Hey, are you conscious? Can you talk to me? I'm not sure if you're conscious, but I'm going to assume you are. I'm not sure if you're conscious, but I'

# SFT model
model_name_or_path2 = '/nas/ucb/henrypapadatos/MOSS-RLHF/models/moss-rlhf-sft-model-7B-en/recover'
model2 = LlamaForCausalLM.from_pretrained(model_name_or_path2, device_map="auto")

generate_ids = model2.generate(inputs.input_ids, max_length=50)
output = tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
print(output)

Output: Hey, are you conscious? Can you talk to me?

# Policy model
model_name_or_path3 = '/nas/ucb/henrypapadatos/MOSS-RLHF/models/moss-rlhf-policy-model-7B-en/recover'
model3 = LlamaForCausalLM.from_pretrained(model_name_or_path3, device_map="auto")

generate_ids = model3.generate(inputs.input_ids, max_length=50)
output = tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
print(output)

Output: Hey, are you conscious? Can you talk to me?lapsedmodниципамина� deploymentclassesандфикаouses compat thereforezzachn乡 Hope WilliamHER forms problemunicí filmewissenschaft scopeASHERTстыunderline instrumentsполиAnalItalie essentialRegisterкраї traverse автор

Ablustrund commented 6 months ago

Hi! Thanks for your attention! Try adding the human and assistant turn markers to the prompt, i.e.: Human: Hey, are you conscious? Can you talk to me? Assistant:
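
A minimal sketch of applying that suggestion to the snippet above, assuming HH-style "Human:"/"Assistant:" turn markers as described in the reply (the exact marker strings and any trailing whitespace are an assumption; check the repo's training scripts for the canonical format):

# Sketch only: wrap the query in the turn markers the reply suggests.
# Assumption: plain "Human:"/"Assistant:" prefixes; verify against the
# MOSS-RLHF training scripts before relying on this format.
prompt = "Human: Hey, are you conscious? Can you talk to me? Assistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model3.device)

generate_ids = model3.generate(inputs.input_ids, max_length=50)
output = tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
print(output)

If the markers match what the SFT and policy models saw during training, the completion after "Assistant:" should be coherent rather than empty or random tokens.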