OpenMOSS / MOSS

An open-source tool-augmented conversational language model from Fudan University
https://txsun1997.github.io/blogs/moss.html
Apache License 2.0

How to run fnlp/moss-base-7b on multiple GPUs #369

Open FakerYFX opened 10 months ago

FakerYFX commented 10 months ago

How can I run fnlp/moss-base-7b on multiple GPUs? I tried the following:

    import os

    from transformers import AutoTokenizer, AutoModelForCausalLM

    # Expose GPUs 0 and 1 to the process.
    os.environ['CUDA_VISIBLE_DEVICES'] = "0,1"

    tokenizer = AutoTokenizer.from_pretrained("fnlp/moss-base-7b", trust_remote_code=True)
    # .cuda() places the whole model on the default CUDA device.
    model = AutoModelForCausalLM.from_pretrained("fnlp/moss-base-7b", trust_remote_code=True).cuda()
    model = model.eval()

    # Prompt: "The director of The Wandering Earth is"
    inputs = tokenizer(["流浪地球的导演是"], return_tensors="pt")
    for k, v in inputs.items():
        inputs[k] = v.cuda()

    outputs = model.generate(**inputs, do_sample=True, temperature=0.8, top_p=0.8, repetition_penalty=1.1, max_new_tokens=256)
    # Decode only the newly generated tokens, skipping the prompt.
    response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
    print(response)

However, this does not use both GPUs. Could you please give me some advice?
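
One approach that is often used for this kind of setup, sketched here under assumptions rather than as a confirmed fix for this model, is to let `transformers` shard the weights across the visible GPUs with `device_map="auto"` instead of calling `.cuda()`. This requires the `accelerate` package to be installed. Whether the custom model code behind `fnlp/moss-base-7b` supports automatic sharding is an assumption that has not been verified here. The helper names `build_load_kwargs` and `load_and_generate`, and the `20GiB` per-GPU limit, are hypothetical and chosen only for illustration.

```python
def build_load_kwargs(num_gpus: int, per_gpu_memory: str = "20GiB") -> dict:
    """Build from_pretrained keyword arguments that spread a model over GPUs.

    Hypothetical helper: the per-GPU memory limit is an illustrative value,
    not a measured requirement of fnlp/moss-base-7b.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return {
        "trust_remote_code": True,
        # Let accelerate decide which layers go on which device.
        "device_map": "auto",
        # Cap the memory used on each GPU index.
        "max_memory": {i: per_gpu_memory for i in range(num_gpus)},
    }


def load_and_generate(prompt: str, num_gpus: int = 2) -> str:
    """Load the model sharded over num_gpus GPUs and generate a completion.

    Assumes CUDA_VISIBLE_DEVICES was set in the shell before starting Python,
    and that the model's remote code supports device_map="auto" (unverified).
    """
    # Imported lazily so the helper above can be used without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "fnlp/moss-base-7b"
    tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        name, torch_dtype=torch.float16, **build_load_kwargs(num_gpus)
    )
    model.eval()

    # Inputs go to the device that holds the first layers of the model.
    inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            do_sample=True,
            temperature=0.8,
            top_p=0.8,
            repetition_penalty=1.1,
            max_new_tokens=256,
        )
    # Decode only the newly generated tokens, skipping the prompt.
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

With this sketch, something like `print(load_and_generate("流浪地球的导演是"))` would replace the original script body. Note that `.cuda()` must not be called on a model loaded this way, since it would try to move the sharded model back onto a single device.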