baichuan-inc / Baichuan2

A series of large language models developed by Baichuan Intelligent Technology
https://huggingface.co/baichuan-inc
Apache License 2.0
4.08k stars, 293 forks

Help: inference on an A10 is about twice as slow as on a 3090 #339

Open aiaiyueq11 opened 9 months ago

aiaiyueq11 commented 9 months ago

After fine-tuning baichuan2-7B, inference on an A10 takes roughly twice as long as on a 3090. Has anyone run into something similar?

Model loading:

    model = AutoModelForCausalLM.from_pretrained(
        self.config['model_name_or_path'],
        device_map="auto",
        trust_remote_code=True,
        torch_dtype=torch.bfloat16,
    )

Inference parameters:

    model.generate(
        **inputs,
        do_sample=False,
        num_beams=2,
        max_new_tokens=64,
        repetition_penalty=1.1,
    )

Generating an output of about 50 tokens takes 1.8 s on the 3090 and 4.8 s on the A10.
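To make the comparison concrete, the timings above can be turned into throughput figures. Taken at face value, they work out to roughly 28 tokens/s on the 3090 against roughly 10 tokens/s on the A10, a gap of about 2.7x rather than 2x. The sketch below does that arithmetic and includes a small timing helper for repeating the measurement on both cards. The names `tokens_per_second` and `benchmark` are hypothetical helpers introduced for illustration and are not part of Baichuan2 or transformers.

```python
import time
from statistics import median


def tokens_per_second(num_tokens, seconds):
    """Throughput for one generation call."""
    return num_tokens / seconds


def benchmark(generate_fn, num_new_tokens, warmup=1, runs=3, sync_fn=None):
    """Time generate_fn and return (median seconds, tokens per second).

    generate_fn:    zero-argument callable that performs one generation.
    num_new_tokens: number of tokens the call is assumed to produce.
    sync_fn:        optional callable run before each clock read, so that
                    queued asynchronous GPU work is included in the timing.
    """
    for _ in range(warmup):
        generate_fn()  # warm-up runs are excluded from the measurement
    timings = []
    for _ in range(runs):
        if sync_fn is not None:
            sync_fn()
        start = time.perf_counter()
        generate_fn()
        if sync_fn is not None:
            sync_fn()
        timings.append(time.perf_counter() - start)
    seconds = median(timings)
    return seconds, tokens_per_second(num_new_tokens, seconds)


# Figures reported in this issue: about 50 tokens per call.
rtx3090_tps = tokens_per_second(50, 1.8)
a10_tps = tokens_per_second(50, 4.8)
slowdown = 4.8 / 1.8

print(f"3090: {rtx3090_tps:.2f} tokens/s")
print(f"A10:  {a10_tps:.2f} tokens/s")
print(f"A10 takes {slowdown:.2f}x as long as the 3090")
```

With the model from this post, one way to use the helper would be to pass `lambda: model.generate(**inputs, do_sample=False, num_beams=2, max_new_tokens=64, repetition_penalty=1.1)` as `generate_fn` and `torch.cuda.synchronize` as `sync_fn`. Since generation can stop before `max_new_tokens`, the actual number of generated tokens should be counted from the output rather than assumed.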