LC1332 / Chat-Haruhi-Suzumiya

Chat凉宫春日, an open-sourced role-playing chatbot by Cheng Li, Ziang Leng, and others.
Apache License 2.0

Using vLLM data parallelism together with ChatHaruhi raises "RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method" #83

Open 545771889a opened 3 weeks ago

545771889a commented 3 weeks ago

My code (the error "Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method" appears as soon as ChatHaruhi is imported here):

import torch
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams
from chatharuhi import ChatHaruhi

def loadmodel(model_name, peft_model, quantization=None, use_fast_kernels=True, seed=42, **kwargs):
    # Load the model, tokenizer, and RAG chatbot.
    llm = LLM(model=model_name, max_model_len=40452, tensor_parallel_size=2)  # this only works if tensor_parallel_size is set to 1
    torch.cuda.manual_seed(seed)
    torch.manual_seed(seed)

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    tokenizer.pad_token = tokenizer.eos_token

    # RAG
    chatbot = ChatHaruhi(role_name='Sheldon', max_len_story=1000)
    return llm, tokenizer, chatbot
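
The error message itself suggests one workaround: make sure the 'spawn' start method is in effect before anything creates a CUDA context in the parent process. Below is a minimal sketch (not from this repo) assuming the CUDA context is created by the chatharuhi import; the VLLM_WORKER_MULTIPROC_METHOD environment variable may only be honored by recent vLLM releases.

import os
import multiprocessing

# Ask vLLM to spawn rather than fork its tensor-parallel workers
# (assumption: the installed vLLM version reads this environment variable).
os.environ.setdefault("VLLM_WORKER_MULTIPROC_METHOD", "spawn")

if __name__ == "__main__":
    # Switch the global start method before vLLM or ChatHaruhi touch CUDA.
    multiprocessing.set_start_method("spawn", force=True)

    from vllm import LLM
    from chatharuhi import ChatHaruhi

    # "your-model-path" is a placeholder for the model path used above.
    llm = LLM(model="your-model-path", max_model_len=40452, tensor_parallel_size=2)
    chatbot = ChatHaruhi(role_name="Sheldon", max_len_story=1000)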
LC1332 commented 3 weeks ago

The conflict probably happens internally when the RAG vector (embedding) model is started up -o-
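
If that is the cause, a quick way to test it (just a sketch, not an official fix) is to defer the chatharuhi import until after the vLLM engine has forked its workers, so the parent process holds no CUDA context at fork time:

def loadmodel(model_name, seed=42, **kwargs):
    # Build the vLLM engine first: its tensor-parallel workers are forked here,
    # while the parent process has not initialized CUDA yet.
    from vllm import LLM
    llm = LLM(model=model_name, max_model_len=40452, tensor_parallel_size=2)

    # Import ChatHaruhi only afterwards; per the hypothesis above, this is the
    # point where the RAG embedding model creates a CUDA context.
    from chatharuhi import ChatHaruhi
    chatbot = ChatHaruhi(role_name='Sheldon', max_len_story=1000)
    return llm, chatbot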