Question about specifying the device

feifeibear / LLMSpeculativeSampling

Fast inference from large language models via speculative decoding

Apache License 2.0

530 stars 51 forks

Question about specifying the device #21

Closed pendulum445 closed 1 year ago

pendulum445 commented 1 year ago

https://github.com/feifeibear/LLMSpeculativeSampling/blob/1da363e9d2201663577aa2d90074853e5fda7812/main.py#L82

Should .to(torch_device) be added where the model is loaded here? Without it, I get this error: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:3 and cuda:1! (when checking argument for argument index in method wrapper_CUDA__index_select)

taoxunqiang commented 10 months ago

Has this issue been resolved?

pendulum445 commented 10 months ago

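Quoting taoxunqiang: "Has this issue been resolved?"

Deleting device_map="auto", and then adding .to(device) should fix it. I haven't looked at this in a long time, so I don't remember the details clearly.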
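
Editor's sketch of the change described above, not the repository's actual code: the helper names `placement_plan` and `load_causal_lm` are hypothetical, and the only library calls assumed are `AutoModelForCausalLM.from_pretrained` (with its `device_map` argument, which relies on the accelerate package) and the standard `.to(device)` method. Passing a device drops `device_map="auto"` and moves the whole model to that one device; passing `None` keeps the automatic placement.

```python
from typing import Any, Dict, Optional, Tuple


def placement_plan(device: Optional[str]) -> Tuple[Dict[str, Any], Optional[str]]:
    """Return (kwargs for from_pretrained, device to pass to .to() or None)."""
    if device is None:
        # No explicit device: keep automatic placement, which may spread
        # the weights over several GPUs.
        return {"device_map": "auto"}, None
    # Explicit device: load without device_map, then move everything there.
    return {}, device


def load_causal_lm(model_name: str, device: Optional[str] = None):
    """Load a causal LM either on one explicit device or with automatic placement."""
    # Imported lazily so placement_plan can be used without transformers installed.
    from transformers import AutoModelForCausalLM

    kwargs, target = placement_plan(device)
    model = AutoModelForCausalLM.from_pretrained(model_name, **kwargs)
    if target is not None:
        model = model.to(target)  # all weights now live on a single device
    return model
```

For example, `load_causal_lm(name, "cuda:0")` would correspond to the fix suggested in this comment, provided the model fits in that GPU's memory.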

taoxunqiang commented 10 months ago

Yes, it runs with that change, but it seems multi-GPU inference can no longer be used.
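
Editor's sketch of one possible way to keep multi-GPU placement, untested against this repository: leave `device_map="auto"` in place and move each input tensor to the device the model reports through its `device` attribute before calling it. In Hugging Face Transformers that attribute reflects the device of the model's first parameter, which is typically where the embedding lookup runs. The helper name `to_model_device` is hypothetical, and whether this alone is enough for the sampling loop here, which passes tensors between two models, is an assumption.

```python
def to_model_device(tensor, model):
    """Move `tensor` to the device that `model` reports via its `device` attribute.

    Assumes `tensor` has a `.to(device)` method (for example a torch.Tensor)
    and `model` exposes `.device` (for example a transformers PreTrainedModel).
    """
    return tensor.to(model.device)


# Hypothetical usage with two sharded models (names are illustrative only):
#   draft_ids = to_model_device(input_ids, draft_model)
#   target_ids = to_model_device(draft_ids, target_model)
```

Because the two models may place their first layers on different GPUs, the move would need to be repeated each time a tensor crosses from one model to the other.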