r2d4 / rellm

Exact structure out of any language model completion.
MIT License

GPU inference not working #1

Closed · averkij closed this issue 1 year ago

averkij commented 1 year ago

Hi, thanks for your work.

When I try to run generation on the GPU, I get the following error from logits_processor.py in transformers (I've also tried placing the tensors on CUDA inside compile_re):

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
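For context, PyTorch raises this error whenever a CUDA tensor is converted directly to a NumPy array, since NumPy only works with host memory. A minimal reproduction (requires a CUDA device; the vocab size is just a GPT-2-shaped placeholder):

```python
import torch

# Logits living on the GPU, shaped like a GPT-2 vocab.
scores = torch.randn(1, 50257, device="cuda")

# scores.numpy()  # raises the TypeError above
scores_np = scores.cpu().numpy()  # copy to host memory first, then convert
```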

How can I use rellm on the GPU?

r2d4 commented 1 year ago

ReLLM is just a thin wrapper around a Hugging Face transformers PreTrainedModel: it does some work to filter logits and then calls the model.generate method.
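For illustration, here is a minimal sketch of that pattern, not ReLLM's actual code: a custom LogitsProcessor that masks all but an allowed set of tokens and is handed to model.generate. The model name and the allowed-token choice are placeholders; note that the mask is built on the same device as the scores, which is exactly the kind of detail that breaks when logits get converted to NumPy on a CUDA device.

```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    LogitsProcessor,
    LogitsProcessorList,
)

class AllowedTokensProcessor(LogitsProcessor):
    """Masks every token except an allowed set (a stand-in for regex filtering)."""

    def __init__(self, allowed_token_ids):
        self.allowed_token_ids = allowed_token_ids

    def __call__(self, input_ids, scores):
        # Build the mask on the same device as the scores so GPU runs work.
        mask = torch.full_like(scores, float("-inf"))
        mask[:, self.allowed_token_ids] = 0.0
        return scores + mask

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").to(device)

inputs = tokenizer("ReLLM is", return_tensors="pt").to(device)
processors = LogitsProcessorList(
    [AllowedTokensProcessor([tokenizer.eos_token_id])]
)
output_ids = model.generate(**inputs, max_new_tokens=5, logits_processor=processors)
print(tokenizer.decode(output_ids[0]))
```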

Does your model.generate work outside the complete_re call?
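A quick way to check that (a hypothetical sanity-check snippet; any small causal model works):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Confirm plain GPU generation works on its own, with no rellm involved.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").to("cuda")

inputs = tokenizer("Hello", return_tensors="pt").to("cuda")
output_ids = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(output_ids[0]))
```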