Closed: averkij closed this issue 1 year ago
ReLLM is only a thin wrapper around a Hugging Face transformers PreTrainedModel. All it does is filter logits and call the model.generate method.
Does your model.generate call work outside the complete_re call?
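To make "filtering logits" concrete, here is a minimal pure-Python sketch of the idea. It is not ReLLM's actual code: `filter_logits` and `greedy_pick` are hypothetical names used only for illustration, and the real version works on tensors rather than lists.

```python
import math


def filter_logits(logits, allowed_token_ids):
    """Keep the logits of allowed token ids; set every other entry to -inf.

    Hypothetical helper: illustrates the masking idea only.
    """
    allowed = set(allowed_token_ids)
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]


def greedy_pick(logits):
    """Return the index of the largest logit (greedy decoding step)."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Toy vocabulary of four tokens; assume only tokens 0 and 2 match the pattern.
logits = [2.5, 0.1, 1.7, 3.9]
masked = filter_logits(logits, allowed_token_ids=[0, 2])
next_token = greedy_pick(masked)
```

In a tensor-based version, the mask and the scores coming from the model generally have to be on the same device, so a CPU/GPU mismatch at this step is one plausible thing to check.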
Hi, thanks for your work.
When I try to run generation on a GPU, I get the following error inside logits_processor.py in transformers (I've tried placing the tensors on CUDA inside compile_re).
How can I use ReLLM on a GPU?