hyperonym / basaran

Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.
MIT License

Loading Basaran on multiple GPUs leads to error #280

Open tanmaylaud opened 7 months ago

tanmaylaud commented 7 months ago

I am getting this error: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!

I am running Basaran with the default parameters and a Llama 2 model.
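
When debugging an error like this, it can help to see how the model's modules were spread across devices. The following is only a minimal sketch: it assumes the model was loaded through Transformers with a `device_map`, in which case the model object exposes an `hf_device_map` dictionary. The helper name `group_modules_by_device` and the sample map are hypothetical and used purely for illustration.

```python
from collections import defaultdict


def group_modules_by_device(device_map):
    """Group module names by the device they were placed on.

    Hypothetical helper: `device_map` is a dict of module name -> device,
    shaped like the `hf_device_map` of a model loaded with a device map.
    """
    groups = defaultdict(list)
    for module_name, device in device_map.items():
        # Device maps may use ints (GPU index) or strings such as "cpu".
        key = f"cuda:{device}" if isinstance(device, int) else str(device)
        groups[key].append(module_name)
    return dict(groups)


# Illustrative map only (an assumption, not taken from the report).
# With a real model one might pass model.hf_device_map instead.
sample_map = {
    "model.embed_tokens": 0,
    "model.layers.0": 0,
    "model.layers.1": 1,
    "lm_head": 1,
}

placement = group_modules_by_device(sample_map)
for device, modules in placement.items():
    print(device, modules)
```

If the output shows modules on both cuda:0 and cuda:1, the model has been split across GPUs, and any tensor that is not moved to the device of the module consuming it could trigger the error above.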