hyperonym / basaran

Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.
MIT License
1.29k stars, 81 forks

Are concurrent requests supported? #205

Closed. hudengjunai closed this issue 1 year ago

hudengjunai commented 1 year ago

Does this server support concurrent requests? Is it thread-safe for concurrent requests?

peakji commented 1 year ago

Does this server support concurrent requests?

Yes, the server supports up to SERVER_THREADS (default = 32) concurrent requests. It is recommended to set this environment variable to a number that's larger than your number of CPU threads, especially when you're using GPUs.
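As a rough illustration of the sizing advice above, here is a minimal Python sketch that picks a `SERVER_THREADS` value larger than the number of CPU threads and puts it into an environment mapping. The helper names `suggested_server_threads` and `server_env`, the `factor` of 2, and the floor of 32 are assumptions made for this example and are not part of Basaran.

```python
import os


def suggested_server_threads(cpu_threads=None, factor=2, minimum=32):
    """Return a thread count strictly larger than the number of CPU threads.

    `factor` and `minimum` are illustrative choices for this sketch;
    32 simply mirrors the default quoted in the answer above.
    """
    if cpu_threads is None:
        cpu_threads = os.cpu_count() or 1
    return max(minimum, cpu_threads * factor, cpu_threads + 1)


def server_env(base_env=None, cpu_threads=None):
    """Copy an environment mapping and add SERVER_THREADS as a string."""
    env = dict(os.environ if base_env is None else base_env)
    env["SERVER_THREADS"] = str(suggested_server_threads(cpu_threads))
    return env


# Example: a machine with 48 CPU threads.
env = server_env(base_env={}, cpu_threads=48)
print(env["SERVER_THREADS"])  # 96
```

The resulting mapping could then be passed as `env=` to `subprocess.Popen` together with whatever command you normally use to start the server.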

Is it thread-safe for concurrent requests?

In general, all the objects in Basaran and PyTorch are thread-safe for reads. But we cannot guarantee thread safety for other native components in the dependencies.
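To make the "safe to read" distinction concrete, here is a generic Python sketch, not taken from Basaran's code: worker threads read a shared object without locking, while the one piece of shared mutable state is guarded by a lock. `VOCAB`, `handle`, and `request_count` are hypothetical names for this example, and the dictionary is only a stand-in for a loaded model.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a loaded model: built once, then only read by workers.
VOCAB = {"hello": 0, "world": 1}

# Shared mutable state: must be protected when updated from many threads.
request_count = 0
count_lock = threading.Lock()


def handle(prompt):
    """Read shared state freely; take a lock before mutating anything shared."""
    global request_count
    token_ids = [VOCAB.get(word, -1) for word in prompt.split()]
    with count_lock:
        request_count += 1
    return token_ids


with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(handle, ["hello world"] * 100))

print(request_count)  # 100
```

If you add your own per-request bookkeeping around the server, the same rule applies: concurrent reads are fine, but anything written from several threads needs synchronization.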