ray-project / ray-llm

RayLLM - LLMs on Ray
https://aviary.anyscale.com
Apache License 2.0

Queue-Worker System #123

Open · AIApprentice101 opened this issue 5 months ago

AIApprentice101 commented 5 months ago

Thank you for the great package. I'm interested in hosting an LLM on GKE.

For our existing ML applications, we usually implement a queue-worker system (e.g., RQ or Celery with a Redis broker) to handle long-running background tasks. Does ray-llm have a similar feature implemented under the hood? Or do I need to set it up myself?
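For context, here is a minimal sketch of the kind of setup we use today, with RQ as the queue. The inference endpoint URL, payload shape, and function names are placeholders, not anything ray-llm actually exposes:

```python
# tasks.py -- the long-running background task that workers execute.
import requests

def run_inference(prompt: str) -> str:
    # Placeholder endpoint; in practice this would point at the model server.
    resp = requests.post(
        "http://localhost:8000/generate",
        json={"prompt": prompt},
        timeout=600,
    )
    resp.raise_for_status()
    return resp.json()["text"]
```

```python
# enqueue.py -- the API layer drops jobs on the queue and returns immediately.
from redis import Redis
from rq import Queue

from tasks import run_inference

q = Queue(connection=Redis())
job = q.enqueue(run_inference, "Summarize this document...", job_timeout=600)
print(job.id)  # the caller polls the job's status/result by id
```

A worker process started with `rq worker` picks jobs off the queue, so slow generations never tie up the request-handling path.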

sihanwang41 commented 5 months ago

Hi @AIApprentice101, we don't have that functionality in ray-llm; you'll have to set it up yourself.

For the Redis solution, do you see any issues or pain points? Or is it more about the integration effort?

AIApprentice101 commented 5 months ago

@sihanwang41 Thank you for your reply. I saw there's an RFC related to integrating a queuing system into Ray Serve: https://github.com/ray-project/ray/issues/32292. So I was wondering whether that's something RayLLM would consider, especially since LLM inference usually takes a long time to run.

In the meantime, we can set up the queuing system ourselves.
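Something like this bare-bones worker loop is what we have in mind. The service URL, model name, and Redis key names are placeholders, and we're assuming the deployment exposes an OpenAI-compatible HTTP endpoint:

```python
# worker.py -- a minimal queue worker that forwards jobs to the model server.
import json

import redis
import requests

r = redis.Redis()

while True:
    # Block until a job arrives on the "llm-jobs" list.
    _, raw = r.blpop("llm-jobs")
    job = json.loads(raw)
    resp = requests.post(
        "http://rayllm-service:8000/v1/chat/completions",  # placeholder URL
        json={
            "model": "meta-llama/Llama-2-7b-chat-hf",  # placeholder model id
            "messages": [{"role": "user", "content": job["prompt"]}],
        },
        timeout=600,
    )
    resp.raise_for_status()
    # Store the result keyed by job id so the caller can poll for it.
    r.set(f"llm-result:{job['id']}", resp.text)
```

The API layer would push `{"id": ..., "prompt": ...}` jobs onto `llm-jobs` and poll `llm-result:<id>`, so long-running generations never block the request path.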