ray-project / llm-applications

A comprehensive guide to building RAG-based LLM applications for production.

How to start the ray server with serve.py script in CLI? #81

Closed: spate141 closed this issue 11 months ago

spate141 commented 11 months ago

I got it working.

Make sure you have this at the very end of your serve.py:

from ray import serve
# Bind the deployment with its constructor arguments and start serving it
deployment = RayAssistantDeployment.bind(parameter1=value1, parameter2=value2, ...)
serve.run(deployment, route_prefix="/")
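
For context, here is a minimal sketch of what such a serve.py could look like end to end. The class body, the parameter names, and the /query route are placeholders I am assuming for illustration; the actual RayAssistantDeployment in this repo takes different constructor arguments and runs the full RAG pipeline.

from fastapi import FastAPI
from ray import serve

app = FastAPI()

@serve.deployment
@serve.ingress(app)
class RayAssistantDeployment:
    def __init__(self, parameter1, parameter2):
        # Placeholder state; the real deployment wires up the index, LLM, etc.
        self.parameter1 = parameter1
        self.parameter2 = parameter2

    @app.post("/query")
    def query(self, request: dict) -> dict:
        # Placeholder logic; the repo's deployment runs retrieval + generation here.
        return {"answer": f"You asked: {request.get('query', '')}"}

deployment = RayAssistantDeployment.bind(parameter1="value1", parameter2="value2")
serve.run(deployment, route_prefix="/")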

Open a terminal in the directory containing serve.py:

# Start the Ray cluster and deploy the application
$ ray start --head
$ python serve.py

# Stop the Ray cluster when you are done
$ ray stop
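
Optionally, before sending requests you can confirm the application is up with the Serve CLI. A quick sketch (exact output and flags can vary slightly across Ray versions; serve shutdown tears down only the Serve application, not the whole cluster):

# Confirm the application and its deployments report RUNNING
$ serve status

# Tear down just the Serve application (the cluster keeps running)
$ serve shutdown -y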

Head over to http://127.0.0.1:8265/ to see the Ray dashboard.

Here is an example API call to the /query endpoint:

curl --location 'http://127.0.0.1:8000/query' \
--header 'Content-Type: application/json' \
--data '{
    "query": "What is going on?"
}'
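
The same call from Python, as a small sketch using the requests library (the exact response fields depend on how the deployment formats its output):

import requests

# Query the running Serve application; assumes the default HTTP address
# (127.0.0.1:8000) and the /query route used above
response = requests.post(
    "http://127.0.0.1:8000/query",
    json={"query": "What is going on?"},
)
print(response.status_code)
print(response.json())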