ray-project / llm-applications

A comprehensive guide to building RAG-based LLM applications for production.

How to start the ray server with serve.py script in CLI? #81

Closed: spate141 closed this issue 11 months ago

spate141 commented 11 months ago

I got it working.

Make sure you have this at the very end of your serve.py:

from ray import serve
# Bind the deployment with its constructor arguments and start serving it
deployment = RayAssistantDeployment.bind(parameter1=value1, parameter2=value2, ...)
serve.run(deployment, route_prefix="/")
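
For context, here is a minimal sketch of what such a serve.py could look like end to end. The class body, the parameter names, and the /query route are placeholders I am assuming for illustration; the actual RayAssistantDeployment in this repo takes different constructor arguments and runs the full RAG pipeline.

from fastapi import FastAPI
from ray import serve

app = FastAPI()

@serve.deployment
@serve.ingress(app)
class RayAssistantDeployment:
    def __init__(self, parameter1, parameter2):
        # Placeholder state; the real deployment wires up the index, LLM, etc.
        self.parameter1 = parameter1
        self.parameter2 = parameter2

    @app.post("/query")
    def query(self, request: dict) -> dict:
        # Placeholder logic; the repo's deployment runs retrieval + generation here.
        return {"answer": f"You asked: {request.get('query', '')}"}

deployment = RayAssistantDeployment.bind(parameter1="value1", parameter2="value2")
serve.run(deployment, route_prefix="/")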

Open a terminal in the directory containing serve.py:

# Start the Ray cluster and deploy the application
$ ray start --head
$ python serve.py

# Stop the Ray cluster when you are done
$ ray stop
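
Optionally, before sending requests you can confirm the application is up with the Serve CLI. A quick sketch (exact output and flags can vary slightly across Ray versions; serve shutdown tears down only the Serve application, not the whole cluster):

# Confirm the application and its deployments report RUNNING
$ serve status

# Tear down just the Serve application (the cluster keeps running)
$ serve shutdown -y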

Head over to http://127.0.0.1:8265/ to see the Ray dashboard.

Here is an example API call to the /query endpoint:

curl --location 'http://127.0.0.1:8000/query' \
--header 'Content-Type: application/json' \
--data '{
    "query": "What is going on?"
}'
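
The same call from Python, as a small sketch using the requests library (the exact response fields depend on how the deployment formats its output):

import requests

# Query the running Serve application; assumes the default HTTP address
# (127.0.0.1:8000) and the /query route used above
response = requests.post(
    "http://127.0.0.1:8000/query",
    json={"query": "What is going on?"},
)
print(response.status_code)
print(response.json())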