ray-project / ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

Ray Serve with FastAPI is a lot (10x) slower than plain FastAPI #46693

Closed githuberj closed 1 month ago

githuberj commented 1 month ago

What happened + What you expected to happen

I have an issue, or maybe just a wrong expectation, while working with Ray Serve. I thought it might be able to run our whole FastAPI application, but it looks like it might be 10x slower than a plain FastAPI application. Therefore I did this small benchmark. Maybe the benchmark is too simplistic; could you provide a more comprehensive example then? I would just like to have our expectations set: is it a bad idea to put general application logic into Ray?

Versions / Dependencies

ray[serve] == 2.32.0
fastapi == 0.111.1

Reproduction script

I used docker to test it and provided a zip file. benchmark.zip

Issue Severity

Low: It annoys or frustrates me.

Superskyyy commented 1 month ago

Hello, see this related issue. In general, Ray Serve is not expected to deliver QPS as high as plain FastAPI, because every request goes through a single point of ingress (the proxy actor), which adds per-request overhead. I expect this overhead can be lowered in the future.
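The effect of that extra ingress hop can be illustrated without Ray at all. The stdlib-only sketch below (all server and handler names are made up for illustration) stands up a backend HTTP server and a trivial forwarding proxy in front of it, then compares per-request latency for the direct path versus the proxied path:

```python
# Sketch: measuring the latency cost of an extra ingress hop,
# analogous in spirit to requests passing through Serve's proxy actor.
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class Backend(BaseHTTPRequestHandler):
    """Answers every GET with a fixed body."""
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"pong")
    def log_message(self, *args):  # silence per-request logging
        pass

backend = HTTPServer(("127.0.0.1", 0), Backend)  # port 0: pick a free port
BACKEND_URL = f"http://127.0.0.1:{backend.server_address[1]}/"

class Proxy(BaseHTTPRequestHandler):
    """Forwards each GET to the backend and relays the response."""
    def do_GET(self):
        body = urllib.request.urlopen(BACKEND_URL).read()
        self.send_response(200)
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):
        pass

proxy = HTTPServer(("127.0.0.1", 0), Proxy)
PROXY_URL = f"http://127.0.0.1:{proxy.server_address[1]}/"

for srv in (backend, proxy):
    threading.Thread(target=srv.serve_forever, daemon=True).start()

def mean_latency(url, n=200):
    """Average seconds per request over n sequential GETs."""
    t0 = time.perf_counter()
    for _ in range(n):
        urllib.request.urlopen(url).read()
    return (time.perf_counter() - t0) / n

direct = mean_latency(BACKEND_URL)
proxied = mean_latency(PROXY_URL)
print(f"direct: {direct * 1e6:.0f} us/req, proxied: {proxied * 1e6:.0f} us/req")
```

The proxied path pays for a second connection and response copy on every request, which is the same shape of overhead the proxy actor adds, independent of how fast the handler itself is.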

githuberj commented 1 month ago

Thank you for the response. I will close the issue.