litestar-org / api-performance-tests

Benchmarking Litestar vs other ASGI API frameworks
https://litestar.dev/
39 stars 6 forks

Enhancement: container CPU limit #26

Closed raphaelauv closed 1 year ago

raphaelauv commented 1 year ago

Summary

It would be fairer to set the same CPU limit for every framework (like we do when we deploy in K8s),

by adding `nano_cpus` to the Docker run call.

I changed this line

return self.docker_client.containers.run(image=image, ports={SERVER_PORT: SERVER_PORT}, detach=True, nano_cpus=1000000000)
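For context, `nano_cpus=1000000000` is one CPU's worth of quota; the equivalent with the plain Docker CLI (a sketch, with a placeholder image name) would be:

```shell
# --cpus=1 is the CLI counterpart of docker-py's nano_cpus=1_000_000_000:
# the container gets the equivalent of one CPU core's worth of time
docker run --cpus=1 -p 8000:8000 -d <image>
```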

and running the test: `bench run --rps fastapi litestar --endpoint-mode sync --endpoint-category plaintext`

gave me

[Screenshot from 2023-08-14 02-51-18]

provinzkraut commented 1 year ago

It would be fairer to set the same CPU limit for every framework

We already do that. Everything is run with uvicorn, using 1 CPU core. I don't want to get into multi-worker scenarios: they make things much more complex and hard to get right. This way we keep a level playing field for every framework.

You can read more about this here, but in short: when the benchmarks are run, they are pinned to a single, shielded CPU core, so they won't be affected by anything else.

raphaelauv commented 1 year ago

Running 1 uvicorn worker does not limit the process to using only 1 CPU core (you can start tasks in another process from the first process).

I do not see where they are pinned to a single, shielded CPU core in the code. Could you show me where it is? Thanks

Goldziher commented 1 year ago

you can start tasks in another process from the first process

Well, you could import multiprocessing and do this, but we obviously do not do this 😉
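For illustration only (none of the benchmarked apps do this), offloading CPU-bound work to a second process, which the OS is then free to schedule on another core, would look roughly like this; `heavy_task` is a hypothetical function:

```python
import multiprocessing


def heavy_task(n: int) -> int:
    # hypothetical CPU-bound work that could run on a different core
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    # spawn a separate worker process; a single uvicorn worker that did this
    # would no longer be confined to one CPU core
    with multiprocessing.Pool(processes=1) as pool:
        result = pool.apply(heavy_task, (1000,))
    print(result)
```

This is exactly the kind of code the benchmark apps avoid, which is why one uvicorn worker per framework in practice means one core per framework.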

I do not see where they are pinned to a single, shielded CPU core in the code. Could you show me where it is? Thanks

That is exactly what happens when you run a single uvicorn process in the Docker container. The tests are executed sequentially for each framework, each running on a single CPU with one uvicorn worker, and we do not run any multiprocessing code anywhere else.

provinzkraut commented 1 year ago

Running 1 uvicorn worker does not limit the process to using only 1 CPU core (you can start tasks in another process from the first process).

That is true, but it doesn't mean the comparison is unfair. If, for example, Sanic found a way to speed up its requests by offloading something to a separate process, that would be totally fair. A level playing field does not mean taking away every advantage a competitor may have; it means making sure everyone runs under the same conditions, which is the case here. All tests are run with the same uvicorn setup, the same number of workers, on the same machine, restricted to a single, shielded CPU core.

I do not see where they are pinned to a single, shielded CPU core in the code. Could you show me where it is? Thanks

Unfortunately it's not in the code, because I couldn't find a good way to make it portable. As described above, it uses the `cset shield` command to achieve this. I'm definitely open to suggestions on how we could apply this by default.
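For reference, the `cset shield` workflow looks roughly like this (a sketch: it needs root and the `cpuset` package installed, and the core number is an example):

```shell
# reserve CPU core 2 for the benchmark and move kernel threads off it
sudo cset shield --cpu=2 --kthread=on

# run the benchmark inside the shielded set
sudo cset shield --exec -- bench run --rps fastapi litestar

# tear the shield down afterwards
sudo cset shield --reset
```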

raphaelauv commented 1 year ago

we apparently disagree on the meaning of "fair"

Then it would be more informative to run the benchmark comparison with several vCPU limits (1, 4, none) so we can see the differences.

provinzkraut commented 1 year ago

Then it would be more informative to run the benchmark comparison with several vCPU limits (1, 4, none) so we can see the differences.

I don't expect much difference, because as far as I'm aware, none of the tested frameworks uses any sort of multiprocessing.