jonashaag / bjoern

A screamingly fast Python 2/3 WSGI server written in C.

[q] A simple quick way to detect bjoern-friendly module #194

Closed ak4nv closed 3 years ago

ak4nv commented 3 years ago

Hi all!

Bjoern is single threaded and evented, so if your Python application yields on a regular basis you may be okay with running only a single process.

from here.

I developed an application that runs bjoern in production. Later I needed to add a message queue, and I tried to figure out which libraries are bjoern-friendly: pika, amqp, kombu... So far I haven't found anything.

Some database drivers are probably unsuitable as well. Does anyone know a way to tell whether a given library is bjoern-friendly or not?

Maybe creating a list of bjoern-friendly libraries is a good idea... :thinking:

antoncom commented 3 years ago

Hello!

I'm also interested in this subject. I used bjoern for one pet project, with Falcon for the REST API and Tokyo Tyrant as the key-value database. The reason for the bjoern/falcon/tyrant stack was to minimize server resources (it runs on a simple VPS now) for a graph database of juridical entities extracted from documents. After the project launched, I ran load tests: the site stayed up at 10k requests per minute.

It would be nice to move forward on using bjoern in high-load projects. If anyone has experience, please share! I followed this guide: https://alexpnt.github.io/2018/01/06/fast-inference-falcon-bjoern/

jonashaag commented 3 years ago

I used this way

Interesting read! I doubt Gunicorn is that slow though... are you sure the benchmarks were comparable?

Maybe creating a list of bjoern-friendly libraries is a good idea...

Depends on what you mean by “friendly”. For a single bjoern instance to be concurrent, the code has to literally Python-`yield` every now and then, or otherwise use the Python sequence protocol so that parts of the response are computed lazily, only when asked for. Otherwise bjoern, being single threaded, waits for the entire response to be computed, and other clients have to wait too.

I guess it’s almost impossible to find any libraries that do this.
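To make concrete what “Python-yield every now and then” looks like, here is a minimal, hypothetical WSGI app written as a generator — the server pulls the response chunk by chunk, so control can return to its event loop between chunks:

```python
def app(environ, start_response):
    """Generator-style WSGI app: the response is produced lazily,
    one chunk at a time, instead of being computed up front."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    for i in range(3):
        # each iteration could do a small piece of I/O or work;
        # after every yield the server can attend to other clients
        yield ("chunk %d\n" % i).encode()
```

A WSGI app that instead builds the entire body before returning it gives a single-threaded server no chance to interleave other clients.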

On the other hand, spawning lots of bjoern instances using receive steering is not going to give you a memory problem, so that is a nice way to get more concurrency if the rest of your system and your I/O can take it. If your application spends most of its time in I/O, and you can perform much more I/O (proportionally) than you can spawn bjoern instances, then bjoern is not the right server and you’ll have to use a properly async server, with all its advantages and disadvantages (spaghetti stacks, lack of back pressure, ...). An example of that case: all your databases live on other machines, your queries and responses are quick to send and receive but the results take a long time to compute, AND the external databases are much more scalable than your web server machine, so you will not overwhelm them with your concurrent requests.
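A minimal sketch of the multi-instance approach, following the fork pattern shown in the bjoern README (the `bjoern.listen`/`bjoern.run` calls are assumed to be available; the function name, worker count, and port here are illustrative):

```python
import os

def serve_forked(app, host="0.0.0.0", port=8000, workers=2):
    """Spawn several bjoern instances that share one listening socket.

    Pattern from the bjoern README: open the socket once in the
    parent, fork, then run one event loop per child process.
    """
    import bjoern  # assumes bjoern is installed; imported lazily here
    bjoern.listen(app, host, port)      # parent opens the shared socket
    for _ in range(workers):
        if os.fork() == 0:              # child inherits the socket
            try:
                bjoern.run()            # blocks: one event loop per child
            finally:
                os._exit(0)
    for _ in range(workers):            # parent just waits for children
        os.wait()
```

The kernel distributes incoming connections across the children, so each blocking request only stalls one of the `workers` event loops.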

antoncom commented 3 years ago

Interesting read! I doubt Gunicorn is that slow though... are you sure the benchmarks were comparable?

I'm not sure. The article was written by someone more experienced than me; I believed him, and that's why I picked bjoern :). Anyway, it worked out well for my project (before that I had used uWSGI with little luck), and I thank you for bjoern!

ak4nv commented 3 years ago

For a single bjoern instance to be concurrent the code will have to literally Python-yield every now and then or be implemented in some other way that uses the Python sequence protocol and computes parts of the response not before it is asked to.

Exactly! Just a list of popular I/O libraries (database drivers, http/amqp/etc. clients) noting which are bjoern-friendly and which are not.
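One rough, hypothetical way to probe this yourself (not an official bjoern tool): time the gaps between chunks of a WSGI response iterable. A long gap between chunks is time during which a single-threaded event loop would be blocked:

```python
import time

def max_chunk_gap(wsgi_iterable):
    """Consume a WSGI response iterable and return the longest pause
    (in seconds) between consecutive chunks.  During such a pause a
    single-threaded server like bjoern cannot serve anyone else."""
    worst, last = 0.0, time.monotonic()
    for _chunk in wsgi_iterable:
        now = time.monotonic()
        worst = max(worst, now - last)
        last = now
    return worst

def blocky_app():
    # simulates a library call that blocks before the first chunk
    time.sleep(0.05)
    yield b"hello"
```

`max_chunk_gap(blocky_app())` reports a gap of at least 0.05 s, whereas a lazily computed response keeps every gap small.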

jonashaag commented 3 years ago

My point is that it probably doesn't make sense to select your libraries like that. If you really need this very specific behaviour, you probably already have 100s of servers running a very specific workload, and maybe then you're better off building your own I/O library for your very specific use case. I assume you don't have 100s of servers running bjoern; can you tell us what your use case is? What problem, specifically, are you trying to optimize?

ak4nv commented 3 years ago

My case is very simple and occurs quite often: for each request, I need to publish a message to RabbitMQ. If I can't find a friendly library, I'll switch to gunicorn+gevent.

jonashaag commented 3 years ago

In this case I recommend just spawning N bjoern instances, benchmarking throughput, and then increasing N until RabbitMQ becomes the bottleneck :)

jonashaag commented 3 years ago

Closing due to inactivity