morlandi / sinewave

Real-time data acquisition, from Arduino to the web: My speech at PyCon UK 2019 and PyCon X Italia 2019

Question about performance #6

Open · Jacks349 opened this issue 3 years ago

Jacks349 commented 3 years ago

I've been using this method for a while and it works flawlessly; this approach is very powerful and underrated. I'm doing the following: some Python applications generate data on my backend, that data is sent to Django using Redis Pub/Sub -> Django receives the data -> Django publishes it to the connected users/clients through Django Channels.
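
Roughly, the bridge looks like this (a simplified sketch, not my actual code; the 'telemetry' Pub/Sub channel, the 'updates' group and the event type are placeholder names):

# Redis Pub/Sub listener that forwards each message to a Django Channels
# group (run inside the Django/Channels process so settings are loaded).
import json

import redis
from asgiref.sync import async_to_sync
from channels.layers import get_channel_layer

def forward_redis_to_channels():
    r = redis.Redis(host='127.0.0.1', port=6379)
    pubsub = r.pubsub()
    pubsub.subscribe('telemetry')          # placeholder Pub/Sub channel name
    channel_layer = get_channel_layer()

    for message in pubsub.listen():
        if message['type'] != 'message':
            continue
        payload = json.loads(message['data'])
        # Fan the payload out to every consumer subscribed to the group
        async_to_sync(channel_layer.group_send)(
            'updates',
            {'type': 'data.message', 'payload': payload},
        )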

Now I'm wondering what limits this system might have. In particular, I'm thinking of scaling up my whole project to handle a lot more messages per second.

I know this is a naive question, but is there a limit to the number of messages per second I can send through this system? Or is the limit my hardware? In a scenario where I'm sending a few thousand messages per second, what could I do to relieve my system? For example: instead of having only one listener, having several of them connected to more Redis channels.

Note that in my case the client is not necessarily a browser; it may also be someone listening to my Django Channels app from an application, or from code. Also, the messages I send are very small JSON strings.

Thank you for your work!

morlandi commented 3 years ago

Thank you @Jacks349 for your interest in this tutorial project. Your question is really interesting.

Splitting messages over different Pub/Sub channels and having multiple listeners shouldn't help here; at least, this is what I understand from this discussion:

https://groups.google.com/g/redis-db/c/R09u__3Jzfk?pli=1

where Salvatore Sanfilippo shares some interesting considerations about Pub/Sub performance in Redis.

Redis is very fast, and my guess is that the WebSocket layer would be the real bottleneck, well before Pub/Sub. I have never run any benchmark on this, however.

I'd really like to collect some real data to measure the message rate limit. An interesting approach for this is described here:

A Tale of Two Queues

http://blog.jupo.org/2013/02/23/a-tale-of-two-queues/
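
In the same spirit, here is a rough sketch of how Pub/Sub alone could be timed with redis-py (channel name and message count are arbitrary, and this deliberately leaves Channels/WebSocket out of the picture):

# Publish N small JSON messages and time how long a subscriber takes to
# drain them; gives an order-of-magnitude figure for Redis Pub/Sub only.
import json
import time

import redis

N = 20_000  # arbitrary; kept small to stay within Redis client output buffers
r = redis.Redis(host='127.0.0.1', port=6379)
pubsub = r.pubsub()
pubsub.subscribe('bench')                  # arbitrary channel name

start = time.perf_counter()
for i in range(N):
    r.publish('bench', json.dumps({'seq': i, 'value': 42}))

received = 0
while received < N:
    message = pubsub.get_message(timeout=1.0)
    if message and message['type'] == 'message':
        received += 1

elapsed = time.perf_counter() - start
print(f"{N / elapsed:.0f} messages/second end-to-end")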

I'll leave this Issue open for suggestions; thank you for raising this topic.

Jacks349 commented 3 years ago

Thanks a lot! This is very interesting. I agree that the bottleneck here is likely to be the Django Channels side. I have been using this system and it works flawlessly; that's why I want to see what happens if I scale it up to a thousand or more messages per second, and I realize that's a very ambitious goal. A trick I used to avoid problems was to configure Django Channels with a large Redis capacity and a very low expiry: capacity 1500 with an expiry of 1 second. That works for me because Django Channels only has to deliver the messages to the clients/users and doesn't need to store anything:

CHANNEL_LAYERS = {
    'default': {
        'BACKEND': 'channels_redis.core.RedisChannelLayer',
        'CONFIG': {
            'hosts': [('127.0.0.1', 6379)],
            'capacity': 1500,  # max messages queued per channel before new ones are dropped
            'expiry': 1,       # seconds after which an undelivered message is discarded
        },
    },
}
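
For context, a simplified sketch of the delivery side these settings apply to: a consumer that joins a broadcast group, plus a helper that fans a message out to the group. Group and event-type names are placeholders, not my actual code.

# Delivery side: a consumer joins a broadcast group, and a helper fans
# each message out to it via the channel layer configured above.
from asgiref.sync import async_to_sync
from channels.generic.websocket import AsyncJsonWebsocketConsumer
from channels.layers import get_channel_layer

class StreamConsumer(AsyncJsonWebsocketConsumer):
    async def connect(self):
        await self.channel_layer.group_add('updates', self.channel_name)
        await self.accept()

    async def disconnect(self, close_code):
        await self.channel_layer.group_discard('updates', self.channel_name)

    # Called for {'type': 'data.message', ...} events sent to the group
    async def data_message(self, event):
        await self.send_json(event['payload'])

def broadcast(payload):
    # group_send queues the event once per connected consumer; with
    # expiry=1 an undelivered event is discarded after one second.
    channel_layer = get_channel_layer()
    async_to_sync(channel_layer.group_send)(
        'updates',
        {'type': 'data.message', 'payload': payload},
    )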