moccu / django-omnibus

django-omnibus is a Django library which helps to create websocket-based connections between a browser and a server to deliver messages.
BSD 3-Clause "New" or "Revised" License

very first connection devours message #26

Open weltoph opened 9 years ago

weltoph commented 9 years ago

Using omnibus we encountered a problem: Basically, we try to implement a chat between two peers. Although the server is able to send to the clients (tested) and the message from the client reaches the server (tested), the very first message after a restart of omnibusd never reaches the other client. After having lost this message, everything is fine. Is there a solution or can anyone confirm that this is a thing?

synthead commented 9 years ago

I'm experiencing something similar. I have a view that kicks off a Celery task, which saves a model if successful. There's a receiver set on that model to call publish(), and if I restart Celery, I reliably lose the first four WebSocket messages. It doesn't matter if I restart it and try immediately or wait a couple minutes; for some reason, it's always four messages.

Here's the repository's commit that has the issue. This is with Django 1.7.3, Celery 3.1.17, and Omnibus 0.2.0 (the latest stable versions as of this writing). Getting this environment up to test it is slightly involved, so here's a video showing the related code, the console messages, and what's seen in the browser.

If there's anything I can test to aid your development, please let me know. I'm more than willing to help.

EnTeQuAk commented 9 years ago

This unfortunately is a known issue. The problem here is that we need to write something like a monitoring daemon that watches the 0mq connection between peers. It actually takes quite a while to establish a 0mq connection, and during these first few moments (it's unspecified how long this takes) messages can be lost. It's usually only one message, though, as the whole system kind of magically establishes itself once one message has been sent. This is a shortcoming of the publish-subscribe pattern we are using in django-omnibus.

Quote:

There is one more important thing to know about PUB-SUB sockets: you do not know precisely when a subscriber starts to get messages. Even if you start a subscriber, wait a while, and then start the publisher, the subscriber will always miss the first messages that the publisher sends. This is because as the subscriber connects to the publisher (something that takes a small but non-zero time), the publisher may already be sending messages out.

http://zguide.zeromq.org/page:all

There's some work ongoing to fix this, but there's still a long way to go. I'm very open to patches and ideas!
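
One possible direction, as a minimal sketch only: since a PUB socket drops whatever is sent before a subscriber has finished connecting, the publishing side could queue messages until the transport confirms a subscriber and then flush them in order. The sketch below is transport-agnostic and uses only the standard library. `BufferedPublisher`, `send`, `max_pending`, and `mark_ready` are hypothetical names for illustration and are not part of django-omnibus or pyzmq. How the "ready" signal is obtained is also an assumption; with ZeroMQ it could come from a separate handshake or from the subscription notifications an XPUB socket receives.

```python
from collections import deque
from typing import Callable, Deque


class BufferedPublisher:
    """Hold outgoing messages until a subscriber is known to be connected.

    Hypothetical helper for illustration; not part of django-omnibus or pyzmq.
    """

    def __init__(self, send: Callable[[bytes], None], max_pending: int = 1000) -> None:
        # `send` is the real transport call, e.g. the send method of a PUB socket.
        self._send = send
        # Bounded buffer: once full, the oldest pending message is discarded.
        self._pending: Deque[bytes] = deque(maxlen=max_pending)
        self._ready = False

    def publish(self, message: bytes) -> None:
        if self._ready:
            self._send(message)
        else:
            # Connection not confirmed yet: keep the message instead of losing it.
            self._pending.append(message)

    def mark_ready(self) -> None:
        """Call once the transport confirms a subscriber, then flush in order."""
        self._ready = True
        while self._pending:
            self._send(self._pending.popleft())


if __name__ == "__main__":
    delivered = []
    publisher = BufferedPublisher(delivered.append)
    publisher.publish(b"first")   # buffered, nothing delivered yet
    publisher.mark_ready()        # flushes the buffered message
    publisher.publish(b"second")  # sent directly from now on
    print(delivered)
```

This sketch is not thread-safe and only covers the startup window. A real fix would also have to reset the ready state when the peer reconnects, which is where the monitoring daemon mentioned above would come in.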