Open sergey-komissarov opened 5 years ago
Does this bug still exist? Maybe channels needs tests that simulate cancellation of every network operation.
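A minimal sketch of what such a cancellation test could look like. It uses plain Python lists in place of redis, and the `receive` coroutine and `cancel_mid_receive` helper are hypothetical names for illustration, not channels_redis code. The `asyncio.sleep(0)` stands in for the network round trip during which cancellation can land.

```python
import asyncio


async def receive(channel, backup):
    # Hypothetical stand-in for a receive call: the message is moved to the
    # backup list first (as a server-side pop-and-push would do) ...
    msg = channel.pop()
    backup.append(msg)
    # ... then the coroutine waits on the "network". Cancellation can land here.
    await asyncio.sleep(0)
    # Only reached if the task was not cancelled: drop the backup copy.
    backup.remove(msg)
    return msg


async def cancel_mid_receive():
    channel, backup = ["msg1"], []
    task = asyncio.create_task(receive(channel, backup))
    await asyncio.sleep(0)  # let the task run up to its first await
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return channel, backup


channel, backup = asyncio.run(cancel_mid_receive())
# The message has left the channel list but is still sitting in the backup list.
print(channel, backup)
```

A real test would need to repeat this for each await point in the receive path, which is the kind of coverage the comment above asks for.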
@HMaker I believe this bug is fixed. Last time I checked, channels used a priority queue based on message creation time instead of a list. This should keep messages in order.
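To illustrate why ordering by creation time is robust when a message is put back late, here is a small in-memory model. It is a sketch only: `deque` and `heapq` stand in for the redis structures, the timestamps are made up, and the function names are hypothetical.

```python
import heapq
from collections import deque

# (creation time, payload); the timestamps are illustrative values.
msg1, msg2, msg3 = (1.0, "msg1"), (2.0, "msg2"), (3.0, "msg3")


def consume_from_list(waiting, restored):
    """List-based queue: producers push on the left, consumers pop on the right."""
    queue = deque()
    for message in waiting:
        queue.appendleft(message)
    # The restored message is pushed like a new one, behind everything waiting.
    queue.appendleft(restored)
    order = []
    while queue:
        order.append(queue.pop()[1])
    return order


def consume_from_priority_queue(waiting, restored):
    """Priority queue keyed by creation time: insertion position does not matter."""
    heap = list(waiting)
    heapq.heapify(heap)
    heapq.heappush(heap, restored)
    order = []
    while heap:
        order.append(heapq.heappop(heap)[1])
    return order


list_order = consume_from_list([msg2, msg3], msg1)
heap_order = consume_from_priority_queue([msg2, msg3], msg1)
print(list_order)  # msg1 comes out last
print(heap_order)  # msg1 comes out first
```

In the list model the restored `msg1` is consumed after `msg2` and `msg3`, while the time-keyed queue returns it first regardless of when it was reinserted.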
Hello, we are using a group to send notifications from a channels worker process to the websocket consumer in the server process. There is exactly one worker in the group and exactly one consumer on the server side. Sometimes the message order is changed on the server side.
Detailed investigation shows the following events on the server side:
`specific.innxCCgU!NcMqIZztkRQJ`: `receive_lock` acquired and `receive_single` started.
`receive_lock` was released:
`specific.innxCCgU!EKHzfqiTYmHm` acquired `receive_lock` and found redis lists in the following state:
It seems that the redis server had already moved the message to the backup list during the `brpoplpush` operation, but the message body was not fully received when the task was cancelled.
`msg1` is placed at the beginning of the list `asgi:specific.innxCCgU!` and `msg2` is returned by receive.
The next receive returns `msg3`, and only the one after that gets `msg1` from redis.
The simplest way to fix it is to modify the script at core.py#L316 so that messages from the backup list are placed at the end of the channel list. I also think the redis backup list cleanup must be done under the `receive_lock`, because another call to receive may acquire the lock and move the message from the backup list back to the channel list before the cleanup coroutine is called.
Our environment setup (docker image based on ubuntu, with daphne as the server):
python version: 3.7.1
installed packages: