Closed seppo0010 closed 8 years ago
I was able to reproduce this consistently. The server crashes on JSCAN after
ACKJOB has been issued more than once for the same job.
127.0.0.1:7711> addjob queue1 job1 0
DId1dbbca16b854c7ec98332a4019bcbf143791c3f05a0SQ
127.0.0.1:7711> ackjob DId1dbbca16b854c7ec98332a4019bcbf143791c3f05a0SQ
(integer) 1
127.0.0.1:7711> jscan 0 queue queue1
1) "0"
2) (empty list or set)
127.0.0.1:7711> ackjob DId1dbbca16b854c7ec98332a4019bcbf143791c3f05a0SQ
(integer) 0
127.0.0.1:7711> jscan 0 queue queue1
Could not connect to Disque at 127.0.0.1:7711: Connection refused
Hi @seppo0010, there is a code path that may cause this. :-) When no job with that ID exists (in your case it was already acknowledged and deleted), ackjobCommand creates a dummy job, but that job is created with its queue set to NULL (ack.c-L216-217, job.c-L178). Then job.c-L1416 passes the NULL job->queue to equalStringObjects, which causes the segfault.
I tried to fix it in https://github.com/antirez/disque/pull/114
@seppo0010 Thanks to your clear report, this was solved quickly :-)
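For anyone following along, the failure mode described above can be sketched in isolation. This is only a minimal illustration and not the code from the pull request: the `job` struct is a hypothetical, simplified stand-in for Disque's own structure, `job_matches_queue` is an invented helper name, and plain C strings with `strcmp` replace the server's string objects and `equalStringObjects`. The idea it shows is simply that a job with no queue should be skipped before any comparison dereferences the pointer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for Disque's job structure.
 * Only the field relevant to the crash is modelled here. */
typedef struct job {
    const char *queue; /* NULL for a dummy job left by a duplicate ACKJOB */
} job;

/* Illustrative helper (not a Disque function).
 * Returns 1 if the job passes the queue filter, 0 otherwise.
 * A job without a queue never matches a filter, so the string
 * comparison is never reached with a NULL pointer. */
int job_matches_queue(const job *j, const char *queue_filter) {
    if (queue_filter == NULL) return 1;          /* no filter: everything matches */
    if (j == NULL || j->queue == NULL) return 0; /* dummy job: skip it */
    return strcmp(j->queue, queue_filter) == 0;
}
```

With a guard like this, a scan filtered by `queue1` would skip the dummy job instead of comparing against a NULL queue. Whether the actual fix guards the comparison or avoids creating the queue-less job in the first place is not something this sketch tries to answer.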
Just confirming that I encountered the same error in my tests. It only happens when using the JSCAN command after sending a duplicate ACK.
Thanks, I'm going to merge @sunheehnus's fix between today and tomorrow. Tomorrow I'll work all day on Disque, btw, so I'll check all the other issues. Thanks.
p.s. It would be great to get ACKs that @sunheehnus's fix is enough to stop the problem in @Koed00's and @seppo0010's tests.
Yes, I guess this situation is more likely to happen in a one-node cluster :-)
I have some time to run the pull request against my tests. Will report back.
Thanks!
Works great. Tests are passing again. Carry on.
:fireworks:
Got the following crash while creating a client library. The commands being executed related to JSCAN were:
No other JSCAN parameters were used. I did not run a full memory check yet.