Polyconseil / aioamqp

AMQP implementation using asyncio
Other
280 stars 88 forks source link

Fix waiter already exists issue #159

Closed nhumrich closed 2 years ago

nhumrich commented 6 years ago

Fixes #105

The problem with #105 is that the waiter never gets cleared out of the waiters list. So next time a consume is called, the waiter is sitting there causing a problem. This makes sure the waiter is removed correctly before returning the result.

notmeta commented 5 years ago

Bumping this, run into a problem which this PR will fix

dzen commented 5 years ago

I'm not comfortable with this PR since there is no test

nhumrich commented 5 years ago

Man, its been a year since I submitted this PR. At this time, I have just monkey patched this code into all of my projects. I have no recollection of how this bug surfaces itself. Anyone have example code of how to cause the error so I can write a test? I just have idea how to test this currently.

@dzen I understand your want for a test. I am not sure if it helps your comfort level at all, but I have had this monkey patched in my production services for over a year.

dzen commented 5 years ago

@nhumrich Is it possible to provide a way to reproduce ?

nhumrich commented 5 years ago

Thats my quesion in my previous comment:

Anyone have example code of how to cause the error so I can write a test?

It's been so long since I ran into this, (i now have this fix monkey patched into my stuff), I can't remember how to reproduce it. Maybe @notmeta can provide some input?

notmeta commented 5 years ago

I've since changed my implementation to prevent this from happening, however I was receiving this error by attempting to call basic_consume on a channel with an existing consumer (for reconnection purposes due to the previous basic_consume failing)

I worked around the issue by simply reconnecting (creating a new connection and channel entirely) rather than reusing the old one.

dzen commented 5 years ago

So we do need a waiter per consumer-tag. Imho this is the real fix. Note: the cancel method must remove them too:

This method asks the server to start a "consumer", which is a transient request for messages from a specific queue. Consumers last as long as the channel they were declared on, or until the client cancels them.

nhumrich commented 5 years ago

@dzen Ok, so I finally found out the real issue here. Its a race condition with how the waiters work. The current implementation only allows one of every action to happen at a time. All the tests are written in a sequential way, so this was never obvious. I added a test that does asyncio.gather() on every action, and every action causes a waiter issue.

This fix is probably not the best fix, and the whole implementation probably needs to be rethought. I did however push some fixes that at least make it work for this test and similar use cases. I didnt fix everything, so there will still be waiter issues for other methods and stuff. I am totally ok if you dont accept this PR in its current state. But I at least wanted to get the ball rolling in finding the right solution. Any ideas? Should we push this change for now, then come up with a better way to handle things?

nhumrich commented 5 years ago

bump

dzen commented 5 years ago

Hello Nick,

I was off this month, I will take care of this I the next few days.

Le lun. 1 avr. 2019 à 21:19, Nick Humrich notifications@github.com a écrit :

bump

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/Polyconseil/aioamqp/pull/159#issuecomment-478710344, or mute the thread https://github.com/notifications/unsubscribe-auth/AADEQXc2aUT9X8oNDaoX81PHk4EB2tjJks5vclvWgaJpZM4SOFpU .

-- Benoît CALVEZ Polyconseil | 26 rue de Berri | 75008 Paris

dzen commented 5 years ago

Hi @nhumrich.

I will merge this this week.

Thanks!

RemiCardona commented 2 years ago

PR pushed in commit acd871ac7f930892ad5994e07789b83d0cd4512b before 0.14.0 (november 2019), closing.