Open lukebakken opened 4 months ago
In the Java client, the behavior of a `Connection#createChannel` call is not to return unless a `channel.open-ok` frame is received. When the connection is blocked by a resource alarm, that will not happen until the operation times out or the alarm clears. So I guess that avoids the leak.
In Bunny, in addition to the behavior above, there's a predicate on `Bunny::Connection` that returns `true` when the connection is blocked.
If the Java client option would be too difficult or impractical to pull off, the Bunny solution plus throwing an exception might be a good enough approximation.
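A minimal sketch of the "predicate plus exception" idea, written in Python only for brevity. `Connection`, `ConnectionBlockedError`, `on_blocked`, `on_unblocked` and `create_channel` are hypothetical names for illustration; they are not the API of Bunny, the Java client, or the .NET client.

```python
class ConnectionBlockedError(RuntimeError):
    """Raised when a channel is requested on a blocked connection."""


class Connection:
    """Hypothetical client-side connection object (illustration only)."""

    def __init__(self):
        self._blocked = False
        self._next_channel_id = 1

    def on_blocked(self):
        # Would be called when the broker sends connection.blocked.
        self._blocked = True

    def on_unblocked(self):
        # Would be called when the broker sends connection.unblocked.
        self._blocked = False

    @property
    def blocked(self):
        # The predicate: True while the connection is blocked.
        return self._blocked

    def create_channel(self):
        # Fail fast instead of sending channel.open and waiting for a
        # channel.open-ok that cannot arrive until the alarm clears.
        if self._blocked:
            raise ConnectionBlockedError("connection is blocked by a resource alarm")
        channel_id = self._next_channel_id
        self._next_channel_id += 1
        return channel_id
```

Because no `channel.open` is sent while the flag is set, nothing is left behind on the broker in this model. A real client would still have to cope with the race where the blocked notification arrives after `channel.open` has already gone out.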
Thanks @michaelklishin. I moved this to an issue to try to reproduce it.
I was the original author of the discussion post. I did try this again recently and the behaviour still seems to be there.
Hi @neilgreatorex, thanks for following up.
I took your code and added it to a test to demonstrate that this issue is fixed in the latest code: https://github.com/rabbitmq/rabbitmq-dotnet-client/pull/1587
Note that the `TimeoutException` when creating channels on a blocked connection does not happen anymore. I'm not sure why, to be honest, but it's good enough for me.
Hi @lukebakken. Unfortunately I can still reproduce the issue using the code in master. I have shared my test class at https://gist.github.com/neilgreatorex/63e163c933ca8b2836b987eeac85a7b4.
To test: trigger a memory alarm (`rabbitmqctl set_vm_memory_high_watermark absolute 10`), then confirm (`rabbitmqctl list_channels`) that RabbitMQ lists channels that the client has no reference to.
@neilgreatorex do you get `TimeoutException` thrown when you create the five channels in your env?
@lukebakken I get `TaskCanceledException`s now with the async methods
Thanks for the updated code, I had a bug in my test. I can reproduce this now.
Here is what is happening: when those `CreateChannelAsync` calls time out (due to not receiving a response from RabbitMQ), they should NOT be considered valid and added to the `channelTasks` array, even though the channels are considered open in RabbitMQ.
When the connection is unblocked, RabbitMQ sends the `channel.open-ok` response, but the task continuation already timed out (`TaskCanceledException`), so the client library can't do anything with them. The channels are "zombies" within RabbitMQ that will be closed when the connection closes.
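The sequence above can be modelled with a toy simulation. This is a sketch in Python with `asyncio`, not the .NET client's code: `FakeBroker`, `receive_channel_open`, `unblock` and `demo` are hypothetical names, and the only assumption is the behavior described here, that the broker registers the channel on `channel.open` and withholds its reply while the connection is blocked.

```python
import asyncio


class FakeBroker:
    """Toy broker: registers channels at once, replies only when unblocked."""

    def __init__(self):
        self.open_channels = set()
        self.pending_replies = {}  # channel_id -> future the client awaits

    def receive_channel_open(self, channel_id):
        # Broker-side state exists as soon as channel.open is processed.
        self.open_channels.add(channel_id)
        reply = asyncio.get_running_loop().create_future()
        self.pending_replies[channel_id] = reply
        return reply

    def unblock(self):
        # Deliver the withheld channel.open-ok replies. Replies whose
        # continuation was already cancelled have nobody to receive them.
        orphaned = []
        for channel_id, reply in self.pending_replies.items():
            if reply.cancelled():
                orphaned.append(channel_id)
            else:
                reply.set_result("channel.open-ok")
        self.pending_replies.clear()
        return orphaned


async def demo():
    broker = FakeBroker()
    client_channels = []
    for channel_id in range(1, 6):
        try:
            # The client-side continuation gives up after a short timeout.
            await asyncio.wait_for(broker.receive_channel_open(channel_id), timeout=0.01)
            client_channels.append(channel_id)
        except asyncio.TimeoutError:
            pass  # the client forgets the channel; the broker does not
    orphaned = broker.unblock()
    return client_channels, sorted(broker.open_channels), orphaned


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this model the client ends up holding no channels while the broker still lists all five, which is the "zombie" state described above.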
Maybe the next version of this library can keep track of "stale, timed-out" continuations for some time (perhaps when the connection is blocked) to see if a response is sent back, but version 7 will not do that.
cc @michaelklishin @Gsantomaggio I thought you'd find this case interesting.
@lukebakken Would it be possible that if the library received a `channel.open-ok` for a channel ID that it wasn't expecting (i.e. has no continuation for), it could immediately close the channel? This is just a suggestion without knowledge of how it all works and maybe it's not helpful. I don't even know if the library can tell if the channel ID is valid or not 😃
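A rough sketch of that suggestion, again as a Python toy rather than the library's actual dispatch code. `FrameDispatcher`, `expect_open_ok`, `abandon`, `on_open_ok` and the `send_frame` callback are all hypothetical names introduced for illustration.

```python
class FrameDispatcher:
    """Hypothetical client-side dispatcher keyed by channel ID."""

    def __init__(self, send_frame):
        self._send_frame = send_frame  # callable(channel_id, method_name)
        self._pending = {}             # channel_id -> callback awaiting open-ok

    def expect_open_ok(self, channel_id, callback):
        # Register a continuation when channel.open is sent.
        self._pending[channel_id] = callback

    def abandon(self, channel_id):
        # Called when the continuation times out or is cancelled.
        self._pending.pop(channel_id, None)

    def on_open_ok(self, channel_id):
        callback = self._pending.pop(channel_id, None)
        if callback is None:
            # Nobody is waiting for this channel any more: close it right
            # away instead of leaving it open on the broker.
            self._send_frame(channel_id, "channel.close")
            return False
        callback(channel_id)
        return True
```

For example, if channel 2's continuation was abandoned before the reply arrived:

```python
sent = []
opened = []
dispatcher = FrameDispatcher(lambda channel_id, method: sent.append((channel_id, method)))
dispatcher.expect_open_ok(1, opened.append)
dispatcher.expect_open_ok(2, opened.append)
dispatcher.abandon(2)
dispatcher.on_open_ok(1)  # delivered to the waiting continuation
dispatcher.on_open_ok(2)  # unexpected, so channel.close is sent
```

A real implementation would presumably also need to wait for `channel.close-ok` before reusing the channel ID; that part is left out here.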
@neilgreatorex that's a good idea.
Discussed in https://github.com/rabbitmq/rabbitmq-dotnet-client/discussions/1131