rabbitmq / rabbitmq-dotnet-client

RabbitMQ .NET client for .NET Standard 2.0+ and .NET 4.6.2+
https://www.rabbitmq.com/dotnet.html
Other
2.07k stars 579 forks source link

Calling channel.CloseAsync from within AsyncEventingBasicConsumer handler causes deadlock #1567

Closed mogzol closed 4 months ago

mogzol commented 4 months ago

Describe the bug

I am using the 7.0.0-alpha.5 release of the RabbitMQ library. In my code, under certain scenarios, the channel will be closed from within the Recieved handler of an AsyncEventingBasicConsumer. Trying to do this, however, causes a deadlock. This worked fine in v6 of this library.

Reproduction steps

Here is a simple program which reproduces the issue:

using System.Text;
using RabbitMQ.Client;
using RabbitMQ.Client.Events;

internal class Program
{
  private static async Task Main(string[] args)
  {
    var connection = await new ConnectionFactory() { DispatchConsumersAsync = true }.CreateConnectionAsync();
    var channel = await connection.CreateChannelAsync();

    await channel.QueueDeclareAsync("testing");
    await channel.QueueBindAsync("testing", "amq.topic", "testing");

    var cancel = new CancellationTokenSource();

    var consumer = new AsyncEventingBasicConsumer(channel);
    consumer.Received += async (_, eventArgs) =>
    {
      Console.WriteLine("Received message");
      await channel.BasicCancelAsync(eventArgs.ConsumerTag);
      Console.WriteLine("Cancelled consumer");
      await channel.CloseAsync();
      Console.WriteLine("Closed channel");
      channel.Dispose();
      Console.WriteLine("Disposed channel");
      cancel.Cancel();
    };

    await channel.BasicConsumeAsync(consumer, "testing", true);

    await channel.BasicPublishAsync("amq.topic", "testing", new BasicProperties(), Encoding.UTF8.GetBytes("Hello, World!"));
    Console.WriteLine("Published message");

    try { await Task.Delay(Timeout.Infinite, cancel.Token); } catch { };
  }
}

Running this, the app will hang before "Closed channel" is logged. If you try this same code with v6 of the library (without the new Async calls) then all messaged are logged and the app exits.

Expected behavior

Closing a channel from within a consumer should not deadlock.

Additional context

No response

lukebakken commented 4 months ago

Thanks for the report and the code. I'll take a look to see if a fix is reasonably easy, but this sort of operation is discouraged.

I'll probably make closing and disposing a channel from within an event handler explicitly disallowed in version 7 of the library.

lukebakken commented 4 months ago

Thanks for testing version 7, we really appreciate it.

I added a test in #1568 to show the correct way to cancel a consumer. Do NOT close the channel from within the callback.

It's not immediately obvious why the deadlock happens, and I may not be able to figure it out for version 7. Either I will fix it, or calling CloseAsync / Dispose from an event handle will throw InvalidOperationException.

michaelklishin commented 4 months ago

@lukebakken FWIW we had a similar problem almost a decade ago in the Java client. I could not find a way to tell if the channel was closed from within a handler, so we had to handle it one way or another, even though I agree that using a "self-terminating" async consumer doesn't make much sense.

IIRC in the Java client it was necessary or at least seemed so in scenarios where an application has to shut down after receiving a signal.

mogzol commented 4 months ago

Thanks. My actual code where this happens is more complex, I'm setting a callback which closes the channel, which is normally called from outside the handler, but there's one scenario where it ends up being called from within the handler, causing this issue. Luckily at least in my case I don't actually need to await the channel closing, so I can work around it by closing the channel without an await, like:

_ = channel.CloseAsync().ContinueWith((_) => channel.Dispose());
lukebakken commented 4 months ago

@mogzol thanks for that code, it is a good workaround for this issue.

I couldn't find a "simple" way to block this operation using the current codebase. For version 8 we may be able to fix this issue or block it (https://github.com/rabbitmq/rabbitmq-dotnet-client/issues/1419)