rebus-org / Rebus.RabbitMq

:bus: RabbitMQ transport for Rebus
https://mookid.dk/category/rebus

Several error messages when lots of messages are to be processed #30

Closed arielmoraes closed 5 years ago

arielmoraes commented 6 years ago

In our scenario we have lots of messages being published and consumed, but when the rate of incoming and delivered messages is high, Rebus suddenly stops publishing, delivering, or acking messages (tracked using the management web UI). This happens globally to all connected publishers and consumers (all queues stop). The following 3 exceptions are randomly thrown:

Could not 'GetOrAdd' item with key 'rabbitmq-current-model' as type RabbitMQ.Client.IModel ---> System.TimeoutException: The operation has timed out.

Already Closed: The AMQP operation was interrupted: AMQP close-reason, initiated by Peer, code=406, text="PRECONDITION_FAILED - unknown delivery tag 2098"

Queue throw EndOfStreamException

What could be the cause of those exceptions? Are they related to the high rate of messaging?

mookid8000 commented 6 years ago

Can you tell me some more about the situation?

How many worker threads and which degree of parallelism are configured in each endpoint?

When you say "(...) when the rate of incoming and deliver is high (...)" do you know which ballpark we're in?

mookid8000 commented 5 years ago

Hi @arielmoraes , have you found out anything more about this issue?

mookid8000 commented 5 years ago

Hi @arielmoraes , what's the status of this?

viacheslave commented 5 years ago

I'm running into the same situation. At some rates, not really high, about 15-20 messages / second, the RabbitMQ server stops accepting messages from publishers. It also blocks all other publishers and subscribers, rendering the server unusable.

I'm getting Could not 'GetOrAdd' item with key 'rabbitmq-current-model' as type RabbitMQ.Client.IModel ---> System.TimeoutException: The operation has timed out.

What I'm noticing from UI, while that's happening:

My setup: RabbitMQ server v3.7.7. I'm using Rebus for .NET over the .NET RabbitMQ client with a pretty simple config, in synchronous mode, with the number of workers set to 15 and the max degree of parallelism set to 15.

I suspect that the number of channels is the reason, but I'm not sure I can configure it from the framework. In normal mode every connection uses one channel, from what I can tell from the UI.
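
For reference, a setup like the one described above (15 workers, max parallelism of 15) might be configured roughly as follows. This is only a sketch: the connection string and queue name are hypothetical placeholders, no handlers are registered, and it does not show whatever "synchronous mode" refers to in the actual configuration.

```csharp
using System;
using Rebus.Activation;
using Rebus.Config;

// Hypothetical values, not taken from the actual setup in this thread.
const string connectionString = "amqp://guest:guest@localhost";
const string inputQueueName = "my-endpoint";

// The activator owns the bus and disposes it when it goes out of scope.
using var activator = new BuiltinHandlerActivator();

Configure.With(activator)
    .Transport(t => t.UseRabbitMq(connectionString, inputQueueName))
    .Options(o =>
    {
        // Number of worker threads receiving messages from the input queue.
        o.SetNumberOfWorkers(15);

        // Upper bound on the number of messages handled concurrently.
        o.SetMaxParallelism(15);
    })
    .Start();

// Keep the process alive so the workers can keep receiving.
Console.WriteLine("Bus started. Press ENTER to quit.");
Console.ReadLine();
```

Nothing in these options sets the number of RabbitMQ channels directly. How channels are created and shared is up to the transport.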

mookid8000 commented 5 years ago

I've made some adjustments to how Rebus' RabbitMQ transport manages RabbitMQ channels.

I think it solves a big bunch of problems, both performance not being as great as one could wish for, and these concurrency issues. 🚀

It's out as Rebus.RabbitMq 5.0.0-b04. It would be awesome if you would try it out and report your experiences back here! 👍

mookid8000 commented 5 years ago

Hi @arielmoraes and @viacheslave , have any of you had the chance to try 5.0.0-b04 ?

mookid8000 commented 5 years ago

I've had positive responses from other places regarding this version, so I'm closing this issue for now. Feel free to come back here, if the problem persists somehow.