onebeyond / rascal

A config driven wrapper for amqp.node supporting multi-host connections, automatic error recovery, redelivery flood protection, transparent encryption / decryption and channel pooling.
MIT License

No channels left to allocate #225

Closed SaiKrishnaMaddu closed 11 months ago

SaiKrishnaMaddu commented 11 months ago

In our microservices architecture, two services communicate through RabbitMQ queues. However, during the creation of new queues and the publishing of messages, we encounter the error "no channels left to allocate."

Queues are created dynamically whenever an onboard queue request reaches our microservice.

How can channels be freed or disconnected, and is there any way to get the number of channels that are in an active state?

We are also facing an intermittent issue while publishing messages:

Error: PublicationSession.:94 - EXCEPTION Error while publishing message: ResourceRequest timed out
2023-12-18T09:45:00.397702337Z Call Stack:
2023-12-18T09:45:00.397711397Z     at PublicationSession. (/usr/src/app/cb-integration-common/src/queue-manager.js:94:16)
2023-12-18T09:45:00.397717917Z     at PublicationSession.emit (node:events:517:28)
2023-12-18T09:45:00.397723937Z     at /usr/src/app/node_modules/rascal/lib/amqp/Publication.js:96:31
2023-12-18T09:45:00.397729487Z     at Object.callback (/usr/src/app/node_modules/rascal/lib/amqp/Vhost.js:279:25)
2023-12-18T09:45:00.397734687Z     at /usr/src/app/node_modules/async/dist/async.js:1559:26
2023-12-18T09:45:00.397740217Z     at /usr/src/app/node_modules/async/dist/async.js:329:20

cressie176 commented 11 months ago

Hi @SaiKrishnaMaddu,

By default, Rascal limits the number of channels available to 100 per connection. This limit is shared by both publications and subscriptions; however, the two obtain channels in different ways.

Publications use a channel pool, configured with a minimum and maximum number of channels. When you publish a message, Rascal borrows a channel from the pool, sends the message, and returns the channel. If all pooled channels are in use and the pool has spare capacity, it will attempt to acquire another channel. If the pool cannot acquire another channel within a period of time, you will receive the ResourceRequest timed out error. After a period of inactivity, surplus channels may be destroyed.
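The borrow, grow, and time-out behaviour described above can be sketched with a toy pool. This is an illustrative model only, not Rascal's implementation: the class name `ToyChannelPool`, its options, and the `createChannel` callback are all hypothetical, and idle-channel eviction is left out.

```javascript
// Toy model of a publication channel pool. Illustrative only, not Rascal's code.
class ToyChannelPool {
  constructor({ max, acquireTimeoutMillis, createChannel }) {
    this.max = max; // upper bound on channels owned by this pool
    this.acquireTimeoutMillis = acquireTimeoutMillis;
    this.createChannel = createChannel; // hypothetical factory; throws if the connection has no channels left
    this.idle = []; // channels ready to be borrowed
    this.borrowed = 0; // channels currently lent out
    this.waiters = []; // acquire() calls waiting for a release
  }

  get size() {
    return this.idle.length + this.borrowed;
  }

  acquire() {
    // 1. Reuse an idle channel if one is available.
    if (this.idle.length > 0) {
      this.borrowed++;
      return Promise.resolve(this.idle.pop());
    }
    // 2. Otherwise grow the pool if it has spare capacity.
    if (this.size < this.max) {
      try {
        const channel = this.createChannel();
        this.borrowed++;
        return Promise.resolve(channel);
      } catch (err) {
        // The connection-level limit was hit: fall through and wait.
      }
    }
    // 3. Otherwise wait for a release, giving up after the timeout.
    return new Promise((resolve, reject) => {
      const waiter = { resolve, timer: null };
      waiter.timer = setTimeout(() => {
        this.waiters = this.waiters.filter((w) => w !== waiter);
        reject(new Error('ResourceRequest timed out'));
      }, this.acquireTimeoutMillis);
      this.waiters.push(waiter);
    });
  }

  release(channel) {
    const waiter = this.waiters.shift();
    if (waiter) {
      // Hand the channel straight to the next waiter; it stays borrowed.
      clearTimeout(waiter.timer);
      waiter.resolve(channel);
    } else {
      this.borrowed--;
      this.idle.push(channel);
    }
  }
}
```

In this model a publisher that never calls `release`, or a `createChannel` that keeps failing because the connection limit is exhausted, both surface as the same timeout, which is why the error alone does not say which limit was hit.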

In contrast, subscriptions do not use a pool. Each one is allocated a dedicated channel when you call subscribe (or maybe when you attach the on 'message' event handler). Channels are destroyed automatically when you unsubscribe.
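Because each subscription holds a channel until it is cancelled, a service that subscribes dynamically may want to track its subscriptions so they can be released. A minimal sketch follows, assuming a broker object shaped like Rascal's promise-based API (`subscribe(name)` resolving to a session with `on` and `cancel`); the `SubscriptionRegistry` class itself is hypothetical, and `name` is assumed to refer to a subscription the broker already knows about.

```javascript
// Hypothetical helper: tracks one subscription per name so it can be cancelled later.
// Assumes `broker.subscribe(name)` returns a promise of a session exposing
// on(event, handler) and cancel(), as I understand rascal's BrokerAsPromised to do.
class SubscriptionRegistry {
  constructor(broker) {
    this.broker = broker;
    this.active = new Map(); // subscription name -> session
  }

  async add(name, onMessage) {
    // Reuse the existing session rather than consuming another channel.
    if (this.active.has(name)) return this.active.get(name);
    const subscription = await this.broker.subscribe(name);
    subscription.on('message', onMessage);
    subscription.on('error', (err) => console.error(`Subscription ${name} failed:`, err));
    this.active.set(name, subscription);
    return subscription;
  }

  async remove(name) {
    const subscription = this.active.get(name);
    if (!subscription) return false;
    await subscription.cancel(); // unsubscribing is what frees the dedicated channel
    this.active.delete(name);
    return true;
  }

  get count() {
    return this.active.size;
  }
}
```

The `count` getter gives a rough application-side view of how many channels subscriptions are holding, which can be compared with what the broker reports.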

So depending on your pool configuration, how frequently you publish messages, and how many queues you are subscribing to, you could exceed the channel allocation. That prevents the pool from acquiring more channels and prevents new subscriptions from being created. You can increase the channel allocation by specifying a value for channelMax in your connection configuration. I believe you can also set channelMax=0 to obtain up to 65535 channels.
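As a rough sketch of where these settings might live and how to sanity-check them, the snippet below shows a config fragment and a worst-case estimate. The key names (`connection.options.channelMax`, `publicationChannelPools.regularPool` / `confirmPool`) are written from my reading of the Rascal README and should be verified against your version; the pool sizes are made-up values and `estimateChannelDemand` is a hypothetical helper that ignores any channels Rascal uses internally.

```javascript
// Config fragment. Key names are assumptions to verify; pool sizes are illustrative.
const config = {
  vhosts: {
    '/': {
      connection: {
        options: {
          channelMax: 1000, // per-connection channel limit (the thread gives 100 as the default)
        },
      },
      publicationChannelPools: {
        regularPool: { min: 1, max: 10 },
        confirmPool: { min: 1, max: 10 },
      },
    },
  },
};

// Hypothetical helper: worst-case channel demand for one vhost connection,
// counting every pool at its maximum plus one channel per subscription.
function estimateChannelDemand(vhostConfig, subscriptionCount) {
  const pools = Object.values(vhostConfig.publicationChannelPools || {});
  const poolMax = pools.reduce((sum, pool) => sum + (pool.max || 0), 0);
  const demand = poolMax + subscriptionCount;
  const configured = vhostConfig.connection.options.channelMax;
  // Treat 0 as "up to 65535", following the unconfirmed note in the thread.
  const limit = configured === 0 ? 65535 : configured;
  return { demand, limit, withinLimit: demand <= limit };
}

// Example: 200 dynamically created subscriptions on this vhost.
const estimate = estimateChannelDemand(config.vhosts['/'], 200);
```

With the values above the estimate is 220 channels against a limit of 1000; with the default limit of 100 the same workload would not fit.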

cressie176 commented 11 months ago

is there any way to get the number of channels that are in an active state

You can see the number of channels in the RabbitMQ Management console. You can also enable Rascal's debug output by setting the DEBUG environment variable to rascal:Vhost. However, this could get very noisy in production.
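If the management plugin is enabled, the same channel information can also be read programmatically from its HTTP API. The sketch below assumes the `GET /api/channels` endpoint and a `connection_details.name` field on each returned channel; the base URL and credentials are placeholders, and the global `fetch` requires Node 18 or later.

```javascript
// Pure helper: count channel objects (as returned by the management API) per connection.
function countChannelsByConnection(channels) {
  const counts = {};
  for (const channel of channels) {
    // `connection_details.name` is an assumed field; fall back to 'unknown' if absent.
    const name = (channel.connection_details && channel.connection_details.name) || 'unknown';
    counts[name] = (counts[name] || 0) + 1;
  }
  return counts;
}

// Fetches the channel list from the management API. All arguments are placeholders,
// e.g. fetchChannelCounts('http://localhost:15672', 'guest', 'guest').
async function fetchChannelCounts(baseUrl, user, password) {
  const auth = Buffer.from(`${user}:${password}`).toString('base64');
  const response = await fetch(`${baseUrl}/api/channels`, {
    headers: { Authorization: `Basic ${auth}` },
  });
  if (!response.ok) throw new Error(`Management API returned ${response.status}`);
  return countChannelsByConnection(await response.json());
}
```

Logging these counts periodically gives a trend line that is much quieter than debug output and shows whether channel usage is creeping towards the limit.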

SaiKrishnaMaddu commented 11 months ago

Hi @cressie176,

Thanks for responding and helping to debug this issue.

If we set channelMax=0 or some other value, will it put any pressure on CPU and memory? Our current limits are 0.1 vCPU and 512 MB of memory.

If we increase channelMax, will this memory be sufficient?

cressie176 commented 11 months ago

Increasing the number of channels will have a small direct impact on the RabbitMQ broker memory. By implication though, if your application was previously constrained by having too few channels, and you increase them, then the broker will have more work to do and CPU / memory may increase.

The same may be true for your application. However, since it was previously accepting work that it was unable to complete because it was blocked by the channel pool, you may find your application needs less memory. If it is busier, however, CPU usage may increase.

You'll have to monitor and see.

It may also be worth registering listeners for the busy, ready, blocked and unblocked broker events, even if all you do is log them.

https://github.com/onebeyond/rascal/?tab=readme-ov-file#flow-control
https://github.com/onebeyond/rascal/?tab=readme-ov-file#blocked--unblocked
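A minimal way to log those four events is sketched below. It assumes only that the broker is an event emitter that emits `busy`, `ready`, `blocked` and `unblocked`, as the linked README sections describe; the helper name `registerFlowControlLogging` is hypothetical, and the payloads are logged as received rather than assuming their shape.

```javascript
// Hypothetical helper: log the flow-control events named in the thread.
// `broker` is any object with an on(event, handler) method, such as a Rascal broker.
function registerFlowControlLogging(broker, log = console.warn) {
  for (const event of ['busy', 'ready', 'blocked', 'unblocked']) {
    // Pass the event arguments through untouched so nothing is lost.
    broker.on(event, (...args) => log(`rascal broker event: ${event}`, ...args));
  }
  return broker;
}
```

Calling `registerFlowControlLogging(broker)` once, right after the broker is created, is enough; the optional second argument lets you route the messages to your own logger.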

SaiKrishnaMaddu commented 11 months ago

Thanks @cressie176

I changed channelMax to 1000 and it worked: we are now able to publish messages and create new subscriptions. In the short term we have increased the number of channels; in the longer term we are trying to manage with fewer queues and subscriptions.

We will register listeners for the suggested events.

Thank you so much for helping to get the issue resolved.