Open msrb opened 4 years ago
Considering https://www.mariuszwojcik.com/how-to-choose-prefetch-count-value-for-rabbitmq/ , https://github.com/fedora-infra/fedora-messaging/pull/224 and the fact that the prefetch
value is set to 0, I now wonder whether the issue we've seen in the ticket above could be linked to that configuration. Maybe setting the prefetch value to something other than 0
would mitigate this.
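For illustration, here is a minimal sketch of what the prefetch value controls. It is a toy in-memory model, not a real AMQP client, and `ToyQueue` and its methods are hypothetical names made up for this example. The one fact it relies on is that in AMQP 0-9-1 a prefetch count of 0 means "no limit" on unacknowledged deliveries.

```python
from collections import deque


class ToyQueue:
    """Toy stand-in for a broker queue with a single consumer (hypothetical)."""

    def __init__(self, messages, prefetch_count=0):
        self.ready = deque(messages)   # messages still waiting in the queue
        self.unacked = {}              # delivery tag -> message currently in flight
        self.prefetch_count = prefetch_count
        self._next_tag = 1

    def deliver(self):
        """Push messages to the consumer until the prefetch window is full."""
        delivered = []
        while self.ready and (
            self.prefetch_count == 0   # 0 is treated as "no limit"
            or len(self.unacked) < self.prefetch_count
        ):
            tag = self._next_tag
            self._next_tag += 1
            self.unacked[tag] = self.ready.popleft()
            delivered.append(tag)
        return delivered

    def ack(self, tag):
        """Acknowledge one delivery, which frees a slot in the prefetch window."""
        del self.unacked[tag]


unlimited = ToyQueue(range(1000), prefetch_count=0)
bounded = ToyQueue(range(1000), prefetch_count=10)

print(len(unlimited.deliver()))  # every queued message is in flight at once
print(len(bounded.deliver()))    # at most 10 are in flight until some are acked
```

With a prefetch of 0 the consumer holds every queued message unacknowledged at the same time; with a bounded prefetch, new deliveries only arrive as earlier ones are acked. Whether that difference has anything to do with the problem in this issue is exactly the open question.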
I don't see how they are related. The error is pretty clear: the channel is closed because the delivery tag (147816) is unknown. Delivery tags are how acks/nacks identify the message they refer to; they are server-assigned and scoped to a channel. At first glance, this looks like a client bug. For example, if the client receives a message on one channel and sends the ack for that message on a different channel, that would result in this error.
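To make the channel scoping concrete, here is a small sketch. It is a toy model rather than real client code: `ToyChannel` and `ChannelClosed` are hypothetical names, and the error text only imitates the kind of message a broker reports. It shows an ack for a tag issued on one channel being sent on another.

```python
class ChannelClosed(Exception):
    """Raised when the toy broker closes a channel (hypothetical name)."""


class ToyChannel:
    """Toy model of a channel that assigns its own delivery tags."""

    def __init__(self, name):
        self.name = name
        self.open = True
        self.unacked = {}      # delivery tag -> message, for this channel only
        self._next_tag = 1     # tags count up independently on each channel

    def deliver(self, message):
        """Broker side: hand a message to the consumer and return its tag."""
        tag = self._next_tag
        self._next_tag += 1
        self.unacked[tag] = message
        return tag

    def ack(self, tag):
        """Client side: acknowledge a delivery made on this channel."""
        if not self.open:
            raise ChannelClosed(f"channel {self.name} is already closed")
        if tag not in self.unacked:
            # An unknown tag is a protocol error, so the channel is closed.
            self.open = False
            raise ChannelClosed(f"PRECONDITION_FAILED - unknown delivery tag {tag}")
        return self.unacked.pop(tag)


receiving = ToyChannel("receiving")
other = ToyChannel("other")

tags = [receiving.deliver(f"msg-{i}") for i in range(3)]  # tags 1, 2, 3
other.deliver("unrelated")                                # tag 1 on `other`

try:
    other.ack(tags[2])          # tag 3 was never issued on `other`
except ChannelClosed as exc:
    print(exc)

print(receiving.ack(tags[2]))   # the same tag is valid on the channel that issued it
```

In this model the message stays unacknowledged on the channel that delivered it, because the ack never reached that channel.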
It looks like messages are sometimes not getting acked when using the RabbitMQ provider. There are suspicious error messages in the Jenkins log:
This happens on 2 different Jenkins instances. The first is running in CentOS infrastructure, on OpenShift, and the second is deployed on AWS EKS.
Note that Jenkins seems to be triggering builds on messages just fine. The problem is that messages that don't get acked are re-delivered by the broker after some time, so Jenkins triggers new builds for them again.
This problem was discovered by Fedora monitoring: https://pagure.io/fedora-ci/general/issue/125.