linagora / james-project

Mirror of Apache James Project
Apache License 2.0

Make all RabbitMQ queues quorum queues when the option is enabled #5149

Closed Arsnael closed 3 months ago

Arsnael commented 3 months ago

When the quorum option is enabled, only some queues are quorum, not all.

Do a quick POC first

Make all queues quorum when the option is enabled, and do perf tests to see if it has an impact or not.
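As a rough illustration of what "all queues quorum" could mean at declaration time, here is a minimal sketch that builds the optional queue arguments in one place so every queue gets the same treatment. The `QueueArgs` class, its `build` method, and the queue names are hypothetical and not taken from James; the `x-queue-type` and `x-quorum-initial-group-size` argument keys are the ones RabbitMQ defines for quorum queues.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of James: builds the optional arguments
// that would be passed along when declaring a RabbitMQ queue.
final class QueueArgs {
    private QueueArgs() {}

    static Map<String, Object> build(boolean quorumEnabled, int replicationFactor) {
        Map<String, Object> args = new LinkedHashMap<>();
        if (quorumEnabled) {
            // Asks RabbitMQ for a replicated quorum queue instead of a classic one.
            // Quorum queues must also be declared durable and non-exclusive.
            args.put("x-queue-type", "quorum");
            // Initial number of replicas for the queue.
            args.put("x-quorum-initial-group-size", replicationFactor);
        }
        return args;
    }

    public static void main(String[] ignored) {
        // Hypothetical queue names: every declaration goes through the same builder,
        // so no queue is left classic when the option is on.
        for (String queue : new String[] {"hypothetical-work-queue", "hypothetical-dead-letter-queue"}) {
            System.out.println(queue + " -> " + build(true, 3));
        }
    }
}
```

Routing every declaration through a single builder like this is one way to avoid the "only some queues are quorum" situation; the resulting map would be handed to the client's queue declaration call as its arguments.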

Arsnael commented 3 months ago

old POC: https://github.com/apache/james-project/pull/2065

quantranhong1999 commented 3 months ago

continuation: https://github.com/apache/james-project/pull/2191

Next: run a perf test on the sandbox to check that it functions well.

quantranhong1999 commented 3 months ago

TMail part: https://github.com/linagora/tmail-backend/pull/1008

TMail has started successfully with all quorum queues so far.

quantranhong1999 commented 3 months ago

IMAP Performance test

RabbitMQ (full quorum queues)

[Image: IMAP performance test results, RabbitMQ with full quorum queues]

Slower than the TMail 0.9.0 release, but it still seems fine.

RabbitMQ + Redis

[Image: IMAP performance test results, RabbitMQ + Redis]

Performance is very good (a bit better than the 0.9.0 release - maybe because of other improvements).

quantranhong1999 commented 3 months ago

@chibenwa and I experimented with breaking RabbitMQ nodes while running the IMAP performance test.

In short, James can recover well from the RabbitMQ outage. There is a 5-second period in which James throws errors while dispatching the events (this may be a serious issue if it also affects the MailQueue).

I will continue to improve that.

quantranhong1999 commented 3 months ago

There is a 5-second period in which James throws errors while dispatching the events (this may be a serious issue if it also affects the MailQueue).

This helped: https://github.com/apache/james-project/pull/2191/commits/2e2415a1790b2f0a66d5fa73e4330090d034f115

This may also help: https://github.com/linagora/james-project/issues/5162

In short, James can recover well from the RabbitMQ outage.

This is the case for both the RabbitMQ event bus and the RabbitMQ+Redis event bus. We should be good.
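For readers wondering how a short error window like the 5-second one could be absorbed in general, here is a minimal sketch of retrying a dispatch with exponential backoff. It is a generic technique and not necessarily what the linked commit does; the `RetryingDispatcher` class and the simulated failing action are hypothetical, and only the Java standard library is used.

```java
import java.time.Duration;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of James: retries an action with exponential backoff.
final class RetryingDispatcher {
    private RetryingDispatcher() {}

    static <T> T retry(Callable<T> action, int maxAttempts, Duration firstDelay) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Duration delay = firstDelay;
        for (int attempt = 1; ; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                if (attempt >= maxAttempts) {
                    throw e; // Give up and surface the last failure.
                }
                Thread.sleep(delay.toMillis());
                delay = delay.multipliedBy(2); // Double the wait before the next attempt.
            }
        }
    }

    public static void main(String[] ignored) throws Exception {
        int[] calls = {0};
        // Simulated dispatch: fails twice (as if the broker were down), then succeeds.
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new java.io.IOException("broker unavailable");
            }
            return "dispatched";
        }, 5, Duration.ofMillis(100));
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

The attempt count and delays would need to be sized so that the total retry window comfortably covers the observed outage period.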

quantranhong1999 commented 3 months ago

Next steps: create a Jira ticket, summarize the quorum queue work on the mailing list, and polish the POC.

chibenwa commented 3 months ago

<3