linagora / james-project

Mirror of Apache James Project
Apache License 2.0
70 stars 63 forks source link

[PGSQL] LISTEN/NOTIFY for Postgresql #5199

Closed Arsnael closed 2 weeks ago

Arsnael commented 1 month ago

Listen and notify mechanisms from Postgresql might be able to replace usage of rabbitmq and redis for notifications with the pg implementation

Need to study and groom the topic. very encouraged as well to do first some POCs to see if it's viable or not.

Arsnael commented 1 month ago

Didnt take time to read yet, but likely worth reading: https://tapoueh.org/blog/2018/07/postgresql-listen-notify/

chibenwa commented 1 month ago

We need to distinguish PUB/SUB topics (ie implement something like redis part of the eventbus) which should be straightforward from queuing topics which might be trickier...

A first scope might be to only target the "redis eventbus" scope.

To handle queueing well, MailQueue would need to be figured out as well as the TaskManager.... ? -> trickier indeed.

chibenwa commented 1 month ago

Indeed queuing can be done but is way trickier.

CF FOR UPDATE SKIP LOCKED and to be done well, table is needed, transaction is needed and listen/notify is likely needed too. I have question regarding LISTEN/NOTIFY usage for queuing scenario as basically all james nodes would be notified on each message all the time...

chibenwa commented 1 month ago

Scaling needs to be taken into account, channel number should likely not be arbitrary large...

chibenwa commented 1 month ago

Anyway, I still suggest a POC as this could be meaningful for medium sized installs....

Arsnael commented 1 month ago

I've been trying to look around as well and I'm reaching similar conclusions.

Honestly why not doing POCs to explore the possibility, but that does not look like a win to me in the first place.

There is no at-least-once or at-most-once delivery mechanisms.

As soon as you scale up and have more than one listeners, every listeners is notified. Also there is no delivery garantee, so if listeners fail to get the notification, it's lost (defo an issue for mail queue, not much for user notification that we moved to redis... but the multiple listener notification could still be a bit problematic?)

A table would be needed to persist items in the mail queues, and extra code to manage saving those in the queue, notify, remove them from the table when you listened successfully (race issue if multiple listeners maybe?) and retry mechanism if failed to deliver.

It sounds like a lot of work for likely not so much benefits in terms of perf (or probably even degrading a lot compared to rabbit for example).

Anyway, I still suggest a POC as this could be meaningful for medium sized installs....

Would it be? I feel like for a medium size install a rabbitmq for example does not take much resources and would be much better and acceptable.

I feel for those kind of installs going back to investigate postgres indexing to get rid of opensearch (that can be much heavier to operate) would make more sense compared to this IMO (also seen this mentioned a few times in the gitter lately, I think a part of the community seems to expect indexing on postgres). Likely opensearch will be more performant, specially on full text search, but it can be too heavy as an option for some organization I think.

quantranhong1999 commented 1 month ago

What is the scope of this work? Notifications or queuing use cases, or both?

vttranlina commented 1 month ago

What is the scope of this work? Notifications or queuing use cases, or both?

Notifications?

A first scope might be to only target the "redis eventbus" scope.

Arsnael commented 1 month ago

As a first scope yes just redis notifications. Not personally convinced (specially as all listeners get notified if you have more than one) but I guess would be a POC to see if there would be any benefit compared to just using rabbitmq for both mail queue and notifications.

LISTEN/NOTIFY from pgsql seems too primitive IMO for James needs. I think it would make sense only if you could use postgres for mail queue and notifications. If only for notifications and still use rabbit for mail queues, I think just simply use rabbit then. But lets prove it with a POC

hungphan227 commented 1 month ago

Driver: r2dbc can do notify/listen, jooq cannot

Arsnael commented 1 month ago

@chibenwa : after discussing it with the team, nobody is really convinced here on this.

We think it's reasonable in a first attempt to do a POC on user notifications with postgres (the brick we moved from rabbit to redis) and do some perf tests on it.

If this is better than just using rabbitmq for all notifications, then maybe we can talk about an other POC regarding the mail queue.

If it's not, then we don't see the point in investing further effort on a postgres mail queue.

Thoughts?

chibenwa commented 1 month ago

after discussing it with the team, nobody is really convinced here on this. Thoughts?

Ok for quick poc. If nobody is convinced then we likely should not invest much.

hungphan227 commented 1 month ago

found postgres queue impl https://tembo-io.github.io/pgmq/ :)) not sure if it helps

Arsnael commented 1 month ago

@hungphan227 Interesting find! No java client library for this though, but I guess in the worst case if we try it will be down to wrapping requests into r2dbc

hungphan227 commented 1 month ago

However, this extension requires to be installed in postgres server. In case of postgres service provided by parties like OVH, it seems to be impossible to apply.

Image

quantranhong1999 commented 1 month ago

Ok for quick poc. If nobody is convinced then we likely should not invest much.

Ticket for the POC: https://github.com/linagora/tmail-backend/issues/1084

Should be enough for now I guess?