Eventuous / eventuous

Event Sourcing library for .NET
https://eventuous.dev
Apache License 2.0

Postgres subscription is flooding database server with polling queries #224

Status: Closed (oakie closed this issue 9 months ago)

oakie commented 1 year ago

I am using PostgreSQL in Azure as the event store with Eventuous v0.14.0. When no new events are written to the store, the subscription completely overwhelms the database server, pushing its CPU usage to a constant 100%. It would not be a problem if the event store were the only database on that server, but unfortunately the server is shared between multiple applications, so the others suffer performance issues. I can see in the metrics that the server is hit with thousands of requests each second.

As I understand from reading this thread, this happens because the Postgres subscription polls the database for new events, and when no new events are returned it immediately sends a new query without any delay. Postgres has a feature to push updates to clients instead of polling, and using that would be ideal, but it might be non-trivial, as discussed in the thread linked above.

As a temporary fix for us, I have made a copy of PostgresSubscriptionBase from Eventuous.Postgres and added a simple 10 ms delay in the polling loop, which led to a dramatic reduction in CPU usage on the database server. Would it be feasible to add this behaviour to the official packages? Maybe as an opt-in setting in the subscription options?
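For readers who want to see the shape of that workaround, here is a minimal, language-agnostic sketch in Python. It is not the Eventuous code: `poll_loop`, `fetch`, `handle`, `should_stop`, and `idle_delay_s` are hypothetical names chosen for illustration. The only idea taken from the comment above is to sleep briefly (10 ms by default) whenever a poll returns no events, instead of querying again immediately.

```python
import time
from typing import Callable, Sequence, Tuple

Event = Tuple[int, str]  # (position in the stream, payload); hypothetical shape


def poll_loop(
    fetch: Callable[[int], Sequence[Event]],
    handle: Callable[[Event], None],
    should_stop: Callable[[], bool],
    idle_delay_s: float = 0.010,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll for events until should_stop() returns True.

    Returns the position of the last handled event (0 if none).
    """
    position = 0
    while not should_stop():
        events = fetch(position)  # events with a position greater than `position`
        if not events:
            # Nothing new: back off briefly instead of re-querying at once.
            sleep(idle_delay_s)
            continue
        for event in events:
            handle(event)
            position = event[0]
    return position
```

The `sleep` parameter is injectable only so the loop can be exercised without real waiting; a real subscription would also need cancellation and error handling, which are left out here.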

fbjerggaard commented 1 year ago

I noticed this as well using PostgreSQL, and it would probably be a pretty easy "fix" to implement a configurable delay when polling the database.

The right way would probably be some kind of subscription, but I haven't really researched the possibility of that.

However, be aware of #222 when using PostgreSQL in general - though it only really applies if you have multiple event publishers

alexeyzimarev commented 1 year ago

It should be configurable and auto-adjusting. I never got the time to implement it as I am not using Postgres in my daily work. It would be nice for someone to pick it up, also for SQL Server.

Here's how I see it working. The configuration should have the min and max polling intervals. When the subscription starts, it uses the min interval. If the query returns no events, it will start increasing the interval using some calculated increase value, for example (max - min) / 10. If no events are being appended, eventually it will reach the max interval and will be polling much less frequently.
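The scheme described above can be sketched roughly as follows. This is an illustration in Python, not an Eventuous API: the class name `AdaptivePollInterval` and its members are hypothetical. The step size follows the `(max - min) / 10` example from the comment. Resetting to the minimum interval as soon as a poll returns events is an assumption; the comment only specifies how the interval grows.

```python
from dataclasses import dataclass, field


@dataclass
class AdaptivePollInterval:
    """Tracks the delay to wait before the next poll (hypothetical helper)."""

    min_s: float
    max_s: float
    steps: int = 10  # the interval grows by (max - min) / steps per empty poll
    current_s: float = field(init=False)

    def __post_init__(self) -> None:
        if not 0 <= self.min_s <= self.max_s:
            raise ValueError("expected 0 <= min_s <= max_s")
        if self.steps < 1:
            raise ValueError("steps must be at least 1")
        self.current_s = self.min_s  # start at the minimum interval

    def on_poll(self, event_count: int) -> float:
        """Update the interval after a poll and return the new delay."""
        if event_count > 0:
            # Assumption: activity resets the interval to the minimum.
            self.current_s = self.min_s
        else:
            step = (self.max_s - self.min_s) / self.steps
            self.current_s = min(self.max_s, self.current_s + step)
        return self.current_s
```

A polling loop would call `on_poll(len(events))` after each query and sleep for the returned number of seconds. With `min_s=0.5` and `max_s=3.0`, ten consecutive empty polls take the delay from 0.5 s up to the 3.0 s cap.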

It could also be possible to extend the functionality and enable an internal control plane for subscriptions. For example, when the app appends an event, it could signal the subscription to query immediately, so it will not wait for whatever the current interval is.
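One way to picture such a signal, assuming the appender and the subscription run in the same process, is an interruptible wait. The sketch below is Python for illustration only; `PollSignal`, `notify_appended`, and `wait` are hypothetical names, and nothing here reflects an existing Eventuous feature. The subscription waits for the current interval, but wakes early if the appending side signals.

```python
import threading


class PollSignal:
    """In-process wake-up signal between an appender and a polling loop."""

    def __init__(self) -> None:
        self._event = threading.Event()

    def notify_appended(self) -> None:
        # Called by the code that appends events, after the append succeeds.
        self._event.set()

    def wait(self, timeout_s: float) -> bool:
        """Block for up to timeout_s seconds.

        Returns True if woken by a signal, False if the timeout elapsed.
        """
        signalled = self._event.wait(timeout_s)
        # Clear so the next wait blocks again; a signal that lands right
        # here is harmless because the caller polls immediately afterwards.
        self._event.clear()
        return signalled
```

In a polling loop, `signal.wait(interval)` would replace a plain sleep. This only helps when the writer and the subscription share a process; appends made by other processes would still be picked up at the regular polling interval.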

alexeyzimarev commented 1 year ago

> The right way would probably be some kind of subscription

It should be possible to use logical replication, but it's quite a bit of work. For it to work, there should be some implementation of switching from catch-up to real-time mode (and back) using ring buffers or something similar.