Particular / NServiceBus.NHibernate

Persistence support for NServiceBus using NHibernate
https://docs.particular.net/nservicebus/nhibernate/
Other
12 stars 32 forks source link

RemoveEntriesOlderThan is not bounded #143

Open ramonsmits opened 8 years ago

ramonsmits commented 8 years ago

The current implementation both the client side queryover and server side delete statements are unbounded. If you have a very large number of items that will be deleted that that impacts locking and writing huge amounts of data in a single transaction to the journal log which can result in deadlocks

A better implementation is to do batched deletes in separate transactions until there are no more items to be purged.

@DavidBoike Already updated the QueryOver clientside to a server side single SQL query in the master branch but this is missing in develop.

Using SQL server this can be done via DELETE top (10000) FROM OutboxRecord. Oracle uses the LIMIT keyword and other databases might need a nested SELECT like DELETE FROM X WHERE id in (select id from X LIMIT 10000).

An alternative is: not having an bounded resultset, use the rowcount and if it passes a certain tresshold to log a WARNING that indicates that the cleanup interval might be set to low.

ramonsmits commented 8 years ago

After a chat with @DavidBoike we agreed that it would be hard to do in a database agnostic way.

We agreed on to test the affected row count and log a warning when a configurable arbitrary (6.000?) limit is passed. This means that if a customer is having more than 100 msg/s that he will be getting warnings to use a smaller cleanup interval.

https://github.com/Particular/NServiceBus.NHibernate/blob/develop/src/NServiceBus.NHibernate/Outbox/OutboxPersister.cs#L99

Also, the timer logic should make sure that the cleanup task cannot be overlapping. It should detect if the task is already running and if it is, log an error.

https://github.com/Particular/NServiceBus.NHibernate/blob/develop/src/NServiceBus.NHibernate/Outbox/NHibernateOutboxStorage.cs#L91

ramonsmits commented 8 years ago

To extend this, the timer logic should be auto sensing with a min and max period.

By default we should start with a minute.

In high volume environments we will automatically delete more often resulting in less likely to escalate to a page or table lock.

In low volume environments we will maybe just run the cleanup every hour.

All of these can be configurable:

ramonsmits commented 6 years ago

I checked if this is still relevant and the logic is still the same:

https://github.com/Particular/NServiceBus.NHibernate/blob/0289c7540243afa506e9ff93188daa4c19ac12e9/src/NServiceBus.NHibernate/Outbox/OutboxPersister.cs#L134