EspressoSystems / hotshot-query-service

Generic query service for HotShot applications
https://espressosystems.github.io/hotshot-query-service/
GNU General Public License v3.0

Pruner might delete unnecessary data under heavy disk utilization #633

Open jbearer opened 4 months ago

jbearer commented 4 months ago

We use Postgres's insight into disk utilization to decide when to prune beyond the target retention, potentially all the way down to the minimum retention. However, the disk utilization info might be stale if we have recently pruned data that hasn't been autovacuumed yet. In that case we should just vacuum (or wait for an autovacuum) to recover space, rather than pruning further.
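
To make the proposed behavior concrete, here is a minimal sketch of the decision, not the pruner's actual code. `PruneAction`, `DiskStats`, `decide`, and the `usage_threshold` parameter are all hypothetical names. The `dead_bytes` estimate is also an assumption: it could be derived from statistics such as `n_dead_tup` in `pg_stat_user_tables`, but how to obtain it is left open here.

```rust
/// What the pruner should do next. All names in this sketch are hypothetical.
#[derive(Debug, PartialEq, Eq)]
pub enum PruneAction {
    /// Disk usage is within bounds: prune only data older than the target retention.
    PruneToTarget,
    /// Disk usage looks high, but only because of not-yet-vacuumed rows: vacuum, then re-measure.
    VacuumFirst,
    /// Disk usage is high even after discounting dead rows: prune toward the minimum retention.
    PruneToMinimum,
}

/// A snapshot of disk usage as seen by the pruner (hypothetical shape).
pub struct DiskStats {
    pub used_bytes: u64,
    pub capacity_bytes: u64,
    /// Estimated bytes occupied by rows that were deleted but not yet vacuumed.
    pub dead_bytes: u64,
}

/// Decide the next action given a usage threshold expressed as a fraction in [0, 1].
pub fn decide(stats: &DiskStats, usage_threshold: f64) -> PruneAction {
    // Without a capacity measurement there is no basis for aggressive pruning.
    if stats.capacity_bytes == 0 {
        return PruneAction::PruneToTarget;
    }
    let capacity = stats.capacity_bytes as f64;
    let reported = stats.used_bytes as f64 / capacity;
    if reported <= usage_threshold {
        return PruneAction::PruneToTarget;
    }
    // Discount space that a vacuum could free up for reuse.
    let live = stats.used_bytes.saturating_sub(stats.dead_bytes) as f64 / capacity;
    if live <= usage_threshold {
        PruneAction::VacuumFirst
    } else {
        PruneAction::PruneToMinimum
    }
}
```

With a threshold of 0.8, a disk reported as 90% full of which 20% is dead rows would yield `VacuumFirst` rather than further pruning.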

imabdulbasit commented 3 months ago

I think that waiting for autovacuum might not be reliable, since we need to free up space immediately. If we wait for autovacuum, new inserts might fail until it kicks in. The default Postgres autovacuum settings may also cause a long delay before the autovacuum job runs. I am leaning towards running a manual vacuum so that new inserts do not fail.
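
To illustrate why the defaults can mean a long wait: Postgres autovacuums a table once its dead tuples exceed `autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * (number of tuples)`, with documented defaults of 50 and 0.2. The sketch below just does that arithmetic; the function and constant names are hypothetical, and it ignores per-table overrides and the autovacuum polling interval.

```rust
/// Documented Postgres defaults for `autovacuum_vacuum_threshold`
/// and `autovacuum_vacuum_scale_factor`.
pub const DEFAULT_VACUUM_THRESHOLD: u64 = 50;
pub const DEFAULT_VACUUM_SCALE_FACTOR: f64 = 0.2;

/// Number of dead tuples that must be exceeded before autovacuum
/// considers the table (fractional part truncated).
pub fn autovacuum_trigger(table_rows: u64, base_threshold: u64, scale_factor: f64) -> u64 {
    base_threshold + (scale_factor * table_rows as f64) as u64
}

/// True if a table with `dead_rows` dead tuples is due for autovacuum.
pub fn autovacuum_due(dead_rows: u64, table_rows: u64, base_threshold: u64, scale_factor: f64) -> bool {
    dead_rows > autovacuum_trigger(table_rows, base_threshold, scale_factor)
}
```

For a table of 10,000,000 rows under the defaults, the trigger is 2,000,050 dead tuples, so a pruning pass that deletes fewer rows than that would not cause an autovacuum of the table on its own.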

imabdulbasit commented 3 months ago

We had errors related to insufficient shared memory when running manual vacuum, which prevented the whole batch insert from succeeding, as we were committing after the vacuum job. However, the chances of manual vacuum failing should be lower now, as we commit after each batch delete.
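
The ordering described above can be sketched as a statement plan: each batch delete is committed in its own transaction, and the vacuum runs last, outside any transaction (Postgres does not allow `VACUUM` inside a transaction block), so a failed vacuum cannot roll back deletes that already committed. This is an illustration only. `prune_plan`, the `height` column, and the fixed batch count are assumptions; a real pruner would more likely loop until a delete affects no rows.

```rust
/// Build the ordered list of SQL statements for a pruning pass.
/// The table name, the `height` column, and the fixed number of
/// batches are hypothetical and chosen only for illustration.
pub fn prune_plan(table: &str, cutoff_height: u64, batch_size: u64, batches: u32) -> Vec<String> {
    let mut plan = Vec::new();
    for _ in 0..batches {
        // Each batch is its own transaction, so its deletes are durable
        // regardless of what happens in later steps.
        plan.push("BEGIN".to_string());
        plan.push(format!(
            "DELETE FROM {0} WHERE height IN \
             (SELECT height FROM {0} WHERE height < {1} ORDER BY height LIMIT {2})",
            table, cutoff_height, batch_size
        ));
        plan.push("COMMIT".to_string());
    }
    // VACUUM runs last and on its own: if it fails, nothing above is undone.
    plan.push(format!("VACUUM {}", table));
    plan
}
```

Calling `prune_plan("example_table", 100, 10, 2)` yields two `BEGIN` / `DELETE` / `COMMIT` triples followed by a single `VACUUM example_table`.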