netbox-community / netbox

The premier source of truth powering network automation. Open source under Apache 2. Public demo: https://demo.netbox.dev
http://netboxlabs.com/oss/netbox/
Apache License 2.0

Batch remove of old jobs #14679

Open doc-sheet opened 6 months ago

doc-sheet commented 6 months ago

NetBox version

v3.7.0

Feature type

Change to existing functionality

Proposed functionality

Remove old jobs in small batches, for example by selecting each batch with something like Job.objects.filter(created__lt=cutoff)[:50].

Also, I guess it's not optimal to run this query twice: once for the count and once for the delete.
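To make the batching idea concrete, here is a minimal sketch at the SQL level. It uses the standard-library sqlite3 module as a stand-in for the Django ORM and PostgreSQL, and the table name job, the column created, and the helper names delete_in_batches and demo are all hypothetical. Note that Django refuses to call delete() on a sliced queryset, so a batch is usually selected by primary key first, which is what the subquery below does. The total is accumulated from the per-batch row counts, so no separate count query is needed.

```python
import sqlite3


def delete_in_batches(conn, cutoff, batch_size=50):
    """Delete rows older than `cutoff` in small batches; return the total removed."""
    total = 0
    while True:
        # Pick at most `batch_size` stale rows by primary key, then delete them.
        cur = conn.execute(
            "DELETE FROM job WHERE id IN ("
            "SELECT id FROM job WHERE created < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()  # one short transaction per batch
        if cur.rowcount == 0:
            return total
        total += cur.rowcount


def demo():
    """Build a throwaway in-memory table and purge its old rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE job (id INTEGER PRIMARY KEY, created TEXT)")
    old = [(f"2023-01-{day:02d}",) for day in range(1, 29)] * 5  # 140 stale rows
    new = [("2024-06-01",)] * 10  # 10 rows that must survive
    conn.executemany("INSERT INTO job (created) VALUES (?)", old + new)
    conn.commit()
    removed = delete_in_batches(conn, "2024-01-01", batch_size=50)
    remaining = conn.execute("SELECT COUNT(*) FROM job").fetchone()[0]
    return removed, remaining
```

Each batch is its own short statement, so a single slow delete is less likely to hit a statement timeout, though the total amount of work is not reduced.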

Use case

When there are tens of millions of old jobs, the count() and all() queries can't finish in a sensible time (even with statement_timeout=5s or so). See https://github.com/netbox-community/netbox/blob/c78a792cccfdbe6f373c0d474b1620e56e5f9cf8/netbox/extras/management/commands/housekeeping.py#L75

Database changes

No response

External dependencies

No response

jeremystretch commented 4 months ago

Breaking the delete operation into smaller chunks isn't going to yield any performance gains. However, we could use Django's _raw_delete() method to bypass the signal handlers, which would speed up the bulk deletion considerably and should be safe for this specific use case.
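A rough sketch of what that could look like, not the actual NetBox change: the helper name purge_without_signals is hypothetical, and _raw_delete() is a private Django QuerySet method, so its behavior may differ between Django versions. It issues a single SQL DELETE for the queryset, sends no signals, and does not handle cascades.

```python
def purge_without_signals(queryset):
    """Delete every row matched by `queryset` with one direct SQL DELETE.

    Assumes `queryset` is a Django QuerySet. No pre_delete/post_delete
    signals are sent and related objects are not collected for cascading,
    so this only suits models where neither is needed.
    """
    # `queryset.db` is the database alias the queryset would run against.
    return queryset._raw_delete(using=queryset.db)
```

In the housekeeping command this might be called as purge_without_signals(Job.objects.filter(created__lt=cutoff)), assuming cutoff is the retention threshold already computed there.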

doc-sheet commented 4 months ago

In my case it was a trade-off between "cleanup doesn't work" and "cleanup lasts for hours".

Also, I kind of don't understand why those millions of job records existed at all (some kind of glitch in the scheduler, I guess), but that's a different story.