The main change here is how we update the state of compaction jobs, in particular when workers poll to report state updates and request new jobs.
The current implementation interleaves persistence (boltdb) writes with in-memory state updates. As a result, there are a few cases where an error could leave the two storage layers in an inconsistent state.
The new implementation applies all updates in memory first and builds a list of items that need to be durably stored. If we fail to durably store any of them, or otherwise end up in an unexpected state while persisting, the application will panic.
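To illustrate the "memory first, then persist" flow described above, here is a minimal sketch. It is not the code in this PR: `JobState`, `Store`, `mapStore`, `Scheduler`, `ApplyUpdates`, and the status strings are all hypothetical names, and `Store` is a simplified stand-in for the boltdb layer.

```go
package main

import (
	"errors"
	"fmt"
)

// JobState is a hypothetical in-memory view of a compaction job.
type JobState struct {
	Name   string
	Status string
}

// Store is a hypothetical stand-in for the durable layer (boltdb in the PR).
type Store interface {
	Put(job JobState) error
	Delete(name string) error
}

// mapStore is a toy Store used only to exercise the sketch.
type mapStore struct {
	data map[string]JobState
	fail bool // when true, every write returns an error
}

func (m *mapStore) Put(job JobState) error {
	if m.fail {
		return errors.New("simulated write failure")
	}
	m.data[job.Name] = job
	return nil
}

func (m *mapStore) Delete(name string) error {
	if m.fail {
		return errors.New("simulated write failure")
	}
	delete(m.data, name)
	return nil
}

// Scheduler holds the in-memory job state and a handle to the durable store.
type Scheduler struct {
	jobs  map[string]*JobState
	store Store
}

// ApplyUpdates applies worker-reported statuses in two phases.
// Phase 1 only touches memory and records what must be persisted.
// Phase 2 writes the recorded changes; any failure panics, because at that
// point memory and disk would otherwise disagree.
func (s *Scheduler) ApplyUpdates(updates map[string]string) {
	var puts []JobState
	var deletes []string

	// Phase 1: in-memory mutation, no I/O.
	for name, status := range updates {
		job, ok := s.jobs[name]
		if !ok {
			continue // unknown job, nothing to change
		}
		if status == "success" {
			delete(s.jobs, name)
			deletes = append(deletes, name)
			continue
		}
		job.Status = status
		puts = append(puts, *job)
	}

	// Phase 2: persist the collected changes.
	for _, job := range puts {
		if err := s.store.Put(job); err != nil {
			panic(fmt.Sprintf("persisting job %q: %v", job.Name, err))
		}
	}
	for _, name := range deletes {
		if err := s.store.Delete(name); err != nil {
			panic(fmt.Sprintf("deleting job %q: %v", name, err))
		}
	}
}
```

Keeping phase 1 free of I/O means no error path can interrupt the in-memory update halfway through. The only remaining failure point is phase 2, and the sketch handles it the way the description does, by panicking.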
Bonus:
handle failed compaction jobs (with max retries)
prioritize jobs on compaction level before lease expiry
unit tests for compaction job creation and state management
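The retry limit and the job ordering listed above could look roughly like the following sketch. The names (`Job`, `maxFailures`, `NextJobs`) are hypothetical, the limit of 3 is an arbitrary placeholder, and the sketch assumes that lower compaction levels are handed out first and that exhausted jobs are simply skipped; the PR may do any of this differently.

```go
package main

import "sort"

// Job is a hypothetical compaction job record.
type Job struct {
	Name           string
	Level          int   // compaction level
	LeaseExpiresAt int64 // Unix nanoseconds; earlier means the lease expires sooner
	Failures       int   // number of failed attempts so far
}

// maxFailures is an assumed retry limit, not a value taken from the PR.
const maxFailures = 3

// NextJobs returns up to n jobs to hand to a worker. Jobs that have reached
// the retry limit are skipped. The rest are ordered by compaction level
// first (assumed ascending) and by lease expiry second.
func NextJobs(jobs []Job, n int) []Job {
	eligible := make([]Job, 0, len(jobs))
	for _, j := range jobs {
		if j.Failures < maxFailures {
			eligible = append(eligible, j)
		}
	}

	sort.SliceStable(eligible, func(a, b int) bool {
		if eligible[a].Level != eligible[b].Level {
			return eligible[a].Level < eligible[b].Level
		}
		return eligible[a].LeaseExpiresAt < eligible[b].LeaseExpiresAt
	})

	if n < 0 {
		n = 0
	}
	if n > len(eligible) {
		n = len(eligible)
	}
	return eligible[:n]
}
```

Comparing the level before the lease expiry means an expired lease on a higher-level job never jumps ahead of pending work at a lower level. Expiry only breaks ties within a level.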