Refactor doDbCleanup to support deletion of multiple query_samples

The current code for doDbCleanup has a few problems.

Consider a scenario in which a sample with hash xyz was queried 5 times. The current algorithm iterates over finished jobs, finds a job connected to xyz, and then searches the query_samples collection for a document associated with xyz. At this point samples_to_be_deleted contains {"xyz": some_entry}, where some_entry is just one of the 5 queries made for this sample and, if I understand the code correctly, is not even guaranteed to be the one associated with the job (from the queue collection) currently being processed. The remaining 4 documents in query_samples are never deleted.

Next comes the deletion of failed jobs, where there are two scenarios. If the sample was already found while going over finished jobs, then xyz is already in samples_to_be_deleted, so deletion of the query_samples/query_functions entries is skipped for all failed jobs of this sample. If the sample was never queried successfully, then again only the first failed query_samples/query_functions entry is deleted.
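To make the problem concrete, here is a minimal sketch of the behavior as I understand it. It is not the actual doDbCleanup code: the collections are modeled as plain lists of dicts, and the field names (`sha256`, `job_id`, `sample_id`) as well as the helper names are assumptions made for illustration only.

```python
# Hypothetical in-memory stand-ins for the queue and query_samples collections.
# Five query_samples documents exist for the same hash "xyz".
query_samples = [{"sample_id": i, "sha256": "xyz"} for i in range(5)]
finished_jobs = [{"job_id": "j1", "sha256": "xyz"}]
failed_jobs = [{"job_id": "j2", "sha256": "xyz"}]


def find_one(collection, sha256):
    """Return the first document with the given hash, or None."""
    return next((doc for doc in collection if doc["sha256"] == sha256), None)


def collect_single_entry_per_hash(finished, failed, samples):
    """Model of the described behavior: at most one entry per hash is kept."""
    samples_to_be_deleted = {}
    for job in finished:
        entry = find_one(samples, job["sha256"])
        if entry is not None:
            # Keyed by hash, so only one of the matching documents is stored.
            samples_to_be_deleted[job["sha256"]] = entry
    for job in failed:
        if job["sha256"] in samples_to_be_deleted:
            # Hash already seen via a finished job: failed jobs are skipped.
            continue
        entry = find_one(samples, job["sha256"])
        if entry is not None:
            samples_to_be_deleted[job["sha256"]] = entry
    return samples_to_be_deleted


to_delete = collect_single_entry_per_hash(finished_jobs, failed_jobs, query_samples)
leftover = [doc for doc in query_samples if doc not in to_delete.values()]
print(len(to_delete), len(leftover))  # prints "1 4"
```

In this model one document is scheduled for deletion and the other four stay behind, which matches the scenario described above.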

Instead, I think there should at least be a way to delete more than one query_samples/query_functions entry per job. This is a draft for doing that; let me know what you think.
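As a rough illustration of the idea (not the code in this draft), the entries could be grouped by hash so that every matching document is collected rather than just the first one. The field names (`sha256`, `job_id`, `sample_id`) and the function name are hypothetical, and the collections are again modeled as plain lists of dicts.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the queue and query_samples collections.
query_samples = [{"sample_id": i, "sha256": "xyz"} for i in range(5)]
jobs = [
    {"job_id": "j1", "sha256": "xyz"},  # finished job
    {"job_id": "j2", "sha256": "xyz"},  # failed job for the same sample
]


def collect_all_entries_per_hash(jobs, samples):
    """Map each job hash to the list of all sample documents sharing it."""
    by_hash = defaultdict(list)
    for doc in samples:
        by_hash[doc["sha256"]].append(doc)
    job_hashes = {job["sha256"] for job in jobs}
    # Only keep hashes that actually have sample documents.
    return {h: by_hash[h] for h in job_hashes if h in by_hash}


samples_to_be_deleted = collect_all_entries_per_hash(jobs, query_samples)
print(len(samples_to_be_deleted["xyz"]))  # prints "5"
```

With a list per hash, finished and failed jobs for the same sample no longer shadow each other, and the same grouping could presumably be applied to query_functions.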

danielplohmann / mcrit

Refactor doDbCleanup to support deletion of multiple query_samples #72