esmero / strawberryfield

A Field of strawberries
GNU Lesser General Public License v3.0

Add method to validate/check number of Flavors (present/expected) for a given ADO #255

Open DiegoPino opened 1 year ago

DiegoPino commented 1 year ago

See https://github.com/esmero/strawberryfield/blob/a7ba7330cc5f278e66533f40c75267ef8369f495/src/StrawberryfieldUtilityService.php#L319

Sadly this, while still useful, is not enough. We also need to get "how many we should expect to have". Luckily we also index sequence_total as a value, so the idea is:

Run a similar query, facet by sequence_total, and take the "value" (not the "count") of each facet from the extra data of the results. Sum those values up. Return both count and expected.
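A rough sketch of the summing step, in Python only for illustration (the actual method would live in the PHP utility service). The function name `summarize_flavors`, the `value`/`count` keys and the quoted facet values are assumptions about the shape of the facet data, not the real Search API structure:

```python
from typing import Iterable, Mapping, Tuple


def summarize_flavors(
    present_count: int,
    sequence_total_facets: Iterable[Mapping[str, object]],
) -> Tuple[int, int]:
    """Return (present, expected) Flavor counts for one ADO.

    present_count: number of Flavor documents the existing count query found.
    sequence_total_facets: hypothetical facet entries for sequence_total,
        each with a "value" (the indexed sequence_total) and a "count"
        (how many documents share that value).
    """
    expected = 0
    for facet in sequence_total_facets:
        # Use the facet "value", not its "count". Assumption: the value may
        # arrive as a quoted string such as '"12"', so strip the quotes.
        raw_value = str(facet["value"]).strip('"')
        expected += int(raw_value)
    return present_count, expected


# Example: two files, one expecting 12 sequences and one expecting 30,
# with 40 Flavors currently indexed.
facets = [
    {"value": '"12"', "count": 12},
    {"value": "30", "count": 28},
]
present, expected = summarize_flavors(40, facets)
```

Here `present` is 40 and `expected` is 42, so this ADO would still have 2 Flavors missing.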

This is needed to be able to "avoid" reprocessing (as an option) ADOs where everything matches/is in place. Even if the queue item workers will not run OCR again when it is already present (we already have that), re-enqueuing 700K OCRs just to check that is overkill. Thinking of JSON patching etc.

DiegoPino commented 1 year ago

@alliomeria ping here. This is the idea. I will code this now so you can test. Thanks again for your work.