Closes #353. The issue was caused by exomes and genomes writing to (and potentially resuming from) the same checkpoint path.
There may also have been a bug in the Cat. 4 (de novo) logic here.
The intention of this logic is:
get all relevant consequences (VEP critical + missense)
filter transcripts to the ones which have those relevant consequence annotations
filter variants to the ones where either at least one transcript has a relevant consequence annotation, OR the variant has a substantial SpliceAI score
As written, the final clause instead filters for variants which have a consequential transcript annotation AND NO SpliceAI score.
I'm not sure how often to expect variants with both consequential transcript annotations AND SpliceAI scores, but it's not impossible. With the current code, those variants have been removed from consideration as de novos.
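To make the difference between the two clauses concrete, here is a minimal plain-Python sketch of the boolean logic only. It is not the code in this PR: the `Variant` class, its field names, and the 0.5 threshold are all hypothetical, and "no SpliceAI score" is modelled here as "score below the threshold".

```python
from dataclasses import dataclass

# Hypothetical threshold for a "substantial" SpliceAI score (assumption, not from this PR)
SPLICE_AI_THRESHOLD = 0.5


@dataclass
class Variant:
    """Hypothetical stand-in for one variant row."""

    relevant_transcripts: int  # transcripts with a VEP critical or missense consequence
    splice_ai_delta: float  # highest SpliceAI delta score for the variant


def has_substantial_splice_ai(variant: Variant) -> bool:
    return variant.splice_ai_delta >= SPLICE_AI_THRESHOLD


def passes_old_clause(variant: Variant) -> bool:
    # Previous behaviour: consequential transcript AND NO SpliceAI score
    return variant.relevant_transcripts > 0 and not has_substantial_splice_ai(variant)


def passes_intended_clause(variant: Variant) -> bool:
    # Intended behaviour: consequential transcript OR substantial SpliceAI score
    return variant.relevant_transcripts > 0 or has_substantial_splice_ai(variant)


both = Variant(relevant_transcripts=1, splice_ai_delta=0.9)
splice_only = Variant(relevant_transcripts=0, splice_ai_delta=0.9)
transcript_only = Variant(relevant_transcripts=1, splice_ai_delta=0.0)
neither = Variant(relevant_transcripts=0, splice_ai_delta=0.0)
```

Under these assumptions, `both` and `splice_only` are dropped by the old clause but kept by the intended one, while `transcript_only` and `neither` are treated the same by both.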
Proposed Changes
This does a lot of housekeeping, mostly reordering the methods to match the order in which they are called
Also changes the logging to stop using the root logger (not important)
The final clause in Cat. 4 assignment is corrected, so variants are kept if they have a consequential transcript annotation OR a substantial SpliceAI score
We now filter out Benign variants only where the number of ClinVar stars is >= 1. This ties into as-yet unreleased tweaks to the https://github.com/populationgenomics/ClinvArbitration codebase, so that we will now generate 0-star ClinVar entries in the raw data, even if we don't intend to use them. Currently all our private ClinVar data is 1 star or greater, so this doesn't impact us.
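As a rough illustration of the Benign rule described above, the predicate could look like the sketch below. The function and parameter names are hypothetical and do not come from this PR; only the ">= 1 star" condition is taken from the description.

```python
def should_drop_as_benign(is_benign: bool, clinvar_stars: int) -> bool:
    """Return True if a variant should be removed as ClinVar Benign.

    Hypothetical helper: a Benign rating only removes a variant when the
    ClinVar entry has at least 1 star, so 0-star Benign entries are kept.
    """
    return is_benign and clinvar_stars >= 1


# A 0-star Benign entry no longer removes the variant
zero_star_benign_dropped = should_drop_as_benign(is_benign=True, clinvar_stars=0)
# A 1-star Benign entry still removes the variant, as before
one_star_benign_dropped = should_drop_as_benign(is_benign=True, clinvar_stars=1)
```

With all current private ClinVar data at 1 star or greater, this predicate would give the same result as an unconditional Benign filter.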