Open philipwfowler opened 1 year ago
The way I currently have large dels implemented in `piezo` is that it will only hit `del_x.y` if the mutation given is of the same format. So it may be best to have the default rules on `del_0.0`, and re-prioritise so that's treated as a default rule similar to the `*` rules.
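
For illustration only, here is a minimal sketch of what treating the fraction in a `del_x.y` rule as a lower bound could look like. This is not `piezo`'s actual code: the regex and the helper names `parse_large_del` and `lower_bound_match` are hypothetical, and the sketch assumes the fraction is written as a plain decimal between 0 and 1.

```python
import re

# Hypothetical pattern for large-deletion strings such as "pncA@del_0.89".
# Assumption: the deleted fraction is a plain decimal between 0 and 1.
LARGE_DEL = re.compile(r"^(?P<gene>[^@]+)@del_(?P<frac>[01](?:\.\d+)?)$")


def parse_large_del(text):
    """Return (gene, fraction), or None if text is not a large deletion."""
    match = LARGE_DEL.match(text)
    if match is None:
        return None
    return match.group("gene"), float(match.group("frac"))


def lower_bound_match(rule, mutation):
    """True if the mutation deletes at least the fraction named in the rule."""
    parsed_rule = parse_large_del(rule)
    parsed_mutation = parse_large_del(mutation)
    if parsed_rule is None or parsed_mutation is None:
        return False
    rule_gene, minimum = parsed_rule
    mutation_gene, fraction = parsed_mutation
    return rule_gene == mutation_gene and fraction >= minimum
```

Under this reading a `del_0.0` rule matches every large deletion in its gene, which is what would let it act as the default in the same way as the `*` rules.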
Possibly worth grouping together with the `piezo` changes required to flow through the `evidence`.
Agreed, using a lower-bound rule as the effective default makes sense, e.g. `pncA@del_0.1, U`. Might specify the threshold (or just below the threshold) for calling large deletions, to make clear that deletions covering less of the gene are handled differently.
Could we please keep this and the `evidence` separate (unless the latter isn't much work), since I need to process all of CRyPTIC Release Two and fixing this Issue will unblock that.
At present, very large deletions (I think the threshold is set at greater than 50% of a gene), e.g. `pncA@del_0.89`, do not hit any rules in a catalogue, which causes `piezo` to crash as per below.

We therefore need some default rules so that a large deletion in a resistance gene can return a `U` (equivalent to `pncA@indel_*` for smaller deletions), as well as specific rules with a % min threshold above which the rule is triggered (something like `pncA@del_>=0.5, R`). Hence, as usual, a specific rule can override a default rule.

In the longer term we might need to think about how we harmonise indels across the length scales, but that feels hard for now.

site.05.subj.PMOP-0621.lab.MOP-184.iso.1.v0.12.4.per_sample.vcf.gz
minor_alleles.txt
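
To make the default-versus-specific behaviour concrete, here is a rough sketch of how a threshold rule might override a default rule. It is not taken from `piezo`: the `RULES` list, its tuple layout, the 0.8 threshold and the function `predict_large_del` are all made up for illustration.

```python
# Hypothetical catalogue entries: (gene, minimum deleted fraction, prediction).
# The 0.8 threshold is an arbitrary value chosen for this example.
RULES = [
    ("pncA", 0.0, "U"),  # default rule: any large deletion in pncA
    ("pncA", 0.8, "R"),  # specific rule: at least 80% of the gene deleted
]


def predict_large_del(gene, fraction, rules):
    """Return the prediction of the matching rule with the highest threshold.

    Returns None when no rule matches, so the caller can decide how to
    handle a gene without a default rule.
    """
    hits = [
        (minimum, prediction)
        for rule_gene, minimum, prediction in rules
        if rule_gene == gene and fraction >= minimum
    ]
    if not hits:
        return None
    # The rule with the highest threshold is the most specific one that applies.
    return max(hits, key=lambda hit: hit[0])[1]
```

With these made-up rules, `predict_large_del("pncA", 0.89, RULES)` gives `"R"` because the specific rule wins, while `predict_large_del("pncA", 0.6, RULES)` falls back to the default `"U"`.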