broadinstitute / seqr-loading-pipelines

hail-based pipelines for annotating variant callsets and exporting them to elasticsearch
MIT License

add new clinvar pathogenicity and contig filter workaround #713

Closed jklugherz closed 8 months ago

jklugherz commented 8 months ago

bug 1: Hail 0.2.128 crashes when we filter the imported clinvar table with ht.filter(ht.locus.contig != 'MT'). If we leave that contig out when we import the VCF, the MT loci are skipped at import time and the filter is no longer needed.
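A rough sketch of the import-time workaround, not the exact code in this PR. It assumes a GRCh38 callset with ClinVar-style contig names ('1', 'X', 'MT'); build_contig_recoding and import_clinvar_vcf are hypothetical helper names. Since 'MT' is left out of the recoding map, it is not a valid GRCh38 contig, so skip_invalid_loci=True should drop those rows during import.

```python
def build_contig_recoding(include_mt: bool = False) -> dict:
    """Map ClinVar-style contig names ('1', 'X') to GRCh38 names ('chr1', 'chrX').

    Hypothetical helper: 'MT' is only mapped when include_mt is True.
    """
    contigs = [str(i) for i in range(1, 23)] + ["X", "Y"]
    recoding = {contig: f"chr{contig}" for contig in contigs}
    if include_mt:
        recoding["MT"] = "chrM"
    return recoding


def import_clinvar_vcf(path: str):
    """Import a ClinVar VCF as a Hail Table, skipping MT loci at import time."""
    import hail as hl  # imported lazily so the helper above works without Hail

    mt = hl.import_vcf(
        path,
        reference_genome="GRCh38",  # assumption: GRCh38 callset
        contig_recoding=build_contig_recoding(include_mt=False),
        skip_invalid_loci=True,  # un-recoded 'MT' is invalid in GRCh38, so it is skipped
        force_bgz=True,  # assumption: the .vcf.gz file is block-gzipped
    )
    return mt.rows()
```

With this, no ht.filter(ht.locus.contig != 'MT') call is needed after the import. For GRCh37, where 'MT' is a valid contig name, a different approach would be required.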

bug 2:

HailUserError: Error summary: HailException: Key "low_penetrance/Established_risk_allele" not found in dictionary. Keys: ["Affects","association","association_not_found","confers_sensitivity","drug_response","low_penetrance","no_classification_for_the_single_variant","no_classifications_from_unflagged_records","not_provided","other","protective","risk_factor",null]

The CLNSIG that caused the error: [ 'Pathogenic/Likely_pathogenic/Pathogenic', '_low_penetrance/Established_risk_allele', ]

We added a new .replace() line in parsed_clnsig() which adds a new pathogenicity 'Pathogenic/Likely_pathogenic/Established_risk_allele' above 'Pathogenic/Likely_pathogenic/Likely_risk_allele'.
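To illustrate the idea, here is a plain-Python sketch of the string handling. The real parsed_clnsig() works on Hail expressions, and the replacement rules below are assumptions chosen to reproduce the error and the new pathogenicity described above, not the repository's actual rules. parse_clnsig, OLD_RULES and NEW_RULES are hypothetical names.

```python
# Assumed pre-existing rule: split the low_penetrance assertion off the pathogenicity.
OLD_RULES = [
    (",_low_penetrance", "|low_penetrance"),
]

# Assumed new rule, applied first: fold the trailing risk-allele term into the
# pathogenicity and keep low_penetrance as a separate assertion.
NEW_RULES = [
    (
        "/Pathogenic,_low_penetrance/Established_risk_allele",
        "/Established_risk_allele|low_penetrance",
    ),
] + OLD_RULES


def parse_clnsig(clnsig, rules):
    """Join the CLNSIG array, apply each replacement in order, then split on '|'."""
    joined = ",".join(clnsig)
    for old, new in rules:
        joined = joined.replace(old, new)
    return joined.split("|")


clnsig = [
    "Pathogenic/Likely_pathogenic/Pathogenic",
    "_low_penetrance/Established_risk_allele",
]

# Without the new rule, the second element is the key the dictionary lookup rejected.
before = parse_clnsig(clnsig, OLD_RULES)

# With the new rule, we get the new pathogenicity plus a known assertion key.
after = parse_clnsig(clnsig, NEW_RULES)
```

Under these assumed rules, before ends in 'low_penetrance/Established_risk_allele' (the missing key from the error), while after splits into the new pathogenicity and 'low_penetrance', which is already in the dictionary.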

This requires no seqr changes because the new pathogenicity is within the pathogenic range.