Clinical-Genomics / genmod

Annotate models of genetic inheritance patterns in variant files (vcf files)
http://moonso.github.io/genmod/
MIT License

genmod score, data_type=string appears to match when the rule is a substring of the string in the VCF #87

Open bjhall opened 6 years ago

bjhall commented 6 years ago

If the VCF has "CLNSIG=Likely_pathogenic;", it incorrectly matches the Pathogenic rule, which has higher priority than the Likely_pathogenic rule, because "Pathogenic" is a case-insensitive substring of "Likely_pathogenic".

This appears to be the code doing the matching: https://github.com/moonso/extract_vcf/blob/master/extract_vcf/plugin.py#L342

Minimal example files: minimal_rankmodel_string.ini.txt minimal_string.vcf.txt

$ genmod score -i test -c minimal_rankmodel_string.ini.txt -r minimal_string.vcf.txt -o minimal_string.score

Gives me RankResult=5 (for Pathogenic), rather than the expected 2 (for Likely_pathogenic)!
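
To make the suspected behaviour concrete, here is a minimal sketch of the difference between substring matching and whole-value matching. It is not the actual extract_vcf code: the `RULES` table and the `score_substring` and `score_exact` functions are hypothetical, and it assumes rules are tried in priority order with the first hit winning. Only the scores 5 and 2 come from the report above.

```python
# Hypothetical rule table: (rule string, score), listed in priority order.
# The scores 5 and 2 are the ones reported above; everything else is illustrative.
RULES = [("Pathogenic", 5), ("Likely_pathogenic", 2)]


def score_substring(value, rules=RULES):
    """Return the score of the first rule found anywhere inside value (case-insensitive)."""
    for rule, score in rules:
        if rule.lower() in value.lower():
            return score
    return None


def score_exact(value, rules=RULES):
    """Return the score of the first rule equal to the whole value (case-insensitive)."""
    for rule, score in rules:
        if rule.lower() == value.lower():
            return score
    return None


clnsig = "Likely_pathogenic"
print(score_substring(clnsig))  # 5: "pathogenic" is contained in "likely_pathogenic"
print(score_exact(clnsig))      # 2: only the Likely_pathogenic rule equals the value
```

A real annotation field may hold several values separated by a delimiter, so a whole-value comparison would probably need to split the field first and compare each entry.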