PharmGKB / PharmCAT

The Pharmacogenomic Clinical Annotation Tool
Mozilla Public License 2.0

Discarded genotypes in report - reopen #128 #173

Closed krukanna closed 6 months ago

krukanna commented 6 months ago

I would like to reopen issue #128. I work with 30× WGS data. I am uploading a test VCF file containing the positions for a single gene, NUDT15, together with the final report in HTML format. Four positions are missing or not validated because of an invalid REF or ALT. All positions required by PharmCAT for this gene are present in the VCF, and all are reference calls.

No indel was detected in the sample, so positions chr13:48037782-48037788 appear on separate lines in the VCF; the same is true for the indel at chr13:48040977-48040978. Why can't PharmCAT read these lines and deduce that the sample is REF for these two indels (AGGAGTC and GA, respectively)? The HTML report says: "Discarded genotype at this position because REF in VCF does not match expected reference"
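To illustrate the deduction described above, here is a minimal Python sketch. It is not PharmCAT's implementation, only an assumption about how such a check could work: it joins the single-base REF values across a span and returns the joined allele only if every position is present and called homozygous reference. The function names (`make_lines`, `parse_vcf_lines`, `infer_ref_span`) and the VCF records are hypothetical, built from the REF bases quoted in this comment rather than copied from the attached file.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# (REF, ALT, GT) keyed by (chromosome, position)
Records = Dict[Tuple[str, int], Tuple[str, str, str]]


def make_lines(chrom: str, start: int, ref_bases: str, gt: str = "0/0") -> List[str]:
    """Build illustrative single-sample VCF data lines, one per reference base."""
    return [
        f"{chrom}\t{start + i}\t.\t{base}\t.\t.\tPASS\t.\tGT\t{gt}"
        for i, base in enumerate(ref_bases)
    ]


def parse_vcf_lines(lines: Iterable[str]) -> Records:
    """Parse single-sample VCF data lines into a position-keyed dictionary."""
    records: Records = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        fmt_keys = fields[8].split(":")
        sample = fields[9].split(":")
        records[(chrom, pos)] = (ref, alt, sample[fmt_keys.index("GT")])
    return records


def is_hom_ref(gt: str) -> bool:
    """True only if every called allele is the reference allele (index 0)."""
    return all(allele == "0" for allele in gt.replace("|", "/").split("/"))


def infer_ref_span(records: Records, chrom: str, start: int, end: int) -> Optional[str]:
    """Join per-position REF bases over [start, end], or return None if unsafe.

    None is returned when a position is missing, is not a single base,
    or is not homozygous reference, so nothing is assumed blindly.
    """
    bases = []
    for pos in range(start, end + 1):
        rec = records.get((chrom, pos))
        if rec is None:
            return None
        ref, _alt, gt = rec
        if len(ref) != 1 or not is_hom_ref(gt):
            return None
        bases.append(ref)
    return "".join(bases)


# Hypothetical records reconstructed from the REF bases quoted above.
records = parse_vcf_lines(
    make_lines("chr13", 48037782, "AGGAGTC") + make_lines("chr13", 48040977, "GA")
)
print(infer_ref_span(records, "chr13", 48037782, 48037788))  # AGGAGTC
print(infer_ref_span(records, "chr13", 48040977, 48040978))  # GA
```

Unlike a blanket assume-reference setting, this check returns `None` as soon as any position in the span is missing or carries a non-reference genotype, so a heterozygous call for some other ALT allele is never silently treated as REF.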

For positions chr13:48037825 and chr13:48041103 PharmCAT expects insertions, but no such insertions were detected in my sample, so these positions are again REF in the VCF: C/C and T/T. Why does PharmCAT ignore this information and return this warning in the HTML report: "Genotype at this position has no ALT allele and an indel or repeat is expected. PharmCAT cannot validate this position"

Unfortunately, I cannot use the '-0' option because the missing positions are not always REF; sometimes they are heterozygous for a different ALT allele (one that is not important in pharmacogenomics), and then the assumption is incorrect. It would be great if PharmCAT could handle cases like this, where the indel was not detected but the gene has been sequenced and the input file contains information about the REF positions. Attachment: test_vcf.zip

whaleyr commented 6 months ago

If you want to discuss #128 then you should post there. I'm closing this as a duplicate. Please copy your comment over there and we can keep the discussion going.