KarchinLab / open-cravat

A modular annotation tool for genomic variants
MIT License

Missing variants in the annotated VCF #156

Open skoul9 opened 1 year ago

skoul9 commented 1 year ago

I used OC to annotate a VCF file and noticed that the annotated VCF contains fewer variants than the original: three rows are missing. I traced one of them to a missing REF, which was reported in the log and err files, but I could not find a cause for the other two. I compared the FORMAT and INFO values of the dropped rows against the range of values in the remaining rows and did not notice anything unusual. Does OC have a filtering step? Note that the VCF versions differ: the original is 4.1, whereas OC generates 4.2. Is there anything else I could check?

kmoad commented 1 year ago

OC doesn't have a filtering step. If lines are missing, that's a problem. Can you send us the problem lines and the VCF header? If it helps with data privacy, you could email them to support@opencravat.org
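
For readers who need to collect the same information, here is a minimal Python sketch that pulls the header and the dropped data lines out of an original/annotated VCF pair. It is not part of OpenCRAVAT; the function names are hypothetical. It assumes the annotated VCF keeps CHROM, POS, REF, and ALT exactly as written in the original, which may not hold if the annotator renames chromosomes or rewrites alleles.

```python
import gzip
from collections import Counter


def open_text(path):
    """Open a plain or gzip-compressed VCF as text."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def split_vcf(lines):
    """Separate header lines (starting with '#') from data lines."""
    header, records = [], []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        (header if line.startswith("#") else records).append(line)
    return header, records


def variant_key(record):
    """Key a data line by CHROM, POS, REF, ALT (VCF columns 1, 2, 4, 5)."""
    cols = record.split("\t")
    return (cols[0], cols[1], cols[3], cols[4])


def missing_records(original_lines, annotated_lines):
    """Return the original header and the data lines absent from the annotated file.

    Assumption: a record counts as present if a line with the same
    CHROM/POS/REF/ALT appears in the annotated file. Duplicates are
    matched one-to-one, so a dropped duplicate is still reported.
    """
    header, original = split_vcf(original_lines)
    _, annotated = split_vcf(annotated_lines)
    seen = Counter(variant_key(r) for r in annotated)
    missing = []
    for record in original:
        key = variant_key(record)
        if seen[key] > 0:
            seen[key] -= 1
        else:
            missing.append(record)
    return header, missing
```

To use it, open both files with `open_text`, pass the two file objects to `missing_records`, and write the returned header followed by the missing lines to a small file that can be attached or emailed.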

skoul9 commented 1 year ago

Thanks! I emailed the header and the missing lines' data.