KarchinLab / open-cravat

A modular annotation tool for genomic variants
MIT License

Invalid Reference Base #146

Closed: mproberts99 closed this issue 1 year ago

mproberts99 commented 1 year ago

Hello,

I've been using OpenCravat to annotate variants from Clair3, the long-read SNV/indel caller. I noticed that Clair3's output is inconsistent, particularly for indels, in a way that causes issues with OC. Sometimes an indel contains lowercase letters in the reference or alt allele (e.g., Taag instead of TAAG), which causes the annotation to fail and leaves the variant out of the final output. Is there an easy fix for this, or would it be better to handle it upstream of OpenCravat?

Thank you!

kmoad commented 1 year ago

We'd like to support this without you needing to make upstream changes. Can you confirm that these two variants should be annotated the same way? We can certainly change our VCF processing to do so.

POS    REF   ALT
1000   T     Taag
1000   T     TAAG

mproberts99 commented 1 year ago

Hi Kyle, thanks for the quick response! Yes, I can confirm those two variants should be annotated the same way. The change would be helpful and would eliminate the need for an extra step before annotation. Thanks!

kmoad commented 1 year ago

This is now fixed. Lowercase bases are now accepted and are converted to uppercase when the input file is processed.

https://github.com/KarchinLab/open-cravat/commit/4ae17f01921f76496aea2a6e2c878c3f7d4cf66b
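
For anyone on an older version who needs the upstream workaround discussed above, here is a minimal sketch of uppercasing REF and ALT in a VCF data line. It is not the code from the linked commit; the function names `normalize_allele` and `normalize_vcf_line` are hypothetical, and the `chr1` chromosome and the other filler columns in the example records are assumptions, since the thread only gives POS, REF, and ALT. The sketch assumes tab-delimited VCF and only touches alleles made entirely of base letters, so symbolic alleles such as `<DEL>` are left as they are.

```python
import re

# Plain sequence alleles only. Symbolic alleles (<DEL>), breakends, and '*'
# are left untouched because their text can be case-sensitive.
_BASES = re.compile(r"[ACGTNacgtn]+")


def normalize_allele(allele: str) -> str:
    """Uppercase an allele if it is a plain base string; otherwise return it unchanged."""
    return allele.upper() if _BASES.fullmatch(allele) else allele


def normalize_vcf_line(line: str) -> str:
    """Uppercase REF and ALT (columns 4 and 5) of a tab-delimited VCF data line."""
    if line.startswith("#"):
        return line  # header and meta lines pass through unchanged
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 5:
        return line  # too few columns to be a data line; leave it alone
    fields[3] = normalize_allele(fields[3])
    # ALT may list several alleles separated by commas.
    fields[4] = ",".join(normalize_allele(a) for a in fields[4].split(","))
    return "\t".join(fields) + newline


# The two records from this thread (chromosome and filler columns are assumed).
lower = "chr1\t1000\t.\tT\tTaag\t.\tPASS\t."
upper = "chr1\t1000\t.\tT\tTAAG\t.\tPASS\t."
print(normalize_vcf_line(lower) == upper)
```

Applying `normalize_vcf_line` to each line of a Clair3 VCF before annotation should make the two records in the table above identical, which matches the behavior confirmed in this thread.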