Open davmlaw opened 2 months ago
If a variant is dropped it means that neither of the two anchors
could be mapped using the chain files or if the two anchors mapped to different chains, or if the two anchors mapped to locations too far from each other. Compared to other liftover tools the reasons for dropping a variant are more limited and basically it is always related to the chain breaking or missing at the locus. Would this information really be useful?
When a liftover of their data fails, the bio people pick half dozen variants then we do a deep dive to try and work out what happened. This is to verify that it's not due to a bug in the code
I think it reassures them that things are working correctly, and they appreciate much more seeing an error "two anchors mapped to locations too far from each other" vs "liftover failed"
I can try and write a pull request but haven't written C in close to 20 years...
Maybe give me some minimal documentation for how you would like to see the INFO field engineered, given the three possible return codes, in a way that you deem usable by bio people, and I will write the code myself
For my particular use case, I'll be processing the VCF file then displaying it in a GUI for the BIO people
There are no enums in VCF types so I guess just make it a string and fill it in with constants, I'd expect the info to look a bit like:
##INFO=<ID=REJECTION_REASON,Number=A,Type=String,Description="Reason for rejection. Will be 1 of NO_MAP (neither of the two anchors could be mapped using the chain files), DIFFERENT_CHAINS (two anchors mapped to different chains) or DISTANCE_EXCEEDED (the two anchors mapped to locations too far from each other)">
But feel free to change to whatever you think best. Thanks a lot!
variants can also end up in the reject list for other reasons, eg the contig not being defined in the header - there should be a code for this (eg VCF record error) to distinguish it from ones to do with chains
Hi, thanks looks good.
Ran dev version with --write-reject
and VCF was written with FILTER containing the reason
Liftover can fail for a variety of reasons...
Many of the liftover functions return -1 on any error, and the check is < 0
If you defined a range of negative constants, you could return specifically what went wrong, then look it up then add that to the INFO of rejected variants