Open mmokrejs opened 6 months ago
There seems to be some confusion. Running with `--local-csq` or without can give different outputs, and that's a feature, not a bug. In the haplotype-aware mode, multiple variants can be part of the same consequence prediction (a "compound variant"). The program outputs only one record detailing the compound variant and prints pointers to that record for the rest. There is no plan to change this.
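For a parser that needs to follow these pointers, a minimal Python sketch is shown here. It assumes each record has already been reduced to a position plus a list of consequence strings, and that a pointer is written as `@` followed by the position of the record holding the full consequence (as in the `@988` example in this thread). The function name `resolve_pointers`, the input layout, and the sample values are hypothetical and not part of bcftools.

```python
from collections import defaultdict


def resolve_pointers(records):
    """Replace '@POS' pointers with the full consequences found at POS.

    records: iterable of (pos, [consequence_string, ...]) tuples.
    Assumption: a pointer is '@' followed by an integer position.
    """
    records = list(records)

    # First pass: index every full (non-pointer) consequence by position,
    # so a pointer can be resolved wherever the target record sits.
    full_by_pos = defaultdict(list)
    for pos, csqs in records:
        for csq in csqs:
            if not csq.startswith("@"):
                full_by_pos[pos].append(csq)

    # Second pass: expand pointers; keep a pointer unchanged if its
    # target position has no full consequence in the input.
    resolved = []
    for pos, csqs in records:
        out = []
        for csq in csqs:
            if csq.startswith("@"):
                target = int(csq[1:])
                out.extend(full_by_pos.get(target, [csq]))
            else:
                out.append(csq)
        resolved.append((pos, out))
    return resolved


# Hypothetical input: the record at 989 points back to the one at 988.
example = [
    (988, ["missense|GENE1|TX1"]),
    (989, ["@988"]),
]
for pos, csqs in resolve_pointers(example):
    print(pos, csqs)
```

Indexing all records before resolving avoids depending on whether the detailed record comes before or after the lines that point to it.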
Hi Peter, I wrote a simple parser for the consequences output by the `--local-csq` code. It turned out the haplotype-aware code outputs a very different format. I am guessing there are some pointers to previous lines (see e.g. `@988` below), which make processing harder because I would need to cache the previous consequence (e.g. the `missense` value). But it also seems that two resulting consequences are on a single line. So in summary, `--local-csq` has output everything (actually three consequences) on a single line. The haplotype-aware caller has output 7 consequences spanning 3 lines, if I am counting correctly (yes, after reformatting through `bcftools +split-vep`).

The former `--local-csq` output was:

Would it be possible to mimic the previous output format?
I don't understand the dots in the 4th and 5th columns of the haplotype-aware results, or how to work with the supposedly "next data" section in columns 10 to 12. I know this is TSV output from `bcftools +split-vep`, but I hope you get my point. You have this particular test-case data in your email from yesterday. Am I just misunderstanding the output, or do I have unrealistic expectations of the output format?