BCArg closed this issue 1 year ago
Having a closer look at the entries that were removed, I see that they have a GQ (genotype quality) of 0, so I guess this is the reason.
I just noticed the same now. Wonder how these could be annotated as PASS?
They also have an undefined genotype (GT).
That's correct, GT is always './.' in the entries that I checked. Indeed it is a bit dodgy that they are annotated as PASS, but other quality parameters are poor. Anyway, I reckon your tool is performing as expected, so I will close this issue. Thanks for the assistance.
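To check how many records are affected, a minimal Python sketch along the following lines could list PASS records whose sample genotype is missing or whose GQ is 0. The function name and the example records are made up for illustration, and this is not how vcf2tsvpy itself decides what to drop; it only flags the pattern observed above.

```python
def suspicious_pass_records(vcf_lines):
    """Return (chrom, pos, sample, gt, gq) for PASS records whose sample
    genotype is missing (e.g. './.') or whose GQ is exactly 0."""
    samples = []
    hits = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        if len(fields) < 10 or fields[6] != "PASS":
            continue  # no genotype columns, or FILTER is not PASS
        keys = fields[8].split(":")
        # Fall back to generic names if the header line was not seen
        names = samples or [f"sample_{i + 1}" for i in range(len(fields) - 9)]
        for name, column in zip(names, fields[9:]):
            values = dict(zip(keys, column.split(":")))
            gt = values.get("GT", ".")
            gq = values.get("GQ", ".")
            # A genotype counts as missing when every allele is '.'
            missing_gt = all(a == "." for a in gt.replace("|", "/").split("/"))
            if missing_gt or gq == "0":
                hits.append((fields[0], int(fields[1]), name, gt, gq))
    return hits


# Made-up records for illustration only (not taken from the reported file)
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:GQ\t0/1:99",
    "chr1\t200\t.\tC\tT\t.\tPASS\t.\tGT:GQ\t./.:0",
    "chr1\t300\t.\tG\tA\t10\tLowQual\t.\tGT:GQ\t./.:0",
]
flagged = suspicious_pass_records(example)
print(flagged)
```

For a real file, the lines could be read with `open(path)` (or `gzip.open(path, "rt")` for a compressed VCF) and passed to the same function.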
Hey @BCArg, thanks for looking into this. One might of course consider implementing some warnings when encountering data like yours, but this is really unexpected input, I believe. Sadly, my experience is that VCFs from different callers rarely adhere strictly to the VCF specification, which makes it inherently difficult to cope with all scenarios. If you have not done so already, you might want to check out the vcf-validator to get some feel for how "valid" your VCF file is.
I have run vcf2tsvpy with default arguments, i.e. only the required arguments, with the following command:
I have noticed, however, that some entries from the vcf which have a PASS value under the FILTER column were excluded from the output tsv file. For example, the entry below is present in the vcf:
but it is not present in the output tsv file unless I pass --keep_rejected_calls, in which case the tsv file is complete. Below is a vimdiff screenshot, the left-hand side with --keep_rejected_calls, the right-hand side with only the required arguments.
Is this the expected behaviour? How come not passing --keep_rejected_calls excludes calls that have a PASS under FILTER?
Thanks in advance
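
As an alternative to comparing the two outputs by eye, a short Python sketch like the one below could list the variants present in one TSV but absent from the other. It assumes that both files share a tab-separated header with CHROM, POS, REF and ALT columns and that any leading comment line starts with '#'. The function names and the example rows are hypothetical.

```python
import csv


def variant_keys(lines, key_columns):
    """Read tab-separated lines (skipping '#' comment lines) and return one
    tuple of key-column values per data row."""
    reader = csv.DictReader(
        (line for line in lines if not line.startswith("#")), delimiter="\t"
    )
    return [tuple(row[c] for c in key_columns) for row in reader]


def keys_only_in_first(first_lines, second_lines,
                       key_columns=("CHROM", "POS", "REF", "ALT")):
    """Return variant keys found in the first TSV but not in the second,
    in order of first appearance, each reported once."""
    seen = set(variant_keys(second_lines, key_columns))
    missing = []
    for key in variant_keys(first_lines, key_columns):
        if key not in seen:
            missing.append(key)
            seen.add(key)  # avoid reporting the same variant twice
    return missing


# Made-up rows for illustration only
with_rejected = [
    "CHROM\tPOS\tREF\tALT\tFILTER",
    "chr1\t100\tA\tG\tPASS",
    "chr1\t200\tC\tT\tPASS",
]
default_run = [
    "CHROM\tPOS\tREF\tALT\tFILTER",
    "chr1\t100\tA\tG\tPASS",
]
dropped = keys_only_in_first(with_rejected, default_run)
print(dropped)
```

The returned keys can then be looked up in the original vcf to inspect the FILTER, GT and GQ values of the dropped records.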