Open ThomasHickman opened 6 years ago
Thanks for the report!
I think a temporary workaround would be to use `bcftools annotate` to remove AD (which hap.py doesn't use).
In the future, the code that uses VariantReader will be retired, but I'll include a fix for this in the next version.
At https://github.com/Illumina/hap.py/blob/6c907ce3b02956bc239022db6edea7c48d6ddb8b/src/c++/lib/variant/VariantReader.cpp#L735, the size allocated for the array `ad` is `adcount`, whereas the function at https://github.com/Illumina/hap.py/blob/6c907ce3b02956bc239022db6edea7c48d6ddb8b/src/c++/lib/tools/BCFHelpers.cpp#L532 (which gets called at VariantReader:738 with `ad`) writes `values.size()` elements (where `values` is a vector representing the elements in the `AD` entry of a sample), causing a buffer overflow.

This could be solved by any of: exiting on an incorrect AD field, truncating the interpreted `AD` fields so that they are of the correct length, or making the size of `AD` be the size of the actual `AD` fields (I haven't actually looked through the code to see how this code path is used and what would be appropriate).

This bug can be shown to happen in the following VCF (I've exaggerated the number of AD fields to make problems happen). This should segfault on the next `delete []` statement due to heap corruption:
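Separately from the reproduction, the overflow pattern and two of the proposed fixes (exit on an oversized AD, or truncate it) can be sketched in isolation. This is only a minimal illustration, not hap.py code: `copyADBounded`, `ADPolicy`, and the `kMissing` filler value are hypothetical names I made up for the example, and I'm assuming the destination is a plain `int` array whose capacity plays the role of `adcount`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical: what to do when a sample carries more AD values than expected.
enum class ADPolicy { Truncate, Reject };

// Hypothetical filler for slots the sample did not provide.
constexpr int kMissing = -1;

// Copy the values of one AD entry into a fixed-size buffer without ever
// writing past dst[capacity - 1]. Returns the number of values copied.
std::size_t copyADBounded(const std::vector<int>& values, int* dst,
                          std::size_t capacity, ADPolicy policy)
{
    assert(dst != nullptr || capacity == 0);

    // Option 1: refuse a record whose AD is longer than the buffer.
    if (values.size() > capacity && policy == ADPolicy::Reject) {
        throw std::length_error("AD has more values than the allocated buffer");
    }

    // Option 2: truncate, copying only as many values as fit.
    const std::size_t n = std::min(values.size(), capacity);
    std::copy_n(values.begin(), n, dst);

    // Pad any remaining slots so the caller never reads uninitialised memory.
    std::fill(dst + n, dst + capacity, kMissing);
    return n;
}
```

The third option (sizing the buffer from the data) would amount to allocating `values.size()` elements instead of `adcount`, for example by holding the values in a `std::vector<int>` rather than a `new[]` array; whether that is appropriate depends on how the rest of the code path uses `adcount`.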