samtools / bcftools

This is the official development repository for BCFtools. See installation instructions and other documentation here http://samtools.github.io/bcftools/howtos/install.html
http://samtools.github.io/bcftools/
Other
665 stars 240 forks source link

"Cannot constrain to A" error handling #534

Closed koelling closed 7 years ago

koelling commented 7 years ago

I am trying to genotype a sample against a reference VCF file, including homozygous 0/0 calls.

To do this, I first generated a TARGETS file from the reference VCF file which contains the three columns "CHR\tPOS\tALLELES", with ALLELES = "REF,ALT". I then use mpileup with the --positions TARGETS argument to generate a pileup at all sites, followed by a call with --keep-alts --multiallelic-caller --constrain alleles --targets-file TARGETS to calculate genotype likelihoods for each site.

This seems to work in principle, however it simply crashes in the rare case that the ALT allele in the sample does not match the ALT allele in the reference data set. The error message I get in these cases is "Cannot constrain to A" (where A is the ALT allele).

Looking at the code, the error is thrown by this line: https://github.com/samtools/bcftools/blob/3f13cfee86739a8904db15d69e12607600fd3f47/mcall.c#L1296

This makes it essentially impossible for me to use bcftools because there is no proper way to avoid this error. The only solution I could think of at the moment would be to call genotypes first without constraining alleles, annotate each site with the expected alleles, compare the ALTs to the expected ALTs, find the sites where there is a mismatch, remove those from the original pileup and then rerun the caller, this time including the --constrain parameter. However, this would obviously be an incredibly complicated way of dealing with this.

Is there any way you could change the way mismatches between the expected and actual ALT alleles are handled?

I think it would be much better if bcftools simply printed a warning to STDERR and then ignored the site, instead of immediately exiting as soon as it encounters such a problem.

In theory I think this might be as simple as making these three changes:

However, it would be great if someone more familiar with the code could have a look at this!

pd3 commented 7 years ago

Now addressed by this commit https://github.com/samtools/bcftools/commit/6cb790c5a575a35bf0e60e9eb95f6ecbbeb2bb56.

Thanks for this and sorry for the delay