Bcftools call crash (on illegal input)

jkbonfield commented 6 years ago

I'm fuzz testing things and found a bug which I don't know how to fix.

In mcall we have:

    // Get the genotype likelihoods                                                                                     
    call->nPLs = bcf_get_format_int32(call->hdr, rec, "PL", &call->PLs, &call->mPLs);
    if ( call->nPLs!=nsmpl*nals*(nals+1)/2 && call->nPLs!=nsmpl*nals )  // a mixture of diploid and haploid or haploid \
only                                                                                                                    
    error("Wrong number of PL fields? nals=%d npl=%d\n", nals,call->nPLs);

    // Convert PLs to probabilities                                                                                     
    int ngts = nals*(nals+1)/2;
    hts_expand(double, call->nPLs, call->npdg, call->pdg);
    set_pdg(call->pl2p, call->PLs, call->pdg, nsmpl, ngts, unseen);

So it explicitly checks nPLs against one of two possible values. If neither are true, then it calls error, but if either are true then it's fine. Given later on ngts = nals*(nals+1)/2 the first check is essentially call->nPLs!=nsmpl*ngts.

In my data we have nals=3 and nsmpl=1, meaning the check is nPLs != 6 && nPLs != 3 with ngts = 6.

Then set_pdg does this:


        for (j=0; j<n_gt; j++)
        {
            if ( PLs[j]==bcf_int32_vector_end )

So PLs[j] accesses from 0 to 5 (ngts-1), but we had a check on nPLs being 3 or 6. My file with 3 therefore causes it to crash.

Obviously this is wrong data, but that's what the fuzzer is there to generate. My question is what should the correct check be? I'm leaning towards this:

 if ( (call->nPLs!=nsmpl*nals*(nals+1)/2 && call->nPLs!=nsmpl*nals) || (nals+1)/2 > nsamp )

That appears to match the case of more genotypes than the number of samples implies is possible. Is this the correct check?

jkbonfield commented 6 years ago

Edit: not quite correct, but it's clearer anyway to simply rearrange it and have || nPLs < ngts as it's more descriptive.

Do you want a PR with a few simple fixes in (I have 3 I think so far)?

jkbonfield commented 6 years ago

See https://github.com/jkbonfield/bcftools/tree/afl_fix for my work in progress. Only a few tiny tweaks, but without these AFL has had over 200k crashes on mpileup | call so far! (I haven't had time to refuzz with the fix in place yet.)

pd3 commented 6 years ago

This code assumes it is processing mpileup output, so this case should never happen. But I am happy to accept the pull request. Thank you

jkbonfield commented 6 years ago

I guess it's fair that people use bcftools mpileup | bcftools call essentially as a single stage, so the call part should expect input from mpileup.

What's more of an issue is where mpileup crashes, as that's the sort of thing I could imagine a web based submission framework being open to abuse (depending on the nature of the problem). Crashing due to abort() calls isn't friendly, but there's not really anything you can exploit - and we have lots of those in htslib's vcf parser. Crashing due to reading beyond an array can rarely expose things it shouldn't, while crashing due to writing beyond an array will often permit remote code execution exploits.

Everything I've found so far though has either been in the bcftools call part or just assert(0) / abort() issues, so harmless I think. The only reason I fixed those was simply to make it easier to see any genuine crashes - I'm just spamming tests at the CIGAR-P patch basically.

samtools / bcftools

Bcftools call crash (on illegal input) #794