Closed daviesrob closed 1 year ago
Definitely an improvement, but I'm wondering now about the bcf_hdr_register_hrec
call at the start of the function. This takes a copy of hrec and adds it to the header dictionaries even when it subsequently fails later on.
I can't find a bcf_hdr_unregister_hrec
to remove it again if it fails. What is the consequence of having an extra dict item which isn't referred to by the header? Even if it's ignored, it does mean there is an extra pointer to it cached. Although maybe that's not an issue if the header freeing doesn't free via the hdr->dict
pointers but instead frees via get_hdr_aux()
function.
Ah wait, I'm wrong. hdr->dict
is get_hdr_aux()
, but for a specific element. :/
static inline bcf_hdr_aux_t *get_hdr_aux(const bcf_hdr_t *hdr)
{
return (bcf_hdr_aux_t *)hdr->dict[0];
}
That [0]
ought to be BCF_DT_ID
so it's not obscure, given that bcf_hdr_register_hrec
uses hdr->dict[BCF_DT_CTG]
meaning I had to go hunting for the defines to work out if it was manipulating the same dictionary. I'm also unsure why "aux" when it's "id"? There's some confusing terminology in this code.
Anyway, later on there is also use of BCF_DT_IT
in bcf_hdr_register_hrec
:
vdict_t *d = (vdict_t*)hdr->dict[BCF_DT_ID];
k = kh_get(vdict, d, str);
if ( k != kh_end(d) )
{
// already present
free(str);
if ( kh_val(d, k).hrec[info&0xf] ) return 0;
kh_val(d, k).info[info&0xf] = info;
kh_val(d, k).hrec[info&0xf] = hrec;
if ( idx==-1 ) {
if (hrec_add_idx(hrec, kh_val(d, k).id) < 0) {
return -1;
}
}
return 1;
}
k = kh_put(vdict, d, str, &ret);
So this function has clearly taken a copy of hrec
, irrespective of whether the function call fails, meaning the comment you added is incorrect. The user should probably just never touch hrec
again after calling our API, and maybe we should make it free hrec
itself given it mostly will already do this, even after an error.
Yes, leaving a dangling pointer in bcf_hdr_t::dict[].hrec[]
probably isn't good. I don't think it's a problem for cleaning up though - bcf_hdr_destroy()
removes hrec
s via bcf_hdr_t::hrec[]
. bcf_hdr_remove()
either iterates through bcf_hdr_t::hrec[]
or checks that the pointer it gets from the dictionary is in there although the latter is a bit dodgy as it might throw an assertion (or do nothing if assertions were disabled).
So I think the main risk of a dangling pointer in bcf_hdr_t::dict[].hrec[]
would come from failing to check that bcf_hdr_add_hrec()
worked, managing to obtain a copy via bcf_hdr_get_hrec()
and then using it. Apart maybe from the issue with bcf_hdr_remove()
, nothing should try to delete an hrec
that way.
I've added a commit that should prevent dangling pointers in bcf_hdr_t::dict[].hrec[]
.
While working on that, I also noticed that bcf_hdr_remove()
wasn't removing items from the bcf_hdr_aux_t::gen
dictionary. This is possibly a worse bug, as trying to retrieve the deleted item with bcf_hdr_get_hrec()
would return the stale pointer. A second commit fixes that problem, and adds some tests to ensure removal works correctly.
Scary stuff, but looks good. I checked it doesn't trigger issues with bcftools too.
Spotted while looking at the VCF parser.
bcf_hdr_add_hrec()
should neither callbcf_hrec_destroy(hrec)
nor store any pointers tohrec
when it returns -1, otherwise double frees or stale pointer dereferences may result.Remove
bcf_hrec_destroy(hrec)
call that was incorrectly made when handling a hash table insert failure, and movehdr->hrec
reallocation so that all possible failures occur beforehrec
is added into the header.Add
bcf_hdr_add_hrec()
documentation, including a warning that the caller should not touchhrec
after a successful return.It looks like the bug appeared in 3fe2a59034. You'd have to be very unlucky to actually trigger it as it requires a particular allocation to fail, but it's best to ensure that the problem can't occur.