Closed daviesrob closed 8 months ago
The change looks good, but I'm not sure it's sufficient.
It fixes the fuzzing problem, but it still generates a broken aux record as it doesn't add the nul termination when storing the RG verbatim. This then means we convert from BAM to CRAM without an error (just a warning), but produce a CRAM which cannot be decoded:
$ ./test/test_view /tmp/_.cram
[E::sam_format1_append] Corrupted aux data for read
Error writing output.
I'm tempted to say we should just short-cut the issue by having a hard fail. If aux is detected to be malformed, don't attempt to continue.
I can easily switch that to a hard fail, but should the same also be done for other Z-type tags? At the moment, they will also presumably make broken cram files.
Yes I guess it's something that is a general issue. So maybe it's a separate problem.
I've updated to make it fail on the bad data, and also added similar checks to the other places that handled Z
and H
type tags.
Thanks. Seems to do the trick :)
RG:Z tags without a proper NUL termination could lead to use of invalid data, or a heap overflow when the tag is passed to sam_hrecs_find_rg(), or hts_log_warning() if the former returns NULL. Fix by moving the line that skips to the end of the aux tag and then checking that it was terminated correctly. Should it not be, the aux parser is reset so the tag can be stored verbatim (the code that does that already handles badly-terminated Z tags).
Credit to OSS-Fuzz Fixes oss-fuzz 66369