samtools / htslib

C library for high-throughput sequencing data formats

Fix `@PG` linking when records make a loop #1702

Closed by daviesrob 7 months ago

daviesrob commented 7 months ago

If all the @PG records passed to sam_hdr_link_pg() form a single PP loop, every entry in the sam_hrecs_t::pg_end array it builds is set to -1, indicating that there are no chain start points. These entries are then removed to make the final list but, due to a bug in handling the case where there are no PP links, the number of entries was incorrectly set to 1 instead of 0. This could lead to an out-of-bounds read in sam_hdr_add_pg() when linking new @PG entries to the existing ones.

Fix by ensuring that only valid end points are returned in the sam_hrecs_t::pg_end array, and the length is set to zero if no ends are detected due to a loop.
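To illustrate the idea, here is a minimal standalone sketch of the end-point computation. It is not the htslib code: `pg_rec` and `find_chain_ends()` are hypothetical names, and each record is reduced to the index of the record its PP tag refers to. Every record starts as a candidate end, any record that is the target of a PP link is marked -1, and the array is then compacted so that only valid ends remain and the returned count can be zero.

```c
#include <stddef.h>

/* Hypothetical, simplified model of a @PG record: only the index of the
 * record named by its PP tag is kept (-1 when there is no PP tag). */
typedef struct {
    int prev;
} pg_rec;

/* Fill ends[] with the indexes of records that no other record points to
 * via PP, and return how many were found.  Returns 0 when every record is
 * part of a loop, so the caller never sees a leftover -1 entry.
 * ends[] must have room for n entries. */
static size_t find_chain_ends(const pg_rec *recs, size_t n, int *ends)
{
    size_t i, count = 0;

    /* Start by assuming every record is the end of a chain. */
    for (i = 0; i < n; i++)
        ends[i] = (int) i;

    /* A record that is the PP target of another record is not an end. */
    for (i = 0; i < n; i++) {
        if (recs[i].prev >= 0 && (size_t) recs[i].prev < n)
            ends[recs[i].prev] = -1;
    }

    /* Compact the array, keeping only valid end points. */
    for (i = 0; i < n; i++) {
        if (ends[i] != -1)
            ends[count++] = ends[i];
    }

    return count;
}
```

With three records linked 0 -> 2 -> 1 -> 0, every entry is marked -1 and the function returns 0. A caller that indexes `ends[]` must therefore check the count first.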

Adds a warning if a @PG record is found with a PP link to itself. Detecting longer loops is left for future work. Fixes another warning which incorrectly said 'SN' instead of 'ID' in its message.
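The self-link case is the simplest loop to detect, since it only needs the record's own ID and PP values. A minimal sketch, assuming both tag values are available as NUL-terminated strings (`pg_links_to_self()` is a hypothetical helper, not an htslib function):

```c
#include <string.h>

/* Return 1 if a @PG record's PP tag names the record's own ID, else 0.
 * Either argument may be NULL when the corresponding tag is absent. */
static int pg_links_to_self(const char *id, const char *pp)
{
    return id != NULL && pp != NULL && strcmp(id, pp) == 0;
}
```

Longer loops cannot be found this way, as they require following PP links across several records.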

Adds an assert() in sam_hdr_add_pg() to catch any other cases of the out-of-bounds read.

Adds tests for loopy @PG records.

Thanks to @OctavioGalland for the bug report. Fixes #1694