samtools / htslib

C library for high-throughput sequencing data formats

Allocation size too big/invalid memory access during `extend_ref`/`cram_add_to_ref` #1699

Closed OctavioGalland closed 7 months ago

OctavioGalland commented 7 months ago

Summary

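`extend_ref` requests an excessively large allocation when samtools converts a crafted SAM file to CRAM against a FASTA reference. If the allocation failure is ignored (for example with ASAN's `allocator_may_return_null=1`), it leads to an invalid memory read in `cram_add_to_ref`.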
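To make the failure mode concrete, here is a minimal sketch of the general pattern: a buffer is grown to cover a position taken from untrusted input, and the growth is guarded by a range check and a checked `realloc`. This is not htslib's `extend_ref`. The names `buf_t`, `grow_buf` and `MAX_BUF_LEN` are hypothetical, and the 1 GiB cap is an arbitrary assumption chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; not an htslib type. */
typedef struct {
    char  *data;
    size_t len;   /* number of bytes currently allocated */
} buf_t;

/* Assumed upper bound on a sane buffer size (1 GiB); illustrative only. */
#define MAX_BUF_LEN (UINT64_C(1) << 30)

/* Grow b so that index pos is valid.
 * Returns 0 on success and -1 on failure.
 * On failure b is left unchanged, so the caller can still use or free it. */
static int grow_buf(buf_t *b, int64_t pos) {
    assert(b != NULL);

    /* Reject negative or absurdly large positions before computing a size. */
    if (pos < 0 || (uint64_t)pos >= MAX_BUF_LEN)
        return -1;

    size_t need = (size_t)pos + 1;
    if (need <= b->len)
        return 0;                       /* already large enough */

    /* Keep the old pointer until realloc is known to have succeeded. */
    char *tmp = realloc(b->data, need);
    if (tmp == NULL)
        return -1;

    memset(tmp + b->len, 0, need - b->len);  /* zero the newly added tail */
    b->data = tmp;
    b->len  = need;
    return 0;
}
```

A caller would check the return value of `grow_buf` and abandon the record on `-1` rather than continuing to index into the buffer. Continuing after a failed allocation is the step that corresponds to the second crash reported below.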

Environment

Built using LLVM 14 with ASAN on Ubuntu 22.04

How to reproduce

Build htslib and samtools with ASAN at the latest commit like so:

git clone --recursive https://github.com/samtools/htslib
cd htslib
autoreconf -i
CC=clang-14 CXX=clang++-14 CFLAGS="-fsanitize=address -g" CXXFLAGS="-fsanitize=address -g" LDFLAGS="-fsanitize=address -g" ./configure
make -j$(nproc)

cd ..
git clone --recursive https://github.com/samtools/samtools
cd samtools
autoheader
autoconf -Wno-syntax
CC=clang-14 CXX=clang++-14 CFLAGS="-fsanitize=address -g -I$(pwd)/../htslib" CXXFLAGS="-fsanitize=address -g -I$(pwd)/../htslib" LDFLAGS="-fsanitize=address -g -L$(pwd)/../htslib" ./configure
make -j$(nproc)

Within the samtools folder, decode the PoC file and reproduce with:

echo -ne "QFNRCVNOOmMxCUxOOjFATDIyAAACACoKczEJMAljMQkyCTAJMTBJCSoJMAkwCUFDQ0fDUUdUVEEJ
KglDWlpaWloqCnMxCTAJYzEJMTExMTExMTExMTExMTExMTEyCTAJMTBNCSoJMAkwCUFDQ0fDUUdU
VEEJKglDWlpaWlpaCSoqKkkyKzIyMjIyCUhEOmMyMjIgIglDWlpaWlpaCU1EQ1oAKgpzMQkwCWMx
CTIJMAkxMEkJKgkwCTAJQUNDR8NRR1RUQQkqCUNaWlpaWioKczEJMAljMQkyCTAJMTBJCSoJMAkw
CUFDQ0fDUUdUVEEJbGxsbGxsbGxsbGxsbGxsbGxsbGxsKglDWlpaWlpaCSoyMjIyMjIyMjIyMjIy
MjIaMjIyMjIyMjIyMjIyMjIyOzIyMjIyMjIyMjIyMjIyMjIyMjIyKioqSTIrMjIJTURDWloyMjIg
MgpzMQkwCWMxCTIJMAkxME0JKgkwBDAJQUNDR8NRR1RUQQkqCUNaWlpaWlpAU1EJU04JQFNeCVNO
OlNkMgkwCWMxIzP0LwkxME0JKgkwCTAJQ0NHQ0dHVFRDRwkqKioqKioqKioqMjJHVFQJCTAJQUFD
Q0dDTzhNVAkqKioqKioqKioqCnMxMjIyMjIyMjI=" | base64 -d > poc
./samtools view -C -T ../htslib/test/c2.fa poc

Which on my setup outputs:

[W::sam_hdr_sanitise] Unexpected NUL character in header. Possibly truncated
[W::sam_hdr_sanitise] Missing trailing newline on SAM header. Possibly truncated
[W::cram_get_ref] Reference file given, but ref 'c1' not present
[W::cram_get_ref] Failed to populate reference for id 0
[W::cram_write_SAM_hdr] No M5 tags present and could not find reference
[W::cram_write_SAM_hdr] Enabling embed_ref=2 option
[W::cram_write_SAM_hdr] NOTE: the CRAM file will be bigger than using an external reference
CRAM-���i���yyu@SQ  SN:c1   LN:1@L22
@PG ID:samtools PN:samtools VN:1.18-21-g528e1b2 CL:./samtools view -C -T ../htslib/test/c2.fa poc
g�cAA�{Np[E::sam_parse1] SEQ and QUAL are of different length
[W::sam_read1_sam] Parse error at line 5
=================================================================
==12293==ERROR: AddressSanitizer: requested allocation size 0x2501e734690ae7f (0x2501e734690be80 after adjustments for alignment, red zones etc.) exceeds maximum supported size of 0x10000000000 (thread T0)
    #0 0x55b435cb1d96 in __interceptor_realloc (/home/octavio/samtools/samtools+0x133d96) (BuildId: 7078ea94d4e08689f85e1df47e2d609c021d2440)
    #1 0x55b4360a2071 in extend_ref /home/octavio/htslib/cram/cram_encode.c:1455:17
    #2 0x55b4360a3ed5 in cram_add_to_ref_MD /home/octavio/htslib/cram/cram_encode.c:1545:17
    #3 0x55b4360a238a in cram_add_to_ref /home/octavio/htslib/cram/cram_encode.c:1579:19
    #4 0x55b43608c275 in cram_generate_reference /home/octavio/htslib/cram/cram_encode.c:1668:13
    #5 0x55b43607fee5 in cram_encode_container /home/octavio/htslib/cram/cram_encode.c:1876:17
    #6 0x55b4360f296c in cram_flush_container /home/octavio/htslib/cram/cram_io.c:4128:14
    #7 0x55b4360f3795 in cram_flush_container_mt /home/octavio/htslib/cram/cram_io.c:4280:16
    #8 0x55b4360fe742 in cram_flush /home/octavio/htslib/cram/cram_io.c:5431:19
    #9 0x55b435f8d2e7 in hts_flush /home/octavio/htslib/hts.c:1667:16
    #10 0x55b435f06084 in vprint_error_core /home/octavio/samtools/sam_utils.c:48:26
    #11 0x55b435f0645c in print_error_errno /home/octavio/samtools/sam_utils.c:71:5
    #12 0x55b435d0f472 in stream_view /home/octavio/samtools/sam_view.c:762:9
    #13 0x55b435d0abe8 in main_samview /home/octavio/samtools/sam_view.c:1363:15
    #14 0x55b435d89eed in main /home/octavio/samtools/bamtk.c:244:55
    #15 0x7f0a51229d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16

==12293==HINT: if you don't care about these errors you may set allocator_may_return_null=1
SUMMARY: AddressSanitizer: allocation-size-too-big (/home/octavio/samtools/samtools+0x133d96) (BuildId: 7078ea94d4e08689f85e1df47e2d609c021d2440) in __interceptor_realloc
==12293==ABORTING
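For scale, the two sizes in the ASAN report above can be compared directly. The sketch below only does arithmetic on the values printed in the log. The helper name `times_over_limit` is hypothetical.

```c
#include <stdint.h>

/* Values copied from the ASAN report above. */
static const uint64_t requested = UINT64_C(0x2501e734690ae7f);
static const uint64_t asan_max  = UINT64_C(0x10000000000);  /* 2^40 bytes = 1 TiB */

/* How many times larger than the limit the request is (integer division). */
static uint64_t times_over_limit(uint64_t req, uint64_t max) {
    return req / max;
}
```

`times_over_limit(requested, asan_max)` evaluates to 0x2501e, i.e. the request is roughly 150,000 times the 1 TiB limit, which points to a size derived from unvalidated input rather than ordinary memory pressure.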

If I let ASAN ignore this error by running:

ASAN_OPTIONS="allocator_may_return_null=1" ./samtools view -C -T ../htslib/test/c2.fa poc

I get:

[W::sam_hdr_sanitise] Unexpected NUL character in header. Possibly truncated
[W::sam_hdr_sanitise] Missing trailing newline on SAM header. Possibly truncated
[W::cram_get_ref] Reference file given, but ref 'c1' not present
[W::cram_get_ref] Failed to populate reference for id 0
[W::cram_write_SAM_hdr] No M5 tags present and could not find reference
[W::cram_write_SAM_hdr] Enabling embed_ref=2 option
[W::cram_write_SAM_hdr] NOTE: the CRAM file will be bigger than using an external reference
CRAM-���i���yyu@SQ  SN:c1   LN:1@L22
@PG ID:samtools PN:samtools VN:1.18-21-g528e1b2 CL:./samtools view -C -T ../htslib/test/c2.fa poc
g�cAA�{Np[E::sam_parse1] SEQ and QUAL are of different length
[W::sam_read1_sam] Parse error at line 5
==12298==WARNING: AddressSanitizer failed to allocate 0x2501e734690ae7f bytes
AddressSanitizer:DEADLYSIGNAL
=================================================================
==12298==ERROR: AddressSanitizer: SEGV on unknown address (pc 0x56459f43aa16 bp 0x7fff399273f0 sp 0x7fff39927250 T0)
==12298==The signal is caused by a READ memory access.
==12298==Hint: this fault was caused by a dereference of a high value address (see register values below).  Disassemble the provided pc to learn which register was used.
    #0 0x56459f43aa16 in cram_add_to_ref /home/octavio/htslib/cram/cram_encode.c:1615:60
    #1 0x56459f424275 in cram_generate_reference /home/octavio/htslib/cram/cram_encode.c:1668:13
    #2 0x56459f417ee5 in cram_encode_container /home/octavio/htslib/cram/cram_encode.c:1876:17
    #3 0x56459f48a96c in cram_flush_container /home/octavio/htslib/cram/cram_io.c:4128:14
    #4 0x56459f48b795 in cram_flush_container_mt /home/octavio/htslib/cram/cram_io.c:4280:16
    #5 0x56459f496742 in cram_flush /home/octavio/htslib/cram/cram_io.c:5431:19
    #6 0x56459f3252e7 in hts_flush /home/octavio/htslib/hts.c:1667:16
    #7 0x56459f29e084 in vprint_error_core /home/octavio/samtools/sam_utils.c:48:26
    #8 0x56459f29e45c in print_error_errno /home/octavio/samtools/sam_utils.c:71:5
    #9 0x56459f0a7472 in stream_view /home/octavio/samtools/sam_view.c:762:9
    #10 0x56459f0a2be8 in main_samview /home/octavio/samtools/sam_view.c:1363:15
    #11 0x56459f121eed in main /home/octavio/samtools/bamtk.c:244:55
    #12 0x7f7cf2e29d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #13 0x7f7cf2e29e3f in __libc_start_main csu/../csu/libc-start.c:392:3
    #14 0x56459efc6b24 in _start (/home/octavio/samtools/samtools+0xb0b24) (BuildId: 7078ea94d4e08689f85e1df47e2d609c021d2440)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /home/octavio/htslib/cram/cram_encode.c:1615:60 in cram_add_to_ref
==12298==ABORTING