samtools / htslib

C library for high-throughput sequencing data formats

Assertion violation during `cram_beta_encode_init` #1697

Closed OctavioGalland closed 7 months ago

OctavioGalland commented 7 months ago

Summary

Assertion failure in cram_beta_encode_init when samtools converts a crafted SAM file to CRAM against a FASTA reference.

Environment

Built using LLVM 14 with ASAN on Ubuntu 22.04

How to reproduce

Build htslib and samtools with ASAN from the latest commits:

git clone --recursive https://github.com/samtools/htslib
cd htslib
autoreconf -i
CC=clang-14 CXX=clang++-14 CFLAGS="-fsanitize=address -g" CXXFLAGS="-fsanitize=address -g" LDFLAGS="-fsanitize=address -g" ./configure
make -j$(nproc)

cd ..
git clone --recursive https://github.com/samtools/samtools
cd samtools
autoheader
autoconf -Wno-syntax
CC=clang-14 CXX=clang++-14 CFLAGS="-fsanitize=address -g -I$(pwd)/../htslib" CXXFLAGS="-fsanitize=address -g -I$(pwd)/../htslib" LDFLAGS="-fsanitize=address -g -L$(pwd)/../htslib" ./configure
make -j$(nproc)

From within the samtools folder, create the PoC file and reproduce with:

echo -ne "QFNRCVNOOmMyCUxOOis3Nzc3Nzc3Nzc3KysrKysrKysgAAAAMApAU1Gbm5ubm5ubm4+bmwkqCTAJ
MAlBQ0NHQzAJMAlBQ0NHQ0dHm5ubm5u0m5ubm5ubm5ubm5ubm5ubm5ubm5ubm5ubm5ubm5ubm5ub
m5ubm5ubm5ubm5ubCUxOOis3OkkqKpubm5ubm5ubm5ubm5ubm5ubm5ubm5ubm5ubm5ubm5ubm5ub
m5ubCUxOOis3CnMxCTMJYzEJMgkwCTEwTQkqCTAJMAlBJkNHQ0dHVFRDCSoJQU46STEtMTAKczAJ
MAljMQkxCTAJMTBNCSoJMAkwCUFBQzBHQ0dHVFQJKioqKioqKioqKgpzMQkxMTExMTExMTExMTEx
MAljMQkxMTExMTExMTExMTExCTAJMTBNCSoJMAkwCUFBQ0NHQ0dHVFQJKioqKioqKioqKgpzMQkw
CWMxCTIJMAkxMGoJKgkwCTAJQUNDR0NHR1RUQwlhYWFhYWFhYWFhCnMxCTMJYzEJQAlNCSoJCUFO
OkkxLTEeMDFhYWFhYWFhK0FOOkkqKgpzYzEJMgkSCTMwTQkqCTAJMAlBJkNHQ0dHVFRDCSoJQU46
YTAKczAJMAljMQkxCTAJMTBNCSoJMAkwCUFBQ0NHQ0dHVFQKKioqKioqKioqKgpzMQkwCWMxCTIJ
MAkxMDcrKysrKysrKysrKysrKxMrKysgAAEAMApAU1EPQSy2CUxOOlNRCVNOOmMxUkdHVFZFclZW
AgAKQFNRU1EJMAk3IAAAADAKQFNRD1QJKioKczEJMwljMQkyCTAJMTBNm5ubm5ubm5ubm5ub" | base64 -d > poc
./samtools view -C -T ../htslib/test/c2.fa poc

Which on my setup outputs:

[W::sam_hdr_create] Ignored @SQ line with missing SN: tag
[W::sam_hdr_sanitise] Unexpected NUL character in header. Possibly truncated
[W::sam_hdr_sanitise] Missing trailing newline on SAM header. Possibly truncated
[W::sanitise_SQ_lines] Header @SQ length mismatch for ref c2, 7777777777 vs 9
CRAM[binary CRAM output written to the terminal omitted]
[W::sam_parse1] unrecognized reference name "c1"; treated as unmapped
[W::sam_parse1] unrecognized reference name "c1"; treated as unmapped
[W::sam_parse1] unrecognized reference name "c1"; treated as unmapped
[W::sam_parse1] unrecognized reference name "c1"; treated as unmapped
[E::parse_cigar] Unrecognized CIGAR operator
[W::sam_read1_sam] Parse error at line 6
samtools: cram/cram_codecs.c:1283: cram_codec *cram_beta_encode_init(cram_stats *, enum cram_encoding, enum cram_external_type, void *, int, varint_vec *): Assertion `max_val >= min_val' failed.
Aborted (core dumped)
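
The failed condition, max_val >= min_val, is the kind of invariant that breaks when a minimum/maximum scan runs over a data series in which no value was ever recorded, so the minimum keeps its "largest possible" start value and the maximum its "smallest possible" one. Whether that is what happens here has not been confirmed. The C sketch below only illustrates the pattern and one way to turn input-dependent emptiness into an error return instead of an abort. Every name in it (demo_value_range, demo_beta_nbits) is hypothetical; it is not htslib's code and not the fix applied upstream.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for encoder statistics: freqs[v] counts how often
 * value v was observed. This is NOT htslib's cram_stats structure. */

/* Scan freqs for the smallest and largest observed values.
 * Returns 0 on success, -1 if no value was ever recorded. */
int demo_value_range(const int *freqs, size_t n, int *min_val, int *max_val)
{
    int lo = INT_MAX, hi = INT_MIN;

    for (size_t i = 0; i < n; i++) {
        if (!freqs[i])
            continue;
        if (lo > (int)i)
            lo = (int)i;
        hi = (int)i;
    }

    /* Empty series: lo and hi still hold their start values, so hi < lo.
     * Reporting an error here keeps untrusted input from reaching an assert. */
    if (hi < lo)
        return -1;

    *min_val = lo;
    *max_val = hi;
    return 0;
}

/* Number of bits needed to store each value v in [min_val, max_val]
 * as the offset (v - min_val). */
int demo_beta_nbits(int min_val, int max_val)
{
    /* Internal invariant only: callers are expected to have validated the
     * range with demo_value_range() first. */
    assert(max_val >= min_val);

    unsigned int range = (unsigned int)max_val - (unsigned int)min_val;
    int nbits = 0;

    while (range) {
        nbits++;
        range >>= 1;
    }
    return nbits;
}
```

With this split, a caller checks the return value of demo_value_range() and fails the write cleanly on -1, and the assertion in demo_beta_nbits() remains as a guard against programming errors rather than against malformed input.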