samtools / htslib

C library for high-throughput sequencing data formats

Assertion violation during `cram_ref_decr_locked` #1696

Closed by OctavioGalland 10 months ago

OctavioGalland commented 10 months ago

Summary

Assertion failure in cram_ref_decr_locked when `samtools view -C` converts a crafted SAM file to CRAM against a reference FASTA file.

Environment

Built using LLVM 14 with ASAN on Ubuntu 22.04

How to reproduce

Build htslib and samtools with ASAN from the latest commit like so:

git clone --recursive https://github.com/samtools/htslib
cd htslib
autoreconf -i
CC=clang-14 CXX=clang++-14 CFLAGS="-fsanitize=address -g" CXXFLAGS="-fsanitize=address -g" LDFLAGS="-fsanitize=address -g" ./configure
make -j$(nproc)
cd ..

git clone --recursive https://github.com/samtools/samtools
cd samtools
autoheader
autoconf -Wno-syntax
CC=clang-14 CXX=clang++-14 CFLAGS="-fsanitize=address -g -I$(pwd)/../htslib" CXXFLAGS="-fsanitize=address -g -I$(pwd)/../htslib" LDFLAGS="-fsanitize=address -g -L$(pwd)/../htslib" ./configure
make -j$(nproc)

From within the samtools directory, decode the PoC file and reproduce with:

echo -ne "QFJHCUlEOgBAUkcJSUQ6AEBTUQlNTjpjRAlMTjqAMFpRCUFWOmNTMwlTTjoNDQ1AUEcJSUQ6UEc6
UEcJUCA6MVAKQFBHCUlEOjBAUyoqKioqLSoqKioKczEJMAljMgkyCTAJMTBNCSoJMAkwCUFDQ0dD
R0dUVEMJKioqKioqKioqKgpzCTAJYzEJMQkwCTEwTQkqCTAJMAlBQUNDRy5HR1RUCSoqKioqLSoq
KioKczEJMAljMgkyCTAJMTBNCSoJMAkwCUFDQ0dDR0dUVEsJKioqKioqKioqKgpzMgkwMAljMQkx
CTAJMTBNCSoJMAkwCUFBQ0NHLkdHVFQJKioqKiotKioqKgpzMQkwCWMyCTIJMAkxME0JKgkwCTAJ
QUNDR0NHR1RUQwkqKioqKioqKioqCnMJMAljMQkxCTAJMTBNCSoJMAkwCUFBQ0NHLkdHVFQJKioq
KiotKioqKgpzMQkwCWMyCTIJMAkxME0JKgkwCTAJQUNDR0NHR1RUSwkqKioqKioqKioqCnMyCTAw
CWMxCTEJMAkxME0JKgkwCTAJQUFDQ0cuR0dUVAkqKioqKi0qKioqCnMxCTAJYzIJMgkwCTEwTQkq
CTAJMAlBQ0NHQ0dHVFRDCSoqKioqKioqKioKcwkwCWMxCTEJMAkxME0JKgkwCTAJQUFDQ0cuR0dU
VAkqKioqKi0qKioqCnMxCTAJYzIJMgkwCTEwTQkqCTAJMAlBQ0NHQ0dHVFRDCSoqKioqKioqKioK
czIJMAljMQkzCTAJMTBNCSoJMAkwCUNDR0NHR1RUQ0cJKioqKioqMgkwCWMxCTMJMQkwCTEwTQkq
CTAJMAlBQUNDRy5HR1RUCSoqKioqLSoqKioKczEJMAljMgkyCTAJCUxOOjpQRwlQUDoxMApAUEcJ
SUQ6UEb+UCA6MVAKQFBHCUlEOjBAUwlMTjo6UEcJUFA6MTAKQFBHCUlEOjEwCVAgOjFQCkBQRwlJ
RDowQFMJTE46OlBHCVBQOjEwCkBQRw==" | base64 -d > poc
./samtools view -C -T ../htslib/test/c2.fa poc

On my setup, this outputs:

[W::sam_hdr_sanitise] Unexpected NUL character in header. Possibly truncated
[W::sam_hdr_sanitise] Missing trailing newline on SAM header. Possibly truncated
[W::sam_hrecs_update_hashes] Duplicate entry "" in sam header
[W::sam_hrecs_update_hashes] Duplicate entry "" in sam header
@PG' not present] Reference file given, but ref '
[W::cram_get_ref] Failed to populate reference for id 1
[W::cram_write_SAM_hdr] No M5 tags present and could not find reference
[W::cram_write_SAM_hdr] Enabling embed_ref=2 option
[W::cram_write_SAM_hdr] NOTE: the CRAM file will be bigger than using an external reference
CRAM[binary CRAM output written to stdout omitted]
[W::sam_parse1] unrecognized reference name "c1"; treated as unmapped
[W::sam_parse1] unrecognized reference name "c1"; treated as unmapped
samtools: cram/cram_io.c:3195: void cram_ref_decr_locked(refs_t *, int): Assertion `r->ref_id[id]->count == 0' failed.
Aborted (core dumped)
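
The log above is hard to read for two reasons: the CRAM stream goes to the terminal, and one warning line appears scrambled. The scrambling would be consistent with carriage returns in the crafted header, but that is an assumption. Below is a minimal sketch for triaging the run; it is not part of the original report. The helpers `visible` and `describe_status` and the file name `stderr.log` are hypothetical names chosen for illustration. The sketch relies on the `-o` output option of `samtools view`, on `cat -v`, and on the shell convention that a process killed by SIGABRT (signal 6 on Linux) reports status 128 + 6 = 134.

```shell
#!/bin/sh
# Hypothetical helper: render control characters visibly (e.g. CR as ^M, NUL as ^@).
visible() {
    cat -v
}

# Hypothetical helper: map a shell exit status to a short description.
# 134 = 128 + 6 (SIGABRT on Linux), the status a failed assert() normally produces.
describe_status() {
    case "$1" in
        0)   echo "ok" ;;
        134) echo "aborted (SIGABRT)" ;;
        *)   echo "exit status $1" ;;
    esac
}

# Run only when the binary and the PoC from the steps above are present.
if [ -x ./samtools ] && [ -f poc ]; then
    # Show the first header lines with control characters made visible.
    head -n 2 poc | visible

    # -o /dev/null keeps the binary CRAM stream off the terminal;
    # warnings and the assertion message are captured in stderr.log.
    ./samtools view -C -T ../htslib/test/c2.fa -o /dev/null poc 2> stderr.log
    status=$?

    # Print the assertion line, if any, also with control characters visible.
    grep -F 'Assertion' stderr.log | visible || true
    echo "samtools: $(describe_status "$status")"
fi
```

Run from the samtools directory after decoding `poc`, this should print the header with any embedded control characters shown in caret notation, followed by the assertion message and a one-line summary of how the process ended.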