ksahlin / strobealign

Aligns short reads using dynamic seed size with strobemers
MIT License

CIGAR is not recognized by `samtools` #289

Closed valentynbez closed 1 year ago

valentynbez commented 1 year ago

Hello,

I compiled the development version of strobealign with cmake and ran the following command:

strobealign/build/strobealign -U -R 4 -N 1000 -M 1000 -t 32 genomes_chunked/FENG15-1_1.fa.gz crc_phages.100mer.fa.gz > FENG15-1_1.strobealign.N1000.sam

The output looks as follows:

FENG15-1_SAMEA3136768_METAG-scaffold_777_phage_45_virsorter_4500-4556   272     FENG15-1_SAMEA3136632_MAG_00000057-scaffold_3   1       255     54M3S   *       0       0       AGCCAGTAATGGGTTAAGTGATAACAGGTGTCTGGAAATATAGGGGCAAATCCAGCA       *       NM:i:0  AS:i:118

But the CIGAR is not recognized by samtools:

[W::sam_parse1] mapped query must have a CIGAR; treated as unmapped

marcelm commented 1 year ago

This has something to do with -N (secondary alignments), which is not very well tested at the moment. It seems the CIGAR is set to * for all alignments when -N is used.
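
To check how many records in an output file are affected, something like the following could be used. This is only a minimal sketch and not part of strobealign or samtools: the function name `mapped_without_cigar` is hypothetical, and it relies only on the SAM column layout (FLAG in column 2, CIGAR in column 6) and on flag bit 0x4 marking an unmapped read. It reports records that claim to be mapped but carry `*` as CIGAR, which is the condition the samtools warning describes.

```python
def mapped_without_cigar(sam_lines):
    """Yield (qname, flag) for records that are flagged as mapped but have CIGAR '*'.

    Hypothetical helper for illustration; expects an iterable of SAM text lines.
    """
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record (11 mandatory columns)
        qname, flag, cigar = fields[0], int(fields[1]), fields[5]
        is_unmapped = bool(flag & 0x4)
        if not is_unmapped and cigar == "*":
            yield qname, flag
```

It can be run over a file with, for example, `with open("FENG15-1_1.strobealign.N1000.sam") as f: print(sum(1 for _ in mapped_without_cigar(f)))`.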

valentynbez commented 1 year ago

Having all alignments reported would be a very useful feature for my project. Neither bwa, bowtie2, nor mmseqs2 does this.

ksahlin commented 1 year ago

Agreed, this is important.

marcelm commented 1 year ago

This regression was probably introduced in #235. I’ll look into this.

marcelm commented 1 year ago

I’ve opened a PR with a fix. (The problem was not introduced in #235, but in a different commit, see the PR.)

ksahlin commented 1 year ago

Thanks again @valentynbez, any feedback from your experiments is appreciated.