samtools / htslib

C library for high-throughput sequencing data formats

htslib setting 'unmapped' flag when no cigar string #430

Open jenniferliddle opened 7 years ago

jenniferliddle commented 7 years ago

When htslib reads a BAM record that has no CIGAR string, it sets the 'unmapped' flag and sends a warning message to stderr. This is contrary to the SAM specification, which says that

"Bit 0x4 is the only reliable place to tell whether the read is unmapped. If 0x4 is set, no assumptions can be made about RNAME,POS,CIGAR,MAPQ"

It's not clear to me if changing this behaviour would cause more problems than it solves.
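To make the difference concrete, here is a minimal sketch contrasting the two approaches. It is not htslib's code: `toy_record`, `is_unmapped` and `coerce_unmapped_if_no_cigar` are hypothetical names invented for illustration, and the warning text is made up. The only thing taken from the specification is that bit 0x4 marks an unmapped read.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define FLAG_UNMAPPED 0x4  /* SAM FLAG bit 0x4: read is unmapped */

/* Hypothetical, simplified record; NOT htslib's bam1_t. */
typedef struct {
    uint16_t flag;     /* SAM FLAG field */
    uint32_t n_cigar;  /* number of CIGAR operations; 0 means no CIGAR */
} toy_record;

/* Spec-style test: consult bit 0x4 and nothing else. */
static int is_unmapped(const toy_record *rec) {
    assert(rec != NULL);
    return (rec->flag & FLAG_UNMAPPED) != 0;
}

/* Sketch of the behaviour described above: a record without a CIGAR
 * has bit 0x4 forced on, with a warning on stderr.
 * Returns 1 if the flag was modified, 0 otherwise. */
static int coerce_unmapped_if_no_cigar(toy_record *rec) {
    assert(rec != NULL);
    if (rec->n_cigar == 0 && !(rec->flag & FLAG_UNMAPPED)) {
        fprintf(stderr, "warning: record has no CIGAR; treating as unmapped\n");
        rec->flag |= FLAG_UNMAPPED;  /* the caller's FLAG is silently changed */
        return 1;
    }
    return 0;
}
```

With a record whose flag is 0x10 and whose `n_cigar` is 0, `is_unmapped` reports the read as mapped until `coerce_unmapped_if_no_cigar` runs; afterwards the flag is 0x14 and the original value cannot be recovered from the record.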

jkbonfield commented 7 years ago

For what it's worth, the modification of FLAG dates back to this commit: https://github.com/samtools/samtools/commit/140d53dfdfe32bfa3ae4b24f1f33d071f366054f and the notion of it being something to warn about dates back to samtools/samtools@a8c861864486f4c30dc664e3378ffd6e021b6364.

Both of these are from 0.1.5, prior to the htslib split. However, just because it's been doing this for years doesn't necessarily make it valid! My own view is that it's an error in the implementation.