PolinaBevad closed 5 years ago
@PolinaBevad Whether or not the NM/MD tags are returned through SAMRecord depends on whether they were serialized out in the CRAM you're reading. That in turn depends on the policy of the CRAM writer used to create the CRAM.
Older versions of the htsjdk CRAM writer didn't serialize these tags on write, but more recent versions do if they're present in the SAMRecord at the time it's written to CRAM. So I would expect that if you wrote a CRAM using a recent htsjdk, from a BAM that contains NM/MD tags, they would round-trip and you would get them back on read. But if they're not present in the CRAM, you'd have to regenerate them.
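For anyone wondering what regenerating NM amounts to, here is a minimal, dependency-free sketch. It is not htsjdk's implementation; inside htsjdk the method to call is the calculateSamNmTag() in SequenceUtil mentioned in this thread. The class and method names (NmTagSketch, computeNm) are made up for illustration. It assumes the SAM definition of NM (mismatches plus inserted and deleted bases), a 0-based reference start, and that any non-identical base counts as a mismatch, with no special handling of ambiguity codes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of htsjdk: recomputes NM from read bases,
// reference bases, the 0-based alignment start, and a CIGAR string.
final class NmTagSketch {
    private static final Pattern CIGAR_OP = Pattern.compile("(\\d+)([MIDNSHP=X])");

    static int computeNm(String readBases, String reference, int refStart, String cigar) {
        int nm = 0;
        int readPos = 0;        // next unconsumed base in the read
        int refPos = refStart;  // next unconsumed base in the reference
        Matcher m = CIGAR_OP.matcher(cigar);
        while (m.find()) {
            int len = Integer.parseInt(m.group(1));
            char op = m.group(2).charAt(0);
            switch (op) {
                case 'M': case '=': case 'X':
                    // Aligned block: count positions where read and reference differ.
                    for (int i = 0; i < len; i++) {
                        char r = Character.toUpperCase(readBases.charAt(readPos + i));
                        char g = Character.toUpperCase(reference.charAt(refPos + i));
                        if (r != g) nm++;
                    }
                    readPos += len;
                    refPos += len;
                    break;
                case 'I': nm += len; readPos += len; break; // inserted bases count toward NM
                case 'D': nm += len; refPos += len; break;  // deleted bases count toward NM
                case 'N': refPos += len; break;             // skipped region, not an edit
                case 'S': readPos += len; break;            // soft clip, bases ignored
                default: break;                             // H and P consume nothing
            }
        }
        return nm;
    }

    public static void main(String[] args) {
        String reference = "ACGTACGTAC";
        // Read aligned at reference offset 2 with a single mismatch.
        System.out.println(computeNm("GTTCGT", reference, 2, "6M"));
    }
}
```

The point is only that NM can be rebuilt from the read, its CIGAR, and the reference, which is why a CRAM writer may choose not to store it.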
Thank you, everything is clear now.
Hi all! I know that a CRAM file doesn't store NM and MD tags because they can be calculated from the reference and the read sequence. But what about the SAMRecord class? I create a SamReader through SamReaderFactory and use it to get a SAMRecord, but when I try to get the NM tag from the SAMRecord for a read from a CRAM file, it returns null. Also, SAMRecord.getSAMString() returns the record without NM and MD tags.
Is there any other way to get the NM tag besides manually invoking calculateSamNmTag() from SequenceUtil? I thought SAMRecord would behave identically to samtools (samtools shows NM and MD tags for CRAM). I also read a discussion about BAM-CRAM-BAM conversion, and it seems that it must be lossless (https://github.com/samtools/htsjdk/issues/483).
Thank you!
Environment: htsjdk 1.19.0, openjdk java 1.8.0.66