Open bw2 opened 1 year ago
@bw2 For better or worse, htsjdk tries to maintain round-trip fidelity for CRAMs. I took a look at the first few slices of the CRAM referenced above, and it does not appear to contain NM or MD tags. Can you let me know how you concluded that it does?
Affected tool(s) or class(es)
gatk DownsampleSam
Affected version(s)
GATK v4.3.0.0
Description
The input CRAM file (gs://broad-public-datasets/CHM1_CHM13_WGS2/CHM1_CHM13_WGS2.cram) has NM tags, but the downsampled output file no longer has them. My command line is
Some downstream tools require NM tags, so I have to run
samtools calmd CHM1_CHM13_WGS2.downsampled.bam /hg38.fa
to re-add them.
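For anyone trying to reproduce this, here is a rough sketch of one way to check whether records carry NM or MD tags before and after downsampling. It reads SAM text (for example, the output of `samtools view`) from standard input and counts the records that have each tag. The script name `count_tags.py` and the function `count_tags` are made up for illustration; they are not part of GATK, htsjdk, or samtools.

```python
import sys


def count_tags(sam_lines, tags=("NM", "MD")):
    """Count alignment records, and how many of them carry each optional tag.

    `sam_lines` is any iterable of SAM text lines (headers are skipped).
    This is an illustrative helper, not part of any existing tool.
    """
    counts = {tag: 0 for tag in tags}
    total = 0
    for line in sam_lines:
        # Skip blank lines and header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        total += 1
        fields = line.rstrip("\n").split("\t")
        # Optional fields start at column 12 and look like TAG:TYPE:VALUE.
        present = {field.split(":", 1)[0] for field in fields[11:]}
        for tag in tags:
            if tag in present:
                counts[tag] += 1
    return total, counts


if __name__ == "__main__":
    total, counts = count_tags(sys.stdin)
    print(f"records: {total}")
    for tag, n in counts.items():
        print(f"{tag}: {n}")
```

It could be run along the lines of `samtools view CHM1_CHM13_WGS2.downsampled.bam | head -n 10000 | python count_tags.py`. One caveat, as far as I understand it: samtools can regenerate MD and NM on the fly when decoding CRAM (controlled by the `decode_md` input format option), so tags seen in `samtools view` output for a CRAM may not be physically stored in the file.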