jmarshall closed this 8 years ago
Yes, they're uppercase. An example from cram_dump:
Tag encoding map:
SMc => BYTE_ARRAY_LEN {3, 4, 1, 1, 1, 0, 1, 1, 53}
QTZ => BYTE_ARRAY_STOP {9, 56}
BCZ => BYTE_ARRAY_STOP {9, 50}
XAZ => BYTE_ARRAY_STOP {9, 60}
ahc => BYTE_ARRAY_LEN {3, 4, 1, 1, 1, 0, 1, 1, 64}
XCc => BYTE_ARRAY_LEN {3, 4, 1, 1, 1, 0, 1, 1, 61}
XGc => BYTE_ARRAY_LEN {3, 4, 1, 1, 1, 0, 1, 1, 51}
AMc => BYTE_ARRAY_LEN {3, 4, 1, 1, 1, 0, 1, 1, 52}
XMc => BYTE_ARRAY_LEN {3, 4, 1, 1, 1, 0, 1, 1, 54}
a3c => BYTE_ARRAY_LEN {3, 4, 1, 1, 1, 0, 1, 1, 49}
XOc => BYTE_ARRAY_LEN {3, 4, 1, 1, 1, 0, 1, 1, 55}
X0s => BYTE_ARRAY_LEN {3, 4, 1, 2, 1, 0, 1, 1, 58}
X0c => BYTE_ARRAY_LEN {3, 4, 1, 1, 1, 0, 1, 1, 47}
X0C => BYTE_ARRAY_LEN {3, 4, 1, 1, 1, 0, 1, 1, 63}
X1c => BYTE_ARRAY_LEN {3, 4, 1, 1, 1, 0, 1, 1, 59}
X1C => BYTE_ARRAY_LEN {3, 4, 1, 1, 1, 0, 1, 1, 62}
X1s => BYTE_ARRAY_LEN {3, 4, 1, 2, 1, 0, 1, 1, 48}
XTA => BYTE_ARRAY_LEN {3, 4, 1, 1, 1, 0, 1, 1, 57}
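For anyone wanting to check a dump like this mechanically, here is a minimal Python sketch. It assumes each key is the two-character tag name followed by a one-character type code, which is how the lines above read, and compares the type code with the BAM set [AfZHcCsSiIB] quoted in this thread. The names parse_tag_encoding and BAM_TAG_TYPES are made up for illustration and are not part of cram_dump or any library.

```python
import re

# BAM tag type codes, as quoted from the spec text in this thread.
BAM_TAG_TYPES = set("AfZHcCsSiIB")

# Matches lines such as "QTZ => BYTE_ARRAY_STOP {9, 56}".
# Assumption: key = 2-character tag name + 1-character type code.
LINE_RE = re.compile(r"^(\w{2})(\w) => (\w+) \{([\d, ]*)\}$")


def parse_tag_encoding(line):
    """Split one dump line into (tag, type_code, codec, params)."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised line: {line!r}")
    tag, type_code, codec, params = m.groups()
    numbers = [int(p) for p in params.split(",") if p.strip()]
    return tag, type_code, codec, numbers


sample = [
    "QTZ => BYTE_ARRAY_STOP {9, 56}",
    "X0s => BYTE_ARRAY_LEN {3, 4, 1, 2, 1, 0, 1, 1, 58}",
    "XTA => BYTE_ARRAY_LEN {3, 4, 1, 1, 1, 0, 1, 1, 57}",
]

for line in sample:
    tag, type_code, codec, params = parse_tag_encoding(line)
    status = "ok" if type_code in BAM_TAG_TYPES else "not a BAM type code"
    print(f"{tag}:{type_code} {codec} -> {status}")
```

A lowercase z fails the membership test, so a file that really did store string tags that way would show up immediately.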
Thanks for spotting the typo. I'll fix it.
PS. Yes, there is some oddity there in having both X1c and X1C when they could probably be forcibly merged. I think Cramtools does a better job of rationalising these where appropriate (perhaps due to some common sanitizer in htsjdk). It's a side issue though.
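To make the X0/X1 oddity concrete, the following Python sketch groups the keys from the dump above by their two-character tag name and reports the tags stored under more than one type code. It only detects the duplication and is not a description of how Cramtools or htsjdk handle it. The helper names types_by_tag and multi_typed are made up for illustration.

```python
from collections import defaultdict

# Keys copied from the cram_dump output earlier in this thread.
keys = [
    "SMc", "QTZ", "BCZ", "XAZ", "ahc", "XCc", "XGc", "AMc", "XMc",
    "a3c", "XOc", "X0s", "X0c", "X0C", "X1c", "X1C", "X1s", "XTA",
]


def types_by_tag(keys):
    """Map each two-character tag name to the type codes seen for it."""
    grouped = defaultdict(list)
    for key in keys:
        grouped[key[:2]].append(key[2])
    return dict(grouped)


def multi_typed(keys):
    """Return only the tags that appear with more than one type code."""
    return {tag: types for tag, types in types_by_tag(keys).items()
            if len(types) > 1}


print(multi_typed(keys))
# {'X0': ['s', 'c', 'C'], 'X1': ['c', 'C', 's']}
```

Each of those tags ends up with three separate entries in the tag encoding map, which is the redundancy a merge would remove.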
In the v3 spec (and similarly in the v2.1 spec), §8.5 (Slice header block) says that tag types are the same as BAM ([AfZHcCsSiIB]), but the example in §8.4 (Encoding tags) has […]. However, the string tag type is uppercase Z, and OQ is OQ:Z in BAM. Hopefully this is just a typo in the spec text, and tags appear as uppercase Z in actual CRAM files…?