pezmaster31 / bamtools

C++ API & command-line toolkit for working with BAM data
MIT License

Possible error #145

Open kemin711 opened 7 years ago

kemin711 commented 7 years ago

With int tagval = -1; a call to GetTag("NM", tagval) fails at this line in BamAlignment.h: if ( !TagTypeHelper<T>::CanConvertFrom(type) )

When I looked at the TagData it was like: NMC^AMDZ98T2

const char type = *(pTagData - 1);

The type variable has a value of 'C' instead of 'i'. I am using the latest version of bwa. Looking into the code of bwa, it is using the type i. The MD string is properly typed as Z.
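For reference, the dump above can be read as a sequence of BAM auxiliary records: a two-character tag, a one-byte type code, then the value. The following is a minimal standalone sketch of that layout, not bamtools code. `FixedTagSize`, `TagEntry`, `ListTags`, and `ExampleListTags` are hypothetical names, and the input assumes that `^A` in the dump is the single byte 0x01.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of bamtools): byte size of a fixed-width
// BAM tag value, or 0 for types that are not fixed-width or are unknown.
std::size_t FixedTagSize(char type) {
    switch (type) {
        case 'A': case 'c': case 'C': return 1;
        case 's': case 'S':           return 2;
        case 'i': case 'I': case 'f': return 4;
        default:                      return 0;
    }
}

struct TagEntry {
    std::string tag;  // two-character tag name, e.g. "NM"
    char type;        // one-byte type code, e.g. 'C'
};

// Walks raw tag data and records each tag name with its type code.
// Handles fixed-width values and NUL-terminated strings (Z, H) only;
// it stops at array tags (B) and at malformed input.
std::vector<TagEntry> ListTags(const std::string& data) {
    std::vector<TagEntry> out;
    std::size_t pos = 0;
    while (pos + 3 <= data.size()) {
        TagEntry e{data.substr(pos, 2), data[pos + 2]};
        pos += 3;  // skip tag name and type code
        if (e.type == 'Z' || e.type == 'H') {
            const std::size_t end = data.find('\0', pos);
            if (end == std::string::npos) break;  // missing terminator
            pos = end + 1;
        } else {
            const std::size_t n = FixedTagSize(e.type);
            if (n == 0 || pos + n > data.size()) break;
            pos += n;
        }
        out.push_back(e);
    }
    return out;
}

// Usage sketch with the bytes from this issue.
void ExampleListTags() {
    const std::string data("NMC\x01MDZ98T2\0", 12);
    const std::vector<TagEntry> tags = ListTags(data);
    assert(tags.size() == 2);
    assert(tags[0].tag == "NM" && tags[0].type == 'C');
    assert(tags[1].tag == "MD" && tags[1].type == 'Z');
}
```

Read this way, the data is well-formed: NM is stored with type code C and a one-byte value of 1, followed by MD as the string 98T2.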

I am not sure whether this is caused by a modification on my part. Can anyone give me a pointer on how to fix this bug? I will do the fixing.

kemin711 commented 7 years ago

So far, my temporary fix in the GetTag() function looks like this:

if (type == 'C' && (tag == "NM" || tag == "AS" || tag == "XS")) { // patch a bug
    //type = Constants::BAM_TAG_TYPE_INT32;
    //cout << __FILE__ << ":" << __LINE__ << ": pTagData: "
    //     << pTagData << endl;
    //cerr << "NM int val at pTagData: " << (int)(*pTagData) << " at 2 bytes later: "
    //     << (int)(*(pTagData+2)) << endl;
    // this is a short term patch for the bug;
    // read the byte as unsigned so values above 127 do not turn negative
    destination = static_cast<unsigned char>(*pTagData);
    return true;
}
else if ( !TagTypeHelper<T>::CanConvertFrom(type) ) {
    // TODO: set error string ?
    cerr << __FILE__ << ":" << __LINE__ << ":"
         << "Failed to convert to " << typeid(destination).name()
         << " from " << type << endl;
    return false;
}

This is a patch, not a real fix.

angadps commented 7 years ago

I used the GetTagType function (http://pezmaster31.github.io/bamtools/struct_bam_tools_1_1_bam_alignment.html#ac9cac21df9ef4a0a3414387b97cf14da) to identify the 'c' type (http://pezmaster31.github.io/bamtools/_bam_constants_8h.html) and ended up using int8_t to read the NM tag.

kemin711 commented 7 years ago

Does the BAM specification define a C type?

douglasgscofield commented 6 years ago

Yes, see Section 1.5 of http://samtools.github.io/hts-specs/SAMv1.pdf. Integer types on tags (e.g., XS:i:20 in SAM format) may be encoded into the smallest integer that can hold the integer value. In the case of XS:i:20, this was encoded as XSC\024. The C indicates a uint8_t. Use GetTagType() to figure out the tag type (for C, the value Constants::BAM_TAG_TYPE_UINT8 will be returned) and then GetTag() to store it in an appropriate variable.
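To illustrate why the stored width varies, here is a minimal standalone sketch that widens any BAM integer tag value into a 64-bit integer based on its type code. It does not use bamtools: `LoadAs`, `ReadIntegerTag`, and `ExampleXs` are hypothetical names, and the sketch assumes a little-endian host, since BAM stores multi-byte values in little-endian order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of bamtools): copies sizeof(T) bytes from p
// and widens the result. memcpy avoids unaligned reads; a little-endian host
// is assumed.
template <typename T>
std::int64_t LoadAs(const char* p) {
    T v;
    std::memcpy(&v, p, sizeof v);
    return static_cast<std::int64_t>(v);
}

// Reads an integer tag value of any BAM integer type code into `out`.
// Returns false for non-integer type codes.
bool ReadIntegerTag(char type, const char* p, std::int64_t& out) {
    switch (type) {
        case 'c': out = LoadAs<std::int8_t>(p);   return true;
        case 'C': out = LoadAs<std::uint8_t>(p);  return true;
        case 's': out = LoadAs<std::int16_t>(p);  return true;
        case 'S': out = LoadAs<std::uint16_t>(p); return true;
        case 'i': out = LoadAs<std::int32_t>(p);  return true;
        case 'I': out = LoadAs<std::uint32_t>(p); return true;
        default:  return false;
    }
}

// Usage sketch with the XS:i:20 example: tag "XS", type 'C', value byte \024.
void ExampleXs() {
    const char xs[] = "XSC\024";
    std::int64_t value = -1;
    const bool ok = ReadIntegerTag(xs[2], xs + 3, value);
    assert(ok);
    assert(value == 20);
    (void)ok;  // silence unused-variable warnings when asserts are disabled
}
```

The same byte gives different results depending on the type code (0xC8 is 200 as C but -56 as c), which is why the type should be checked before choosing the destination variable.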