IUPAC ambiguity codes #1

nh13 / TMAP

Torrent Mapping Alignment Program

GNU General Public License v2.0

There's a paragraph describing how to handle ambiguous DNA bases, but I don't understand why R is converted to C. Can you explain this?

Ambiguous IUPAC codes in the reference/target FASTA will be converted to the lexicographically smallest DNA base that is not compatible to the IUPAC code to ensure minimum reference bias. For example, an IUPAC base R, which represents an A or a G, will be converted to a C. All Ns in the reference will be converted to As. Furthermore, any non-IUPAC character will be treated as an N. The ambiguity codes will only be re-considered when calculating the NM and MD SAM record optional tags.
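
To make the quoted rule concrete, here is a minimal Python sketch of how it could be read. It is not TMAP's actual code (TMAP is written in C): the `IUPAC` table and the functions `convert_base` and `convert_sequence` are hypothetical names, and two details are assumptions the paragraph does not state, namely that unambiguous bases are left unchanged and that input is upper-cased first. For R, the compatible bases are A and G, so the incompatible ones are C and T, and C is the lexicographically smallest. Presumably the point is that C matches neither allele, so a read carrying A and a read carrying G are both scored as a mismatch and neither is favored.

```python
# Bases compatible with each IUPAC nucleotide code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def convert_base(base):
    """Map one reference character to a plain DNA base, following the quoted rule."""
    # Assumption: case is ignored. Any non-IUPAC character is treated as N.
    compatible = IUPAC.get(base.upper(), IUPAC["N"])
    # Assumption: unambiguous bases (A, C, G, T) are kept as they are.
    if len(compatible) == 1:
        return compatible
    # "ACGT" is already in lexicographic order, so the first hit is the smallest.
    incompatible = [b for b in "ACGT" if b not in compatible]
    # N is compatible with every base, so nothing is left; the rule says N becomes A.
    return incompatible[0] if incompatible else "A"


def convert_sequence(seq):
    """Apply convert_base to every character of a reference sequence."""
    return "".join(convert_base(b) for b in seq)


print(convert_base("R"))          # C  (A/G are compatible; C is the smallest incompatible base)
print(convert_base("N"))          # A
print(convert_sequence("ACRNX"))  # ACCAA  (X is not an IUPAC code, so it is treated as N)
```

Under this reading, Y (C or T) becomes A and M (A or C) becomes G, since each takes the first of A, C, G, T that the code does not cover.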
