skoren / triobinningScripts

Scripts to reproduce TrioBinning manuscript

casting obj to str to prevent obj vs str comparison #1

Closed Arkarachai closed 6 years ago

Arkarachai commented 6 years ago

We recommend casting rec.seq to a string. Although rec.seq can be sliced and otherwise behaves like a string, comparing this object to the corresponding string actually returns False. For reverse-complement kmers this is fine, because reverse_complement returns a string. However, any kmer that get_cannonical returns as the original (forward) kmer will not match an entry in hapCounts. Consequently, the number of kmers that can contribute to read classification is roughly halved.
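To illustrate the failure mode, here is a minimal sketch that does not use Biopython at all. `SeqLike` is a hypothetical stand-in for a sequence object that slices like a string but keeps Python's default identity-based equality and hashing; the bodies of `reverse_complement` and `get_cannonical` and the contents of `hapCounts` are assumptions made up for the example, not the repository's actual code.

```python
class SeqLike:
    """Hypothetical stand-in (NOT Biopython's Seq): slices like a string,
    but equality and hashing fall back to object identity."""

    def __init__(self, data):
        self._data = data

    def __getitem__(self, idx):
        # Slicing returns another SeqLike, not a plain str.
        return SeqLike(self._data[idx])

    def __len__(self):
        return len(self._data)

    def __str__(self):
        return self._data


_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement as a plain str."""
    return str(kmer).translate(_COMPLEMENT)[::-1]


def get_cannonical(kmer):
    """Assumed behaviour: return whichever of the kmer and its reverse
    complement sorts first. The forward kmer is returned unchanged,
    so it keeps whatever type it came in with."""
    rc = reverse_complement(kmer)
    return kmer if str(kmer) <= rc else rc


def count_hits(seq, k, hap_counts):
    """Count kmers of seq whose canonical form is a key of hap_counts."""
    hits = 0
    for i in range(len(seq) - k + 1):
        if get_cannonical(seq[i:i + k]) in hap_counts:
            hits += 1
    return hits


# Made-up haplotype-specific kmer counts, keyed by plain strings.
hapCounts = {"AAC": 3, "ACG": 5}

raw = SeqLike("AACGTT")
hits_without_cast = count_hits(raw, 3, hapCounts)    # forward kmers never match
hits_with_cast = count_hits(str(raw), 3, hapCounts)  # every kmer can match

print(hits_without_cast, hits_with_cast)
```

In this toy read, the two kmers whose forward form is canonical are lost without the cast, while the two resolved through the reverse complement still match, which mirrors the "roughly half" effect described above.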

skoren commented 6 years ago

I think this depends on the Biopython version; newer versions use string comparisons for Seq objects. I'm sure our runs used both forward and reverse kmers for classification, since we tested it (and I confirmed identical outputs before and after the above change). However, there is no reason not to always request a string type explicitly.