Closed zztin closed 2 years ago
Thanks for pointing out this. I just figured out that it is a bug in the AveMatch calculation. It is fixed in the new version.
For your case, there are only 2 unit sequences to generate the consensus sequence. As no real "consensus" can be obtained from 2 sequences. Or we can say, any unit sequence can be considered as "consensus". Thus we choose to simply consider the first unit as the consensus sequence.
We calculated the AveMatch based on the alignment result between unit sequence and the consensus using the ksw2 library in the global alignment mode. The # of matched bases is provided by the ksw2 alignment result.
What do you mean by "set AveMatch score as a parameter"?
Yan
Issue description
The AveMatch score is 0.0 in some consensus segments, which should not be correct. Used version: v.1.4.1
Steps to reproduce the issue
./bin/TideHunter
produces a consensus here which has 0.0 for the AveMatch field readname:>score_0_rep0_1281_387_1281_374_2.4_0.0_0_417,791,1169
./bin/TideHunter -u
produces output hereWhat's the expected result?
AveMatch should be >0 intuitively. The definition AveMatch is "# matched bases / unit length" Length of sub0 = 374, sub1 = 379, consensus = 374
I used python module Bio.pairwise2 to quickly calculate the global alignment score.
output is in the format of (seqA, seqB, score, begin, end) while the score = 292.0, begin = 0, end = 461
Possible expected results are:
292/461 = 0.63
292/374 = 0.78
Something else, but probably not 0.0
What's the actual result?
AveMatch = 0.0 full result here which has 0.0 for the AveMatch field readname:
>score_0_rep0_1281_387_1281_374_2.4_0.0_0_417,791,1169
Additional findings:
I found that the consensus sequence (in step2 above) is the same as the first subunit.
>score_0_rep0_sub0
Comments
If you could provide a more detailed algorithm of how you calculate the AveMatch would be nice. What is the current lower limit of the reported consensus for AveMatch? Also, if it is possible to set AveMatch score as a parameter in the command line? Thank you very much!