Closed amnona closed 6 years ago
Fixed comments also, changed the default indel upper bound to 0.02 (instead of 0.01). This should make results more compatible with the previous version (since it used the mismatch bound which is 0.02 for 2 mismatches, which is what we get if the indel is not recognized).
Since all deblur testing and usage was done with the bug, updating the indel probability to 0.02 makes the results more compatible. Also, 0.02 is a more conservative upper bound.
On Wed, Aug 29, 2018 at 7:09 PM Daniel McDonald notifications@github.com wrote:
@wasade commented on this pull request.
In ChangeLog.md https://github.com/biocore/deblur/pull/179#discussion_r213739276:
@@ -3,6 +3,8 @@
Version 1.0.4-dev
Bug fixes
+* Fixed problem causing deblur to ignore the indel error uper bound and use the mismatch error upper bound instead. This caused deblur to use the mismatch depenent default error rate instead of the default indel error rate (which is constant for up to 3 indels). Also increased the default indel error rate to 0.02. See issue #178 for more details.
Is there empirical data to support the change?
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/biocore/deblur/pull/179#pullrequestreview-150637469, or mute the thread https://github.com/notifications/unsubscribe-auth/AFkA8peVQfONnVrINTjgE_Svvueb-YHrks5uVryfgaJpZM4V1d97 .
after discussing with @wasade we are gonna merge and then fix in another pr ...
Fix the sub_seq_j = seq_i.np_sequence[:length] typo bug to seq_j.
Also, fix the excessive '-' at end of alignment being counted as mismatch when handling the indel.
Also fix the unit test, and validate the indel removal using the indel error rate and not the general mismatch error rate (by adding an indel containing sequence that should not be removed under the indel error rate, but will be removed with the mismatch error rate).