malariagen / anospp-analysis

Python package for ANOSPP data analysis
MIT License

Fix NN bugs #18

Open mariloubodde opened 1 year ago

mariloubodde commented 1 year ago

Fix bugs identified in the first ANOSPP production run.

mariloubodde commented 1 year ago

The bug is in identify_error_seq()

It occurs when, at a given amplicon, a sample has only haplotypes that are unique to the run and that meet the distance threshold for being identified as error sequences. The first sequence is then identified as an error sequence, and in the previous code the second sequence has nothing left to be compared to, which causes the error.

I'm not sure what the right thing to do is. For now I have settled on removing both sequences if they are unique to the run and sufficiently different from each other.
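
For illustration only, here is a toy sketch of the failure mode and the workaround described above. It is not the code from identify_error_seq(): the function names, the Hamming distance, and the "distance >= threshold" rule are all assumptions made for this example, since the real implementation is not shown in this thread.

```python
def hamming(a: str, b: str) -> int:
    """Toy distance (assumption): mismatching positions in equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def flag_errors_sequential(haps, unique_to_run, threshold):
    """Hypothetical 'previous' behaviour: flagged haplotypes are dropped from
    the comparison pool, so a later haplotype may have nothing to compare to."""
    remaining = list(haps)
    flagged = []
    for hap in haps:
        if hap not in unique_to_run:
            continue
        others = [h for h in remaining if h != hap]
        # min() raises ValueError when `others` is empty
        nearest = min(hamming(hap, h) for h in others)
        if nearest >= threshold:
            flagged.append(hap)
            remaining.remove(hap)
    return flagged


def flag_errors_all_pairs(haps, unique_to_run, threshold):
    """Hypothetical workaround: compare against every other haplotype in the
    sample, flagged or not, so mutually distant unique haplotypes are both
    removed."""
    flagged = []
    for hap in haps:
        if hap not in unique_to_run:
            continue
        others = [h for h in haps if h != hap]
        if not others:
            continue  # assumption: a lone haplotype is left untouched
        if min(hamming(hap, h) for h in others) >= threshold:
            flagged.append(hap)
    return flagged


if __name__ == "__main__":
    haps = ["AAAA", "TTTT"]  # made-up haplotypes, both unique to the run
    try:
        flag_errors_sequential(haps, set(haps), threshold=3)
    except ValueError as err:
        print("sequential version fails:", err)
    print("all-pairs version flags:", flag_errors_all_pairs(haps, set(haps), threshold=3))
```

With two made-up haplotypes that are both unique to the run, the sequential version flags the first and then fails on the second because the comparison pool is empty, while the all-pairs version flags both.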

amakunin commented 7 months ago

@mariloubodde was this fixed in #20?

mariloubodde commented 7 months ago

@amakunin yes!