Update preprocess.mutation_extractor.py:
count_indels:
Change: The indel rate is now computed with matches alone as the denominator, instead of matches + indels.
Reason: To focus specifically on how often each particular mutation occurs.
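The denominator change can be illustrated with a minimal sketch. The function names and the handling of a zero denominator are assumptions made for this example and are not taken from the project code.

```python
def indel_rate_previous(matches: int, indels: int) -> float:
    """Previous behaviour: indels divided by matches + indels."""
    total = matches + indels
    # Returning 0.0 for an empty position is an assumption of this sketch.
    return indels / total if total else 0.0


def indel_rate_current(matches: int, indels: int) -> float:
    """New behaviour: indels divided by matches only."""
    # Returning 0.0 when there are no matches is an assumption of this sketch.
    return indels / matches if matches else 0.0


# 10 indels observed next to 90 matches at one position
print(indel_rate_previous(90, 10))  # 0.1
print(indel_rate_current(90, 10))   # 10 / 90, roughly 0.111
```

With matches as the only denominator, the rate of one mutation type no longer shrinks when other indel types are frequent at the same position.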
find_dissimilar_indices:
Change: Modified mutation detection. If the p-value stays below 0.05 even after the target base is removed, the region is not reported as a mutation, on the assumption that the significance comes from other positions.
Implication: Increases mutation detection accuracy by excluding irrelevant base sequences.
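The rule described above can be sketched as a leave-one-out check. Everything here is hypothetical: the function name, the window representation, and the `pvalue` callable (which stands in for whatever statistical test the project applies) are assumptions for illustration only.

```python
from typing import Callable, Sequence

# Any function that compares two equally long windows and returns a p-value.
PValueFn = Callable[[Sequence[float], Sequence[float]], float]


def is_mutation_at(
    control: Sequence[float],
    sample: Sequence[float],
    target: int,
    pvalue: PValueFn,
    alpha: float = 0.05,
) -> bool:
    """Return True only when the difference is attributable to `target`."""
    if pvalue(control, sample) >= alpha:
        return False  # the window is not significantly different at all

    # Drop the target position and test the remaining positions again.
    control_rest = [v for i, v in enumerate(control) if i != target]
    sample_rest = [v for i, v in enumerate(sample) if i != target]

    # Still significant without the target: the signal comes from elsewhere,
    # so the target is not reported as a mutation.
    return pvalue(control_rest, sample_rest) >= alpha
```

Passing the test as a callable keeps the sketch independent of any particular statistics library.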
merge_index_of_consecutive_indel:
Change: Merged merge_surrounding_index and merge_index_of_consecutive_insertions into a single function.
Benefit: Streamlines the process and enhances efficiency in handling consecutive indels.
Update consensus.consensus.py:
Fixed a floating-point precision issue that occurred when N accounts for 100%: the computed value could be 100.000002, so the comparison with 100 failed (100 != 100.000002). Changed the condition to "having only one key and that key being N".
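A minimal sketch of why the key-based condition is more robust than comparing a float with 100. The dictionary layout (base mapped to percentage) and both function names are assumptions for this example.

```python
from typing import Dict


def is_all_n_fragile(percent_by_base: Dict[str, float]) -> bool:
    """Exact float comparison: breaks when rounding yields 100.000002."""
    return percent_by_base.get("N") == 100


def is_all_n_robust(percent_by_base: Dict[str, float]) -> bool:
    """Key-based check: exactly one key, and that key is N."""
    return len(percent_by_base) == 1 and "N" in percent_by_base


observed = {"N": 100.000002}  # value affected by accumulated rounding error
print(is_all_n_fragile(observed))  # False
print(is_all_n_robust(observed))   # True
```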
Update mutation_extractor.py:
Switched from the t-test to the Wilcoxon signed-rank test, because the t-test produced false negatives on data with peak-like shapes.
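For readers unfamiliar with the test, the following is a small standard-library sketch of an exact two-sided Wilcoxon signed-rank test for paired samples. It is not the project's implementation, which may well call a library routine such as `scipy.stats.wilcoxon`; the function names are hypothetical, and the full enumeration is only practical for small samples.

```python
from itertools import product
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_signed_rank_exact(x: Sequence[float], y: Sequence[float]) -> float:
    """Exact two-sided p-value for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences are dropped
    if not diffs:
        return 1.0
    ranks = average_ranks([abs(d) for d in diffs])
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = min(w_plus, total - w_plus)

    # Under the null hypothesis every sign pattern is equally likely,
    # so enumerate all 2^n patterns and count those at least as extreme.
    hits = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= observed:
            hits += 1
    return hits / 2 ** len(ranks)


sample = [5, 7, 9, 12, 15, 19, 24, 30]
control = [4, 5, 6, 8, 10, 13, 17, 22]
print(wilcoxon_signed_rank_exact(sample, control))  # 0.0078125
```

Because the test uses only the ranks of the paired differences, it does not assume normally distributed data, which is the usual motivation for preferring it to a t-test on skewed or peaked signals.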
Others
Modified batch processing to run on a single CPU thread per process.
clust_formatter.cache_mutation_loci: mutation_extractor.merge_loci now uses union instead of intersection.
insertions_to_fasta.py: insertion_to_fasta.save_fasta → utils.io.save_fasta.
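The difference between merging loci by union and by intersection can be sketched as follows. The function names and the representation of loci as sets of integer positions are assumptions for this example, not the actual signature of merge_loci.

```python
from typing import Iterable, Set


def merge_by_union(loci_sets: Iterable[Set[int]]) -> Set[int]:
    """Keep every locus that appears in at least one input set."""
    merged: Set[int] = set()
    for loci in loci_sets:
        merged |= loci
    return merged


def merge_by_intersection(loci_sets: Iterable[Set[int]]) -> Set[int]:
    """Keep only loci that appear in every input set."""
    sets = list(loci_sets)
    if not sets:
        return set()
    return set(sets[0]).intersection(*sets[1:])


loci_a = {10, 25, 40}
loci_b = {25, 40, 77}
print(sorted(merge_by_union([loci_a, loci_b])))         # [10, 25, 40, 77]
print(sorted(merge_by_intersection([loci_a, loci_b])))  # [25, 40]
```

Union is the more permissive choice: a locus found in only one input is kept rather than dropped.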