Added a quick installation guide to TROUBLESHOOTING.md.
Update
Preprocess
Updated input_validator.py: The UCSC BLAT server sometimes returns HTTP status code 200 even when an error occurs. In such cases, the page title contains "Very Early Error", so validation now returns False when that title is present.
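The check described in this entry can be sketched roughly as follows. This is a minimal illustration, not the code in input_validator.py: the function name `is_blat_response_valid` and the regular-expression title parsing are assumptions made for the example.

```python
import re


def is_blat_response_valid(status_code: int, html: str) -> bool:
    """Return False when the response is an error, even if the status is 200.

    Hypothetical helper: the name and the title parsing are illustrative only.
    """
    if status_code != 200:
        return False
    # Extract the text of the <title> element, if there is one.
    match = re.search(r"<title>(.*?)</title>", html, flags=re.IGNORECASE | re.DOTALL)
    title = match.group(1) if match else ""
    # The server reports failures with "Very Early Error" in the title.
    return "Very Early Error" not in title
```

The status code alone is not trusted here; the title is inspected on every 200 response.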
Simplified error detection in homopolymer_handler.py by using cosine similarity.
Updated mutation_extractor.py to use cosine similarity to filter dissimilar loci.
Updated mutation_extractor.identify_dissimilar_loci to return True unconditionally when the sample shows more than 5% variation compared to the control.
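One way to picture the two rules above (the 5% shortcut and the cosine-similarity filter) is the sketch below. It rests on several assumptions: the function names, the use of per-locus mutation percentages, the small window of neighboring values, and the 0.95 similarity threshold are all hypothetical and are not taken from mutation_extractor.py.

```python
import math


def cosine_similarity(x, y):
    """Cosine similarity of two equal-length numeric sequences."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 and norm_y == 0:
        return 1.0  # two all-zero profiles are treated as identical
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)


def is_dissimilar_locus(sample_pct, control_pct, sample_window, control_window,
                        threshold=0.95):
    """Hypothetical per-locus decision; the threshold is an assumed value."""
    # Rule 1: a difference above 5 percentage points is always dissimilar.
    if abs(sample_pct - control_pct) > 5.0:
        return True
    # Rule 2: otherwise compare the local mutation profiles.
    return cosine_similarity(sample_window, control_window) < threshold
```

For instance, `is_dissimilar_locus(12.0, 1.0, [1, 1, 12], [1, 1, 1])` is decided by the first rule alone, without computing the similarity.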
Added preprocess.midsv_caller.convert_consecutive_indels_to_match: Alignment errors can cause a true match to be reported as a deletion followed by an insertion of the same base. This function reverts such cases to a match. For example, "=C,=T" misreported as "-C,+C|=T" is reverted to "=C,=T".
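The correction in the example above can be sketched as follows. This is a simplified illustration that only handles a single deleted base immediately followed by an insertion of the same base; the function name `revert_deletion_insertion_pairs` is hypothetical, and the real convert_consecutive_indels_to_match may cover more cases.

```python
def revert_deletion_insertion_pairs(cssplits: list[str]) -> list[str]:
    """Rewrite "-X" followed by "+X|..." as a match "=X" plus the remainder."""
    result = []
    i = 0
    while i < len(cssplits):
        current = cssplits[i]
        following = cssplits[i + 1] if i + 1 < len(cssplits) else ""
        base = current[1:]
        if current.startswith("-") and base and following.startswith(f"+{base}|"):
            result.append(f"={base}")
            # Drop the leading "+X|" and keep whatever the insertion was attached to.
            result.append(following[len(base) + 2:])
            i += 2
        else:
            result.append(current)
            i += 1
    return result


tags = "=A,-C,+C|=T".split(",")
print(",".join(revert_deletion_insertion_pairs(tags)))
```

A deletion followed by an insertion of a different base, such as "-C,+G|=T", is left unchanged because it may be a real substitution-like event.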
Classification
Added allele_merger.merge_minor_alleles, which reclassifies alleles with fewer than 10 reads to suppress excessive subdivision of alleles.
Clustering
Added the function merge_minor_cluster, which reverts clusters with fewer than 10 reads to their previous labels to suppress excessive subdivision of alleles.
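The label-reverting step can be sketched like this. It assumes that the previous and current labels are available as two parallel lists, one entry per read; the function name `revert_minor_clusters` and its signature are illustrative and may differ from merge_minor_cluster.

```python
from collections import Counter


def revert_minor_clusters(labels_previous, labels_current, min_reads=10):
    """Restore the previous label for reads in clusters smaller than min_reads."""
    counts = Counter(labels_current)
    return [
        previous if counts[current] < min_reads else current
        for previous, current in zip(labels_previous, labels_current)
    ]


previous = [1] * 25
current = [2] * 12 + [3] * 10 + [4] * 3  # cluster 4 holds only 3 reads
merged = revert_minor_clusters(previous, current)
print(Counter(merged))
```

Clusters with exactly 10 reads are kept, since the entry above specifies "fewer than 10".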
Updated generate_mutation_kmers to mask indices not registered in mutation_loci by replacing them with "@". For example, "=G,=C,-C" and "=G,=G,=C" both become "@,@,@", which makes them identical and ensures they do not affect clustering.
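The masking described above might look like the following sketch. It assumes 3-mers centered on each index, a plain set of indices standing in for mutation_loci, and "N" as padding at the read ends; the function name `generate_masked_kmers` and all of these details are hypothetical simplifications.

```python
def generate_masked_kmers(cssplits, mutation_loci, k=3):
    """Build one k-mer per index, masking indices absent from mutation_loci."""
    half = k // 2
    masked = ",".join(["@"] * k)
    kmers = []
    for i in range(len(cssplits)):
        if i not in mutation_loci:
            # Not a registered mutation locus: emit a constant placeholder.
            kmers.append(masked)
            continue
        window = [
            cssplits[j] if 0 <= j < len(cssplits) else "N"  # assumed padding
            for j in range(i - half, i + half + 1)
        ]
        kmers.append(",".join(window))
    return kmers


read_a = ["=G", "=C", "-C"]
read_b = ["=G", "=G", "=C"]
# With no registered loci, both reads yield identical k-mers.
print(generate_masked_kmers(read_a, set()) == generate_masked_kmers(read_b, set()))
```

Because every unregistered index maps to the same placeholder, differences at those positions cannot separate reads during clustering.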
Consensus
Implemented LocalOutlierFactor to filter abnormal control reads.
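As a rough illustration of how scikit-learn's LocalOutlierFactor can drop abnormal reads, consider the sketch below. The feature vectors, the read identifiers, and the function name `filter_abnormal_reads` are made up for the example; how the control reads are actually encoded before filtering is not stated here.

```python
from sklearn.neighbors import LocalOutlierFactor


def filter_abnormal_reads(read_ids, features, n_neighbors=20):
    """Keep only reads that LocalOutlierFactor labels as inliers (+1)."""
    lof = LocalOutlierFactor(n_neighbors=n_neighbors)
    predictions = lof.fit_predict(features)  # +1 for inliers, -1 for outliers
    return [rid for rid, label in zip(read_ids, predictions) if label == 1]


# Toy data: 40 reads on a tight grid plus one read far away from the rest.
features = [[(i % 5) * 0.1, (i // 5) * 0.1] for i in range(40)]
features.append([10.0, 10.0])
read_ids = [f"read{i}" for i in range(41)]

kept = filter_abnormal_reads(read_ids, features)
print(len(kept), "reads kept;", "read40" in kept)
```

LocalOutlierFactor compares the local density around each point with that of its neighbors, so a read that sits far from all others receives a strongly negative score and is labeled -1.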
v0.3.6 (2024-01-10)