📝 Documentation
Update the issue template from md to yml and restructure it so that each item is easier for users to fill out. [Commit Detail]
💥 Breaking
Extremely low-frequency alleles (less than 0.05%) are treated as Nanopore sequencing errors and are not clustered #36.
Configure clustering.extract_labels so that alleles with few reads (0.05% or less of reads, or 5 reads or fewer) are not clustered. [Commit Detail]
Change clustering.clustering to stop if the minimum value of the elements in the cluster is 0.5% or less. [Commit Detail]
Add consensus.remove_minor_alleles to remove minor alleles with fewer than 5 reads or less than 0.5% of reads. [Commit Detail]
Save a subsetted FASTQ of the control sample when it contains more than 10,000 reads. The control is capped at 10,000 reads to avoid excessive computational load. [Commit Detail]
If the read length is 500 bases or less, change the mappy preset to sr. [Commit Detail]
Update extract_best_preset to prioritize map-ont and remove splice preset if inversion is observed. [Commit Detail]
Update the algorithm of cssplits_handler.reallocate_insertion_within_deletion to automate change point detection by incorporating temporal changes. [Commit Detail]
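The read-count and percentage thresholds described in this section can be illustrated with a minimal sketch. This is not the DAJIN2 implementation: the function name `remove_minor_labels`, its parameters, and the example data are hypothetical, and the defaults simply mirror the "fewer than 5 reads or less than 0.5%" rule stated for consensus.remove_minor_alleles.

```python
from collections import Counter


def remove_minor_labels(labels, min_reads=5, min_percent=0.5):
    """Keep only reads whose label has at least `min_reads` reads
    and makes up at least `min_percent` percent of all reads.

    Hypothetical helper for illustration; thresholds are parameters.
    """
    counts = Counter(labels)
    total = len(labels)
    keep = {
        label
        for label, n in counts.items()
        if n >= min_reads and n / total * 100 >= min_percent
    }
    # Preserve the original read order while dropping minor labels.
    return [label for label in labels if label in keep]


# Example: label 3 has only 4 of 1,000 reads (0.4%), so it is dropped.
labels = [1] * 900 + [2] * 96 + [3] * 4
filtered = remove_minor_labels(labels)
```

Exposing both thresholds as parameters keeps the sketch usable for the stricter 0.05% cutoff mentioned for clustering as well.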
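🔧 Maintenance
Update deploy_pypi.yml to use the latest version of Actions. Refer to the latest official YAML for guidance. [Commit Detail]
Integrate requirements.txt and MANIFEST.in into pyproject.toml, replacing setup.py. [Commit Detail]
Record the DAJIN2 execution command in the log file. [Commit Detail]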
Add a test to check if the version in test_version.sh matches the version in pyproject.toml and utils.config [Commit Detail]
Rename consensus.subset_clust to consensus.downsample_by_label to clarify the function's purpose. [Commit Detail]
Update extract_unique_insertions to merge highly similar extracted insertion sequences. [Commit Detail]
Fix extract_unique_insertions: removing a key twice from fasta_insertions_unique caused the index and key to become misaligned in enumerate(distances) if i != key. Keys are now removed from fasta_insertions_unique all at once at the end. [Commit Detail]
Add control characters to the forbidden characters in fastx_handler.sanitize_filename. [Commit Detail]
Changed the naming convention for the temporary directory: <sample_name>/<process_content>/<allele_name>/(<label_name>)/file_name. Example: flox/consensus/control/1/mutation_loci.pickle. [Commit Detail]
Move the sanitize_name function from utils.fastx_handler to utils.io. [Commit Detail]
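The extract_unique_insertions fix follows a general pattern: collect the keys to drop while iterating, then delete them in one pass at the end so that indices and keys never drift apart. The sketch below shows that pattern only. The names `merge_similar_keys` and `within_one_mismatch`, the similarity rule, and the example sequences are assumptions for illustration and are not taken from DAJIN2.

```python
def within_one_mismatch(a, b):
    """Hypothetical similarity rule: same length and at most one mismatch."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) <= 1


def merge_similar_keys(sequences, is_similar):
    """Drop entries that are similar to an earlier kept entry.

    Keys are only marked during iteration; the dict is rebuilt once
    at the end, so nothing is removed twice or mid-loop.
    """
    keys = list(sequences)
    to_remove = set()
    for i, key_i in enumerate(keys):
        if key_i in to_remove:
            continue
        for key_j in keys[i + 1:]:
            if key_j not in to_remove and is_similar(sequences[key_i], sequences[key_j]):
                to_remove.add(key_j)
    # Single removal step after all comparisons are done.
    return {k: v for k, v in sequences.items() if k not in to_remove}


insertions = {"ins1": "ACGTACGT", "ins2": "ACGTACGA", "ins3": "TTTTGGGG"}
unique = merge_similar_keys(insertions, within_one_mismatch)
```

Here `ins2` differs from `ins1` at a single position and is merged into it, while `ins3` is kept.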
🐛 Bug Fixes
Removed sam_handler.remove_overlapped_reads to prevent unnecessary trimming of reads. [Commit Detail]
Fix preprocess.insertions_to_fasta.remove_minor_groups to delete a key (insertion locus) when removing its insertions leaves an empty dict. This prevents errors when accessing non-existent keys in subset_insertions. [Commit Detail]
Fix the bug in cssplits_handler.convert_cssplits_to_cstag where the insertion cs tag was not merged with the next cs tag when they had the same operator (e.g., for +A|+A|=T, =T: before: +aa=T=T, after: +aa=TT). [Commit Detail]
Modified the system to separate intermediate files using a directory structure instead of underscores (_), so that allele names containing underscores no longer cause errors. [Commit Detail]
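The cs tag fix above amounts to joining adjacent operations that share an operator. A minimal sketch of that merge step, assuming the tag has already been split into per-operation tokens, might look as follows. The function name `merge_cs_tokens` and the token-list input are hypothetical and do not reflect the actual signature of convert_cssplits_to_cstag; substitutions (`*`) are deliberately left unmerged because each one is a fixed two-base unit.

```python
def merge_cs_tokens(tokens):
    """Concatenate cs tag tokens, merging neighbours with the same operator.

    Only match (=), insertion (+) and deletion (-) tokens are merged.
    Hypothetical helper for illustration.
    """
    mergeable = "=+-"
    merged = []
    for token in tokens:
        op, seq = token[0], token[1:]
        if merged and op in mergeable and merged[-1][0] == op:
            # Same operator as the previous token: append the bases only.
            merged[-1] += seq
        else:
            merged.append(token)
    return "".join(merged)


# The example from the release note: "+aa=T=T" becomes "+aa=TT".
cs_tag = merge_cs_tokens(["+aa", "=T", "=T"])
```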