Describe the features you want.
In the current DAJIN2, deletions caused by sequencing errors within large deletions are treated as true mutations.
While this behavior is partly justified, since it reflects mutations within large deletion alleles, it has an undesirable outcome: clustering detects these sequencing-error deletions within large deletions and reports them as independent alleles.
Describe the solution you'd like if you have any.
Similar to insertions_to_fasta.py, we aim to classify deletions within large deletion alleles during the classification step, before clustering, by pre-separating the large deletion alleles from the control.
DAJIN2 version
0.4.6
Additional context
None
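For illustration, the pre-separation idea described under the proposed solution could be sketched roughly as follows. This is not DAJIN2 code: the per-base string encoding (`=` match, `-` deletion, `*` substitution), the function names, and the `min_length=50` threshold are all assumptions made for the example.

```python
from itertools import groupby


def longest_deletion_run(ops: str) -> tuple[int, int]:
    """Return (start, length) of the longest run of '-' in ops, or (-1, 0) if none.

    `ops` is a hypothetical per-base encoding of one aligned read:
    '=' match, '-' deletion, '*' substitution.
    """
    best_start, best_len = -1, 0
    pos = 0
    for op, group in groupby(ops):
        run_len = len(list(group))
        if op == "-" and run_len > best_len:
            best_start, best_len = pos, run_len
        pos += run_len
    return best_start, best_len


def split_large_deletion_reads(
    reads: dict[str, str], min_length: int = 50
) -> tuple[dict[str, str], dict[str, str]]:
    """Split reads into (large-deletion reads, remaining reads).

    A read goes to the large-deletion group when its longest deletion run
    is at least `min_length` bases (an assumed threshold).
    """
    large, other = {}, {}
    for name, ops in reads.items():
        _, length = longest_deletion_run(ops)
        (large if length >= min_length else other)[name] = ops
    return large, other


reads = {
    "r1": "=" * 10 + "-" * 60 + "=" * 10,  # carries a 60-base deletion
    "r2": "=" * 40 + "-" + "=" * 39,       # single-base deletion only
    "r3": "=" * 80,                        # no deletion
}
large, other = split_large_deletion_reads(reads)
```

Here only `r1` lands in the large-deletion group, so such reads could be handled separately during classification, before the clustering step sees them.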