MikeAxtell / ShortStack

ShortStack: Comprehensive annotation and quantification of small RNA genes
MIT License

Bug: some reads are not placed in clusters properly according to pad/mincov #98

Closed (ceiabreu closed 1 year ago)

ceiabreu commented 4 years ago

Dear Mike,

We found that a small fraction of reads are not properly placed in clusters. I have a small reproducible example for you.

ShortStack --nohp --pad 1 --mincov 1 --readfile reads.fasta --genomefile genome.fasta

generates these wrongly split clusters (pasting from the Counts.txt file):

chr_1:17223-17246 Cluster_6 9 9
chr_1:17243-17263 Cluster_7 1 1

chr_2:333-355 Cluster_14 46 46
chr_2:356-377 Cluster_15 2 2
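To make the expectation concrete, here is a minimal sketch of why these two pairs look like they should each be a single cluster. It assumes that --pad means "merge regions separated by a gap of pad nucleotides or fewer" and that the coordinates are 1-based and inclusive. The function name merge_with_pad is hypothetical and this is not ShortStack's actual cluster-calling code.

```python
from typing import List, Tuple

# 1-based, inclusive (start, end); an assumption about the coordinates above
Interval = Tuple[int, int]


def merge_with_pad(intervals: List[Interval], pad: int) -> List[Interval]:
    """Merge intervals whose gap is <= pad nucleotides.

    Hypothetical helper: this is an assumed reading of --pad,
    not ShortStack's real implementation.
    """
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        # Number of uncovered bases between the previous interval and this
        # one; negative when the two intervals overlap.
        if merged and start - merged[-1][1] - 1 <= pad:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


# Coordinates taken from the Counts.txt lines reported in this issue.
chr_1 = [(17223, 17246), (17243, 17263)]  # overlapping by 4 nt
chr_2 = [(333, 355), (356, 377)]          # directly adjacent, gap of 0 nt

print(merge_with_pad(chr_1, pad=1))
print(merge_with_pad(chr_2, pad=1))
```

Under these assumptions, both pairs collapse into one interval each (chr_1:17223-17263 and chr_2:333-377), which is the behavior expected with --pad 1.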

Hope this helps track it down.

Thanks,

Cei

ss_bug.zip

MikeAxtell commented 3 years ago

Thanks Cei, sorry to be so slow to respond. I am working through all of the open issues now, finally.

I have begun work on a major new release, and I anticipate that the cluster-calling algorithm will be completely rewritten, which will hopefully solve whatever weirdness is going on here.

MikeAxtell commented 1 year ago

A large upgrade of ShortStack, version 4, is now in alpha testing. It is available if you want to test it on the ShortStack4 branch on GitHub. We are still doing some tests and optimizations; when it's ready we will merge to master and make a release. The new version has a re-written method for cluster calling that should fix this bug.