shortRNAhub / shortRNA

short RNA-seq analysis package
GNU General Public License v3.0

Comparison with existing tools #39

Open dktanwar opened 3 years ago

dktanwar commented 3 years ago

Seqpac

MirMaster 2.0

dktanwar commented 3 years ago

Seqpac vs shortRNA

Comparison in terms of speed

Trimming

According to the fastp paper, fastp is roughly 3 times faster than cutadapt: https://doi.org/10.1093/bioinformatics/bty560
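To check the speed difference on our own data, a small timing harness along these lines could be used. This is only a sketch: the helper names (`time_command`, `trimming_commands`) and the output file names are made up for illustration, and it assumes `fastp` and `cutadapt` are installed and on the `PATH`. Both tools are limited to one thread so the comparison is like for like.

```python
import subprocess
import time


def time_command(cmd, repeats=3):
    """Run cmd `repeats` times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        best = min(best, time.perf_counter() - start)
    return best


def trimming_commands(fastq, adapter):
    """Build single-threaded trimming commands (output names are placeholders)."""
    return {
        "fastp": ["fastp", "-i", fastq, "-o", "fastp_trimmed.fq",
                  "--adapter_sequence", adapter, "-w", "1"],
        "cutadapt": ["cutadapt", "-a", adapter, "-j", "1",
                     "-o", "cutadapt_trimmed.fq", fastq],
    }
```

Calling `time_command(cmd)` on each entry of `trimming_commands("reads.fq", adapter)` gives one number per tool. Taking the best of several runs reduces noise from disk caching, but the ratio will still depend on read length, file size and compression.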

Alignment

From the Seqpac paper: "A likely reason for Bowtie’s popularity in sRNA community is because it is reliable with short sequence alignments. For instance, we initially tried to integrate the Rsubreads package (61) in seqpac’s workflow, which applies a highly efficient ‘seed-and-vote’ mapping algorithm. However, for certain read lengths we consistently experienced failure to correctly vote for the best alignment, possibly as a consequence that too few seeds were covering the read. We will of course explore more efficient alternatives to Bowtie in the future."

From Rsubread paper (https://doi.org/10.1093/nar/gkz114): "QuasR is however an interface to C programs from 2010 or earlier, specifically to Bowtie version 1.1.1 (18), SpliceMap 3.3.5.2 (19) and SeqAn 1.1 (20). These older tools do not reflect the considerable improvements in algorithms achieved during the last 8 years."

This is a flag for us to re-investigate our choice of aligner. However, we could offer both aligners as options: Rbowtie and Rsubread.

Comparison in terms of:

QC

Annotation

Assignment of multi-mapping reads

In principle, shortRNA has an advantage here in how multi-mapping reads are assigned.

Framework

Summary

Seqpac seems to be one of the best tools currently available for sRNA-seq data analysis, but based on the above comparisons shortRNA is still better in terms of speed, the features offered, and the interactive and informative plots it can produce.

Please also check: https://dktanwar.github.io/PhD/PR/20210622/20210622_PR_Deepak_Tanwar (slides 57 to 59)

plger commented 3 years ago

There are clearly important similarities: they use sequence-based counting like we do, and offer some of the end-user functionalities we aimed at (e.g. the coverage plots). Besides that, the package doesn't go much beyond what was already out there, and there's nothing anywhere near the most critical features of our approach, i.e. the tree-based assignment and hypothesis testing... It feels to me like a good example to use to ensure we've got everything that's offered elsewhere.
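For readers less familiar with the two ideas mentioned above, here is a toy sketch of sequence-based counting followed by a tree-based assignment. It is not the shortRNA implementation: the function names, the feature names and the rule used (assign a sequence to the lowest common ancestor of all features it matches) are assumptions chosen to illustrate the general idea.

```python
from collections import Counter


def collapse_reads(reads):
    """Sequence-based counting: one count per distinct read sequence."""
    return Counter(reads)


def path_to_root(node, parent):
    """Return the nodes from `node` up to the root of the feature tree."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def lowest_common_ancestor(features, parent):
    """Deepest node that is an ancestor of (or equal to) every feature."""
    common = path_to_root(features[0], parent)
    for feature in features[1:]:
        ancestors = set(path_to_root(feature, parent))
        common = [node for node in common if node in ancestors]
    return common[0] if common else None


def assign_counts(seq_counts, seq_hits, parent):
    """Add each sequence's count to the LCA of all features it matches."""
    assigned = Counter()
    for seq, count in seq_counts.items():
        hits = seq_hits.get(seq)
        if hits:  # sequences without any annotation hit are skipped
            assigned[lowest_common_ancestor(hits, parent)] += count
    return assigned


# Hypothetical feature tree (child -> parent) and hypothetical hits.
parent = {
    "mir-A-1": "mir-A", "mir-A-2": "mir-A", "mir-A": "miRNA",
    "tRNA-X": "tRNA", "miRNA": "root", "tRNA": "root",
}
seq_hits = {
    "AAA": ["mir-A-1", "mir-A-2"],  # ambiguous within one family
    "CCC": ["mir-A-1"],             # unique hit
    "GGG": ["mir-A-1", "tRNA-X"],   # ambiguous across biotypes
}
counts = collapse_reads(["AAA", "AAA", "CCC", "GGG"])
assigned = assign_counts(counts, seq_hits, parent)
```

In this toy data the sequence that is ambiguous between two members of the same family is counted at the family level instead of being discarded or split arbitrarily, while the sequence that is ambiguous across biotypes only resolves at the root.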

dktanwar commented 2 years ago

A number of tools here: https://tools4mirs.org/

dktanwar commented 1 year ago

Comparison for tRNAs: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-022-04691-1