bcgsc / RNA-Bloom

:hibiscus: reference-free transcriptome assembly for short and long reads

Long reads genome guided #76

Open sagnikbanerjee15 opened 3 months ago

sagnikbanerjee15 commented 3 months ago

Hello,

I have run RNA-Bloom on a subset of the PacBio bulk transcriptomic data and got the expected results. However, I noticed that some steps in RNA-Bloom take a considerable amount of time to execute. Before running RNA-Bloom on the entire dataset, I'd like to discuss potential ways to speed up execution.

One idea is to align the reads to the genome, extract the aligned reads from specific non-overlapping regions of the reference, and then supply those to RNA-Bloom. Essentially, I am considering adopting the genome-guided strategy used by Trinity. Do you think this approach could help accelerate the process? Additionally, would adjusting certain parameters be beneficial since we wouldn't need to compare all reads to each other anymore?

Thank you

kmnip commented 3 months ago

Hi @sagnikbanerjee15,

I haven't used Trinity's genome-guided strategy. It would definitely help reduce total runtime and peak memory usage if the input reads were partitioned based on alignments against a reference genome. I don't think you need to change any assembly parameters for these localized assemblies.

Hope that answers your questions!
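
For readers who want to try the partitioning idea discussed above, here is a minimal sketch of the grouping step. It is not part of RNA-Bloom or Trinity. It assumes the reads have already been aligned to the genome (for example with a long-read aligner) and that each alignment has been reduced to a `(read_name, chrom, start, end)` tuple with 0-based, half-open coordinates. The name `partition_reads` and the tuple layout are hypothetical, chosen only for illustration. Strand is ignored, and a read with several alignments would land in more than one partition.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical record layout: (read_name, chrom, start, end), 0-based half-open.
Alignment = Tuple[str, str, int, int]
# One partition: (chrom, region_start, region_end, read_names)
Partition = Tuple[str, int, int, List[str]]


def partition_reads(alignments: Iterable[Alignment]) -> List[Partition]:
    """Group reads into non-overlapping genomic regions.

    Alignments on the same chromosome that overlap (directly or through a
    chain of overlapping alignments) are merged into one region, and every
    read in that region is assigned to the same partition.
    """
    by_chrom: Dict[str, List[Tuple[int, int, str]]] = defaultdict(list)
    for name, chrom, start, end in alignments:
        by_chrom[chrom].append((start, end, name))

    partitions: List[Partition] = []
    for chrom in sorted(by_chrom):
        cur_start = 0
        cur_end = -1
        cur_reads: List[str] = []
        # Sorting by start lets a single left-to-right sweep merge overlaps.
        for start, end, name in sorted(by_chrom[chrom]):
            if cur_reads and start < cur_end:
                # Overlaps the open region: extend it and add the read.
                cur_end = max(cur_end, end)
                cur_reads.append(name)
            else:
                # Gap before this alignment: close the open region, if any.
                if cur_reads:
                    partitions.append((chrom, cur_start, cur_end, cur_reads))
                cur_start, cur_end, cur_reads = start, end, [name]
        if cur_reads:
            partitions.append((chrom, cur_start, cur_end, cur_reads))
    return partitions


if __name__ == "__main__":
    example = [
        ("r1", "chr1", 100, 500),
        ("r2", "chr1", 400, 900),
        ("r3", "chr1", 2000, 2500),
        ("r4", "chr2", 50, 300),
    ]
    for chrom, start, end, reads in partition_reads(example):
        print(chrom, start, end, reads)
```

The reads in each partition could then be written to their own file and assembled independently, which is what allows the localized assemblies to run with smaller inputs, and in parallel if resources permit. Unaligned reads are not covered by this sketch and would need to be handled separately.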