BrooksLabUCSC / flair

Full-Length Alternative Isoform analysis of RNA

Slow processing speed for the large number of samples (collapse stage) #224

Closed hd00ljy closed 1 year ago

hd00ljy commented 1 year ago

Hello,

I have more than 50 RNA-seq samples to run. For the flair collapse stage I merged all samples as suggested in the manual, that is, I concatenated all the corrected.bed files into a single file.

However, processing all samples took too long on a machine with 28 cores and 500 GB of RAM.

I have two questions about this:

Q1. What do you suggest for a large number of samples?

Q2. Do you have plans to add options for cluster computing (such as PBS Pro, SGE, or Slurm), at least for the "collapse" stage?

Thank you, Jin-Young

hd00ljy commented 1 year ago

Oh, sorry. I just found the collapse-range bash script you provided. It should also be possible to submit jobs to PBS using this.

Thank you!

Jeltje commented 1 year ago

Great! Please create a new ticket if that causes any issues. We have recently been streamlining the code but haven't gotten to that section yet. An alternative to creating ranges is splitting by chromosome, which is fairly easy to do manually beforehand.
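
For anyone wanting to try the split-by-chromosome route, here is a minimal sketch in Python. It is not part of flair: the function name `split_bed_by_chromosome` and the output naming (`<chrom>.bed`) are made up for illustration, and it assumes the concatenated corrected BED file has the chromosome name in the first column, as standard BED does.

```python
import os


def split_bed_by_chromosome(bed_path, out_dir):
    """Write one BED file per chromosome and return {chrom: output_path}.

    Hypothetical helper, not a flair function. Assumes column 1 of each
    record is the chromosome name.
    """
    os.makedirs(out_dir, exist_ok=True)
    handles = {}  # chrom -> open file handle
    paths = {}    # chrom -> output file path
    try:
        with open(bed_path) as bed:
            for line in bed:
                # Skip blank lines and BED header/comment lines.
                if not line.strip() or line.startswith(("#", "track ", "browser ")):
                    continue
                chrom = line.split(None, 1)[0]
                if chrom not in handles:
                    paths[chrom] = os.path.join(out_dir, f"{chrom}.bed")
                    handles[chrom] = open(paths[chrom], "w")
                # Records are copied unchanged, in their original order.
                handles[chrom].write(line)
    finally:
        for handle in handles.values():
            handle.close()
    return paths
```

Each per-chromosome file could then be handed to its own collapse job, for example one PBS job per file. Note that this sketch keeps one output file open per chromosome, so a reference with thousands of small contigs may hit the system's open-file limit; in that case the contigs would need to be grouped or the input sorted first.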