Is there any way to run the pipeline faster?

fw262 / TAR-scRNA-seq

scRNA-seq analysis beyond gene annotations using transcriptionally active regions (TARs) generated from sequence alignment data

GNU General Public License v3.0


Is there any way to run the pipeline faster? #14

Open genecell opened 2 years ago

genecell commented 2 years ago

Hi,

Thank you for this helpful tool! I found it very slow to run on a single 10x scRNA-seq dataset, even with 16 cores. After almost three days, I have only gotten as far as chr3.bed after generating_split.sorted.bed.gz. At this rate I estimate the pipeline will need at least 14 to 21 days to finish, which is not feasible for me. Could you suggest a way to speed it up? Thank you!

Best, Min

fw262 commented 2 years ago

Hi Min,

If you have a large dataset, you will need to downsample the combined BAM file. At line 253 of the Snakefile there is a commented-out rule; uncomment it to downsample the combined BAM file from which the TARs are derived.

I would recommend downsampling that BAM file to roughly 10-50 GB.
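For anyone who wants to estimate the sampling fraction by hand, here is a minimal sketch. It is not the rule from the Snakefile, just an illustration that assumes `samtools` is installed and that BAM size scales roughly linearly with read count (only approximately true because of compression). The function names, the example file names `combined.bam` and `combined.downsampled.bam`, and the 120 GB and 30 GB sizes are all hypothetical.

```python
def sampling_fraction(input_bytes, target_bytes):
    """Fraction of reads to keep so the output is roughly target_bytes.

    Assumes file size is roughly proportional to the number of reads.
    """
    if input_bytes <= 0 or target_bytes <= 0:
        raise ValueError("sizes must be positive")
    return min(1.0, target_bytes / input_bytes)


def samtools_downsample_cmd(in_bam, out_bam, fraction, seed=42, threads=16):
    """Build a `samtools view -s SEED.FRACTION` command as an argument list.

    In `-s`, the integer part is the random seed and the fractional part
    is the proportion of reads to keep.
    """
    frac = round(fraction, 4)
    if not 0 < frac < 1:
        raise ValueError("fraction must be strictly between 0 and 1")
    subsample = f"{seed}{f'{frac:.4f}'[1:]}"  # e.g. seed 42, 0.25 -> "42.2500"
    return ["samtools", "view", "-b", "-s", subsample,
            "-@", str(threads), "-o", out_bam, in_bam]


if __name__ == "__main__":
    import shlex

    # Hypothetical sizes: a 120 GB combined BAM reduced to about 30 GB.
    # For a real file, os.path.getsize(path) gives the input size in bytes.
    gib = 1024 ** 3
    frac = sampling_fraction(120 * gib, 30 * gib)
    cmd = samtools_downsample_cmd("combined.bam", "combined.downsampled.bam", frac)
    print(shlex.join(cmd))
```

The printed command can be run in a shell, or passed to `subprocess.run(cmd, check=True)`. Since the size estimate is only approximate, check the size of the output BAM before rerunning the pipeline.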

Best regards, Michael