tomazc / iCount

iCount, protein-RNA interaction analytics
http://icount.biolab.si

Question: iCount xlsites expected runtime #191

Open mirax87 opened 5 years ago

mirax87 commented 5 years ago

Hi,

in order to process our D. melanogaster iCLIP library, I used snakemake to put the iCount steps together and integrated benchmarking, specifically for iCount xlsites with quantification based on cDNA and reads.

Here, I am observing runtimes of ~1 - 4 days on our cluster system for iCount xlsites. The number of reads per multiplexing barcode is quite variable, which correlates with runtime.

In terms of parameters, I use

using the output gtf from iCount segment

I wonder what, besides the total number of mapped reads, determines the runtime of iCount xlsites, and whether there are useful strategies for pre-filtering the BAM files to speed up the process without losing (too much) sensitivity.

Cheers

JureZmrzlikar commented 5 years ago

Hi @mirax87 !

Are you using the --segmentation input? If so, this is the main reason that iCount xlsites is taking so long. Please run it without segmentation (AFAIK, this is how most users run it). We should speed up the algorithm for the case where a segmentation is given, but never found the time to do it properly.
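To check how much of the runtime comes from the segmentation step on a given library, the same command can be timed with and without the option. The following is only a minimal sketch: the helper names `time_command` and `without_segmentation` are hypothetical, and it assumes that `--segmentation` is followed by a single value (the GTF from iCount segment) or written as `--segmentation=<file>`. The rest of the xlsites command line is not shown in this thread, so it is not reproduced here.

```python
import subprocess
import time


def time_command(cmd):
    """Run cmd (a list of strings) and return (exit_code, wall_clock_seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(cmd, check=False)
    return completed.returncode, time.perf_counter() - start


def without_segmentation(cmd):
    """Return a copy of cmd with the --segmentation option removed.

    Assumption: the option takes exactly one value, given either as the
    next argument or in the --segmentation=<file> form.
    """
    stripped = []
    skip_next = False
    for arg in cmd:
        if skip_next:
            # This argument is the value that belonged to --segmentation.
            skip_next = False
            continue
        if arg == "--segmentation":
            skip_next = True
            continue
        if arg.startswith("--segmentation="):
            continue
        stripped.append(arg)
    return stripped
```

Calling `time_command(full_cmd)` and `time_command(without_segmentation(full_cmd))` on a small test library, where `full_cmd` is the argument list your own workflow already uses, gives a direct comparison before committing cluster time to the full data set.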

Regarding other factors that could affect runtime:

mirax87 commented 5 years ago

Hi @JureZmrzlikar,

you are right, I am using iCount xlsites --segmentation. I'll try without.

Thanks for the quick feedback. Cheers