Open KatrinMoller opened 2 months ago
Hi, how large are your input samples? We have a few efficiency improvements on the way for the map transcriptome step, which may help a bit. Also, 64 threads seems high: this is per process, not for the whole programme. How many threads does your system have in total? Perhaps try 8 or 16.
We are investigating the second issue, thanks for reporting.
Hi @sarahjeeeze, thanks for looking into this. My samples (6 in total) are between 30 and 50 GB each. My system has 64 threads, so that's what I put in the initial command. Is it possible to change this for individual steps as well?
Hi, yes, you can set it per process with the threads parameter; this sets it for any steps where adjusting threads should improve performance. But if you give one process all 64 threads, it will slow the workflow, as there won't be any left for other processes, and it could also take up all the memory. So I recommend 8, or 16 at most.
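To make the trade-off concrete, here is a rough sketch of the arithmetic, assuming each step reserves exactly the number of threads given by the threads parameter and nothing else competes for the 64 threads on the machine. The helper `max_concurrent` is hypothetical and only for illustration; it is not part of the workflow.

```shell
# Hypothetical helper (not part of the workflow): how many processes can run
# side by side if each one reserves the same number of threads.
max_concurrent() {
  local total_threads=$1 per_process=$2
  # Integer division: any leftover threads cannot host another full process.
  echo $(( total_threads / per_process ))
}

# Compare the suggested settings (8, 16) with the original one (64)
# on a machine with 64 threads in total.
for per_process in 8 16 64; do
  echo "--threads ${per_process}: up to $(max_concurrent 64 "${per_process}") processes at once"
done
```

Under these assumptions, a lower per-process value leaves room for several samples to be processed in parallel, whereas 64 forces them to run one after another.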
Still looking into the other issue.
Hi, I am getting the same error (mentioned here as the second issue). Any news on that?
Hi, sorry, yes, we have an MR with a fix incoming for it; I will let you know once it's in pre-release. Sorry for the delay.
Operating System
Windows 10
Other Linux
No response
Workflow Version
v1.1.1
Workflow Execution
Command line
EPI2ME Version
No response
CLI command run
./nextflow run epi2me-labs/wf-transcriptomes \
    -profile singularity \
    --fastq /hpcdata/Mimir/shared/km100/all_libs \
    --de_analysis \
    --ref_genome Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz \
    --ref_annotation Homo_sapiens.GRCh38.110.gtf \
    --sample_sheet sample_sheet_short.csv \
    --cdna_kit "SQK-PCS111" \
    --isoform_table_nrows 10000 \
    --out_dir output_short -w workspace_short \
    --threads 64
Workflow Execution - CLI Execution Profile
singularity
What happened?
I have run this analysis successfully with a ref_transcriptome, but I wanted to try the reference-guided version, as I am searching for a poorly annotated isoform. The run goes well until differential_expression:map_transcriptome, where it starts taking ages (up to 19 hours per sample). I guess this could be because of the reference-guided part, but it may be worth checking whether the 64 CPUs are actually being used. It then runs smoothly again until deAnalysis, where it gives an error (see below), which looks like the transcript strand direction is missing. Could you help me solve this?
Relevant log output
Application activity log entry
No response
Were you able to successfully run the latest version of the workflow with the demo data?
yes
Other demo data information
No response