epi2me-labs / wf-transcriptomes

Minimap2 running on a single core #55

Closed KatrinMoller closed 4 months ago

KatrinMoller commented 5 months ago

Operating System

Windows 10

Other Linux

No response

Workflow Version

v1.0.0

Workflow Execution

Command line

EPI2ME Version

No response

CLI command run

OUTPUT=~/output; ./nextflow run epi2me-labs/wf-transcriptomes \
  -profile singularity \
  --fastq /hpcdata/Mimir/shared/km100/all_libs \
  --de_analysis \
  --ref_genome Homo_sapiens.GRCh38.dna_rm.primary_assembly.fa.gz \
  --ref_annotation Homo_sapiens.GRCh38.110.gtf.gz \
  --ref_transcriptome Homo_sapiens.GRCh38.cdna.all.fa.gz \
  --sample_sheet sample_sheet.csv \
  --cdna_kit "SQK-PCS111" \
  --isoform_table_nrows 10000 \
  --out_dir outdir -w workspace_dir \
  --threads 64

Workflow Execution - CLI Execution Profile

singularity

What happened?

I am running this pipeline with a rather large dataset (ca. 400 Gb), so I wanted to utilise all 64 cores available to me on the server that I am using. One of the latest versions of this pipeline has thankfully introduced the --threads option to set how many cores should be used. The pipeline was running quite smoothly until it finished [...]. Then it started the [...] and has now, 5 days later, only finished 3 out of the 10 samples. When I checked, this process seemed to be using only 1 core. Is this a bug in the pipeline, or is there an additional option I need to use to get minimap2 to run on multiple cores? Since the process is still running, I am pasting the contents of the "trace" file here.

Relevant log output

task_id hash    native_id   name    status  exit    submit  duration    realtime    %cpu    peak_rss    peak_vmem   rchar   wchar
5   aa/50ceed   23235   pipeline:getParams  COMPLETED   0   2024-01-09 09:55:09.893 827ms   10ms    102.9%  0   0   63 KB   2.4 KB
6   68/5be5e2   23226   pipeline:decompress_transcriptome   COMPLETED   0   2024-01-09 09:55:09.873 4.6s    3.8s    77.2%   4.3 MB  7.1 MB  75.3 MB 428.7 MB
1   84/2be3ec   23243   validate_sample_sheet   COMPLETED   0   2024-01-09 09:55:09.900 8s  6.8s    123.7%  115.1 MB    8.5 GB  21 MB   24.9 KB
8   95/09032a   25459   pipeline:preprocess_ref_transcriptome   COMPLETED   0   2024-01-09 09:55:14.492 4.5s    3.2s    45.2%   3.4 MB  6.4 MB  428.7 MB    428.7 MB
4   4f/59926c   23229   pipeline:getVersions    COMPLETED   0   2024-01-09 09:55:09.880 10.6s   9.5s    275.4%  82.6 MB 8.5 GB  39.4 MB 3 KB
7   29/5dfebd   23251   pipeline:differential_expression:checkSampleSheetCondition  COMPLETED   0   2024-01-09 09:55:09.905 11.3s   10.2s   105.0%  163 MB  8.5 GB  35.6 MB 34.9 KB
3   71/8cd2cb   23232   pipeline:decompress_annotation  COMPLETED   0   2024-01-09 09:55:09.887 12.5s   11.7s   55.0%   4.2 MB  7.1 MB  51.9 MB 1.4 GB
20  14/e7bde0   33655   pipeline:preprocess_ref_annotation  COMPLETED   0   2024-01-09 09:55:23.375 10.1s   9.4s    54.6%   3.4 MB  6.4 MB  1.4 GB  1.4 GB
2   84/d999d1   23224   pipeline:decompress_ref COMPLETED   0   2024-01-09 09:55:09.864 31.7s   30.8s   67.2%   4.3 MB  7.1 MB  458 MB  2.9 GB
18  29/a984a5   27316   fastcat (10)    COMPLETED   0   2024-01-09 09:55:18.736 24m 13s 24m 12s 166.0%  63.6 MB 1.4 GB  83.7 GB 78.8 GB
23  a0/6efc35   227523  pipeline:collectFastqIngressResultsInDir (1)    COMPLETED   0   2024-01-09 10:19:33.027 714ms   62ms    88.0%   2.6 MB  3.8 MB  80 KB   640 B
17  b0/15a301   27208   fastcat (9) COMPLETED   0   2024-01-09 09:55:18.643 28m 35s 28m 34s 166.0%  62.1 MB 1.4 GB  100.4 GB    94.5 GB
25  51/e04bd6   240738  pipeline:collectFastqIngressResultsInDir (2)    COMPLETED   0   2024-01-09 10:23:55.973 1.5s    182ms   28.8%   2.7 MB  3.8 MB  80 KB   639 B
16  9c/a696df   27157   fastcat (8) COMPLETED   0   2024-01-09 09:55:18.584 28m 47s 28m 47s 165.9%  59.5 MB 1.4 GB  103.8 GB    97.4 GB
26  e4/f6df7f   242726  pipeline:collectFastqIngressResultsInDir (3)    COMPLETED   0   2024-01-09 10:24:07.097 707ms   47ms    107.3%  0   0   80 KB   641 B
15  27/4e5b2d   27038   fastcat (7) COMPLETED   0   2024-01-09 09:55:18.458 30m 55s 30m 54s 165.7%  61.1 MB 1.4 GB  109.1 GB    102.5 GB
29  3c/539bf4   251415  pipeline:collectFastqIngressResultsInDir (4)    COMPLETED   0   2024-01-09 10:26:14.390 723ms   52ms    87.9%   2.8 MB  3.8 MB  80 KB   639 B
14  01/5962cf   27100   fastcat (6) COMPLETED   0   2024-01-09 09:55:18.519 33m 37s 33m 36s 166.1%  61.3 MB 1.4 GB  119.2 GB    112 GB
31  d4/81ab77   263133  pipeline:collectFastqIngressResultsInDir (5)    COMPLETED   0   2024-01-09 10:28:56.640 719ms   52ms    75.0%   0   0   80 KB   640 B
10  27/a1a69d   26832   fastcat (2) COMPLETED   0   2024-01-09 09:55:18.190 35m 47s 35m 46s 166.0%  61.6 MB 1.4 GB  123.5 GB    116.3 GB
33  56/e7a0ea   274846  pipeline:collectFastqIngressResultsInDir (6)    COMPLETED   0   2024-01-09 10:31:06.331 848ms   60ms    84.4%   0   0   80 KB   640 B
11  ee/cbe446   26908   fastcat (3) COMPLETED   0   2024-01-09 09:55:18.247 36m 28s 36m 27s 166.6%  61 MB   1.4 GB  130.8 GB    122.6 GB
35  69/158bcc   280419  pipeline:collectFastqIngressResultsInDir (7)    COMPLETED   0   2024-01-09 10:31:46.811 843ms   57ms    79.4%   2.7 MB  3.8 MB  80 KB   639 B
9   a5/6c03ed   26680   fastcat (1) COMPLETED   0   2024-01-09 09:55:18.129 36m 44s 36m 43s 167.1%  60.2 MB 1.4 GB  132.9 GB    124.5 GB
36  17/15ff30   282304  pipeline:collectFastqIngressResultsInDir (8)    COMPLETED   0   2024-01-09 10:32:02.901 804ms   63ms    83.3%   2.8 MB  3.8 MB  80 KB   639 B
12  0e/fc8d80   26937   fastcat (4) COMPLETED   0   2024-01-09 09:55:18.302 37m 50s 37m 49s 165.0%  60.9 MB 1.4 GB  131.8 GB    123.9 GB
38  c0/799039   292992  pipeline:collectFastqIngressResultsInDir (9)    COMPLETED   0   2024-01-09 10:33:09.097 755ms   52ms    83.3%   0   0   80 KB   640 B
13  5b/fd8306   27005   fastcat (5) COMPLETED   0   2024-01-09 09:55:18.395 40m 18s 40m 18s 165.5%  66 MB   1.4 GB  136 GB  128.2 GB
41  52/5f5b2c   314053  pipeline:collectFastqIngressResultsInDir (10)   COMPLETED   0   2024-01-09 10:35:37.772 727ms   53ms    83.4%   2.8 MB  3.8 MB  80 KB   639 B
22  97/6de7e5   227525  pipeline:preprocess_reads (1)   COMPLETED   0   2024-01-09 10:19:33.044 7h 49m 54s  7h 49m 53s  272.3%  15.2 GB 547.2 GB    144.4 GB    123.4 GB
24  75/f3b57c   240726  pipeline:preprocess_reads (2)   COMPLETED   0   2024-01-09 10:23:55.126 9h 25m 32s  9h 25m 31s  273.8%  14.9 GB 547 GB  171.5 GB    146.9 GB
27  5f/50b5f8   242728  pipeline:preprocess_reads (3)   COMPLETED   0   2024-01-09 10:24:07.113 9h 55m 39s  9h 55m 38s  282.6%  15.1 GB 547.1 GB    178.2 GB    152.7 GB
28  98/8c660e   251417  pipeline:preprocess_reads (4)   COMPLETED   0   2024-01-09 10:26:14.404 10h 25m 37s 10h 25m 36s 265.0%  15.1 GB 547.1 GB    185.1 GB    158.2 GB
30  4b/e2b3ca   263135  pipeline:preprocess_reads (5)   COMPLETED   0   2024-01-09 10:28:56.655 11h 17m 13s 11h 17m 13s 281.8%  15.5 GB 547.3 GB    203.9 GB    171.1 GB
32  8c/ae5bf7   274849  pipeline:preprocess_reads (6)   COMPLETED   0   2024-01-09 10:31:06.343 11h 42m 40s 11h 42m 39s 259.6%  15 GB   547 GB  209.3 GB    178.9 GB
34  55/97cc51   280421  pipeline:preprocess_reads (7)   COMPLETED   0   2024-01-09 10:31:46.830 12h 40m 58s 12h 40m 57s 281.7%  15.2 GB 547.1 GB    224 GB  192 GB
39  e6/63926a   292994  pipeline:preprocess_reads (9)   COMPLETED   0   2024-01-09 10:33:09.110 12h 43m 53s 12h 43m 52s 264.2%  14.8 GB 546.9 GB    224.1 GB    190.8 GB
37  66/204944   282302  pipeline:preprocess_reads (8)   COMPLETED   0   2024-01-09 10:32:02.882 13h 29m 24s 13h 29m 23s 267.0%  15.2 GB 547.2 GB    227.1 GB    194.8 GB
40  51/70172b   314055  pipeline:preprocess_reads (10)  COMPLETED   0   2024-01-09 10:35:37.785 15h 8m 54s  15h 8m 54s  236.1%  14.9 GB 547 GB  230.4 GB    197.9 GB
19  1d/1fac11   763094  pipeline:differential_expression:build_minimap_index_transcriptome  COMPLETED   0   2024-01-10 01:44:32.047 13.9s   13.1s   280.4%  2.6 GB  6.8 GB  428.7 MB    897 MB
21  2b/a2a344   763869  pipeline:build_minimap_index    COMPLETED   0   2024-01-10 01:44:46.005 47.3s   46.6s   317.3%  6.4 GB  11.1 GB 2.9 GB  3.6 GB
50  d3/11dc19   765042  pipeline:differential_expression:map_transcriptome (9)  COMPLETED   0   2024-01-10 01:45:33.289 40m 39m 59s 3080.5% 50.3 GB 58.6 GB 181.1 GB    182.9 GB
47  fa/35d6fd   784700  pipeline:differential_expression:map_transcriptome (6)  COMPLETED   0   2024-01-10 02:25:33.095 38m 16s 38m 15s 3290.4% 50.2 GB 58.7 GB 172 GB  173.4 GB
43  a2/921057   803875  pipeline:differential_expression:map_transcriptome (2)  COMPLETED   0   2024-01-10 03:03:49.179 30m 40s 30m 39s 3166.5% 50.1 GB 58.7 GB 142.6 GB    143.4 GB
42  b7/17b38a   820879  pipeline:differential_expression:map_transcriptome (1)  COMPLETED   0   2024-01-10 03:34:28.745 26m 18s 26m 17s 3214.2% 50.3 GB 57.6 GB 123.2 GB    123.5 GB
49  b2/3cb284   836666  pipeline:differential_expression:map_transcriptome (8)  COMPLETED   0   2024-01-10 04:00:46.539 38m 55s 38m 54s 3200.9% 50.6 GB 58.7 GB 176.9 GB    177.9 GB
48  3a/6ea934   856101  pipeline:differential_expression:map_transcriptome (7)  COMPLETED   0   2024-01-10 04:39:41.632 37m 34s 37m 34s 3238.1% 50.4 GB 58.7 GB 173.5 GB    174.2 GB
51  9f/5d4aa0   875284  pipeline:differential_expression:map_transcriptome (10) COMPLETED   0   2024-01-10 05:17:16.023 42m 22s 42m 21s 3142.9% 50.4 GB 58.8 GB 193.5 GB    196.9 GB
46  a1/2b295e   895628  pipeline:differential_expression:map_transcriptome (5)  COMPLETED   0   2024-01-10 05:59:37.957 35m 14s 35m 13s 3323.3% 50.4 GB 58.7 GB 162.5 GB    163.3 GB
44  b2/dcbe19   914079  pipeline:differential_expression:map_transcriptome (3)  COMPLETED   0   2024-01-10 06:34:52.010 31m 39s 31m 38s 3164.9% 50.3 GB 58.7 GB 145 GB  145.9 GB
45  7d/3bc014   931319  pipeline:differential_expression:map_transcriptome (4)  COMPLETED   0   2024-01-10 07:06:30.688 33m 42s 33m 41s 3063.8% 50.4 GB 58.7 GB 156.9 GB    159.4 GB
59  d4/c205bf   949144  pipeline:reference_assembly:map_reads (8)   COMPLETED   0   2024-01-10 07:40:12.773 2d 10h 51m 8s   2d 10h 51m 8s   101.2%  9.8 GB  12.1 GB 243.5 GB    176.2 GB
53  d0/207619   1931583 pipeline:reference_assembly:map_reads (2)   COMPLETED   0   2024-01-12 18:31:21.193 1d 16h 27m 36s  1d 16h 27m 35s  101.4%  9.9 GB  12.2 GB 193 GB  139.4 GB
56  a6/722b60   2611250 pipeline:reference_assembly:map_reads (5)   COMPLETED   0   2024-01-14 10:58:56.848 19h 30m 37s 19h 30m 36s 103.8%  9.8 GB  12 GB   208.2 GB    150 GB
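
For anyone wanting to check a trace like the one above without reading it row by row, here is a minimal Python sketch. It assumes the trace file is tab-separated and has the `name`, `realtime` and `%cpu` columns shown above. The function names and the thresholds (`min_seconds`, `max_cpu`) are illustrative choices and not part of the workflow. The sample rows are abbreviated from the trace in this report.

```python
import csv
import io
import re

# Seconds per unit for durations written like "2d 10h 51m 8s" or "827ms".
UNIT_SECONDS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text):
    """Convert a duration string such as '2d 10h 51m 8s' to seconds."""
    total = 0.0
    # "ms" is listed first so that it is not read as "m" followed by "s".
    for value, unit in re.findall(r"([\d.]+)\s*(ms|s|m|h|d)", text):
        total += float(value) * UNIT_SECONDS[unit]
    return total


def low_cpu_tasks(trace_text, min_seconds=3600, max_cpu=150.0):
    """Return (name, seconds, cpu) for long tasks that used about one core."""
    reader = csv.DictReader(io.StringIO(trace_text), delimiter="\t")
    flagged = []
    for row in reader:
        try:
            cpu = float(row["%cpu"].rstrip("%"))
        except ValueError:
            continue  # skip rows without a numeric %cpu value
        seconds = parse_duration(row["realtime"])
        if seconds >= min_seconds and cpu <= max_cpu:
            flagged.append((row["name"], seconds, cpu))
    return flagged


# Three rows copied (with fewer columns) from the trace in this report.
SAMPLE_TRACE = "\n".join(
    "\t".join(fields)
    for fields in [
        ("name", "realtime", "%cpu"),
        ("pipeline:differential_expression:map_transcriptome (9)", "39m 59s", "3080.5%"),
        ("pipeline:reference_assembly:map_reads (8)", "2d 10h 51m 8s", "101.2%"),
        ("pipeline:reference_assembly:map_reads (5)", "19h 30m 36s", "103.8%"),
    ]
)

if __name__ == "__main__":
    for name, seconds, cpu in low_cpu_tasks(SAMPLE_TRACE):
        print(f"{name}: {seconds / 3600:.1f} h at {cpu:.1f}% CPU")
```

With these thresholds, only the map_reads rows are flagged. The map_transcriptome row ran at over 3000% CPU, which corresponds to roughly 30 cores in use.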

Application activity log entry

No response

nrhorner commented 5 months ago

Hi @KatrinMoller

Thanks for reporting this. You have uncovered a bug, and we'll get a fix out straight away.

KatrinMoller commented 5 months ago

Any news on this? In the meantime my run stopped, since the map_reads process (via minimap2) seems to keep data from previous samples in RAM, which eventually filled up. I ran a previous version of this pipeline before with really good results, so I would love to see an update soon :)

nrhorner commented 5 months ago

Hi @KatrinMoller

There is a fix for this on the prerelease branch. Would you be able to test it out please? You can use it with nextflow run epi2me-labs/wf-transcriptomes -r prerelease

Thanks,

Neil

KatrinMoller commented 5 months ago

Hi @nrhorner Thanks, I am testing it now, will let you know how it goes.

KatrinMoller commented 4 months ago

@nrhorner So I ran the pipeline using the -r prerelease option and it got through the steps that previously took ages very quickly, so I think that problem is solved. However, it has now got stuck in the differential expression analysis: `pipeline:differential_expression:deAnalysis` terminated with an error exit status (1). There were 3 attempts at this step, but then the pipeline terminated with an error status. The pipeline managed to export .fas, .bam and .bai files for some of the samples, but not all. I think this error is unrelated to the previous one, so it should perhaps have its own thread. I would really appreciate help with solving this, though I am not sure what information you need to figure it out.

nrhorner commented 4 months ago

Hi @KatrinMoller

Thanks for the update. Yes, please open another ticket for the new issue. I'll close this one.