epi2me-labs / wf-metagenomics

Metagenomic classification of long-read sequencing data

Ran out of Memory Even When Using Memory Mapping #92

Closed peradastra closed 6 months ago

peradastra commented 7 months ago

Operating System

Ubuntu 22.04

Other Linux

No response

Workflow Version

v2.9.3-g6636bc9

Workflow Execution

Command line (Cluster)

Other workflow execution

No response

EPI2ME Version

No response

CLI command run

No response

Workflow Execution - CLI Execution Profile

standard (default)

What happened?

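The workflow ran out of memory when run against the Kraken2 nt database (k2_nt_20231129), even with kraken2_memory_mapping enabled. The run_kraken2 process was never started: Nextflow reported a requirement of 712.6 GB against 502.8 GB available.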

Relevant log output

(base) [hilaire@ad.bcm.edu@rpv-oitghp-p02 wf-meta]$ conda activate nextflow
(nextflow) [hilaire@ad.bcm.edu@rpv-oitghp-p02 wf-meta]$ bash wf-meta.sh 
N E X T F L O W  ~  version 23.10.1
Launching `https://github.com/epi2me-labs/wf-metagenomics` [tiny_woese] DSL2 - revision: 6636bc9044 [master]
WARN: NEXTFLOW RECURSION IS A PREVIEW FEATURE - SYNTAX AND FUNCTIONALITY CAN CHANGE IN FUTURE RELEASES

||||||||||   _____ ____ ___ ____  __  __ _____      _       _
||||||||||  | ____|  _ \_ _|___ \|  \/  | ____|    | | __ _| |__  ___
|||||       |  _| | |_) | |  __) | |\/| |  _| _____| |/ _` | '_ \/ __|
|||||       | |___|  __/| | / __/| |  | | |__|_____| | (_| | |_) \__ \
||||||||||  |_____|_|  |___|_____|_|  |_|_____|    |_|\__,_|_.__/|___/
||||||||||  wf-metagenomics v2.9.3-g6636bc9
--------------------------------------------------------------------------------
Core Nextflow options
  revision                   : master
  runName                    : tiny_woese
  containerEngine            : docker
  container                  : [withLabel:wfmetagenomics:ontresearch/wf-metagenomics:sha44a6dacff5f2001d917b774647bb4cbc1b53bc76, withLabel:wf_common:ontresearch/wf-common:sha645176f98b8780851f9c476a064d44c2ae76ddf6, withLabel:amr:ontresearch/abricate:sha2c763f19fac46035437854f1e2a5f05553542a78]
  launchDir                  : /home/ad.bcm.edu/hilaire/wf-meta
  workDir                    : /home/ad.bcm.edu/hilaire/wf-meta/work
  projectDir                 : /home/ad.bcm.edu/hilaire/.nextflow/assets/epi2me-labs/wf-metagenomics
  userName                   : hilaire@ad.bcm.edu
  profile                    : standard
  configFiles                : /home/ad.bcm.edu/hilaire/.nextflow/assets/epi2me-labs/wf-metagenomics/nextflow.config

Input Options
  fastq                      : seqid_fastqs

Sample Options
  sample_sheet               : 032824MC110414FAX26329.csv

Reference Options
  database                   : /mnt/scratch/k2_nt_20231129/
  database_sets              : [ncbi_16s_18s:[reference:https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s/ncbi_targeted_loci_16s_18s.fna, database:https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s/ncbi_targeted_loci_kraken2.tar.gz, ref2taxid:https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s/ref2taxid.targloci.tsv, taxonomy:https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump_archive/taxdmp_2023-01-01.zip], ncbi_16s_18s_28s_ITS:[reference:https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s_28s_ITS/ncbi_16s_18s_28s_ITS.fna, database:https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s_28s_ITS/ncbi_16s_18s_28s_ITS_kraken2.tar.gz, ref2taxid:https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s_28s_ITS/ref2taxid.ncbi_16s_18s_28s_ITS.tsv, taxonomy:https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump_archive/taxdmp_2023-01-01.zip], SILVA_138_1:[database:null], Standard-8:[database:https://genome-idx.s3.amazonaws.com/kraken/k2_standard_08gb_20231009.tar.gz, taxonomy:https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump_archive/new_taxdump_2023-03-01.zip], PlusPF-8:[database:https://genome-idx.s3.amazonaws.com/kraken/k2_pluspf_08gb_20230314.tar.gz, taxonomy:https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump_archive/new_taxdump_2023-03-01.zip], PlusPFP-8:[database:https://genome-idx.s3.amazonaws.com/kraken/k2_pluspfp_08gb_20230314.tar.gz, taxonomy:https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump_archive/new_taxdump_2023-03-01.zip]]

Kraken2 Options
  kraken2_memory_mapping     : true
  include_kraken2_assignments: true

Advanced Options
  threads                    : 24

Miscellaneous Options
  disable_ping               : true

!! Only displaying parameters that differ from the pipeline defaults !!
--------------------------------------------------------------------------------
If you use epi2me-labs/wf-metagenomics for your analysis please cite:

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x

--------------------------------------------------------------------------------
This is epi2me-labs/wf-metagenomics v2.9.3-g6636bc9.
--------------------------------------------------------------------------------
Checking inputs.
Note: Reference/Database are custom.
Note: Memory available to the workflow must be slightly higher than size of the database custom index.
Note: Or consider to use the --kraken2_memory_mapping.
Note: Memory available to the workflow must be slightly higher than size of the database Standard-8 index (8GB) or consider to use --kraken2_memory_mapping
Searching input for [.fastq, .fastq.gz, .fq, .fq.gz] files.
[-        ] process > validate_sample_sheet                           -
executor >  local (3)
executor >  local (4)
executor >  local (5)
executor >  local (19)
[65/99a592] process > validate_sample_sheet                           [100%] 1 of 1 ✔
[fe/c79f0f] process > fastcat (6)                                     [  7%] 1 of 14
[skipped  ] process > prepare_databases:download_unpack_taxonomy      [100%] 1 of 1, stored: 1 ✔
[skipped  ] process > prepare_databases:determine_bracken_length      [100%] 1 of 1, stored: 1 ✔
[7b/29b507] process > kraken_pipeline:run_common:getVersions          [100%] 1 of 1 ✔
[f8/40e426] process > kraken_pipeline:run_common:getParams            [100%] 1 of 1 ✔
[-        ] process > kraken_pipeline:run_kraken2                     [  0%] 0 of 1
executor >  local (19)
[64/49c7f2] process > fastcat (10)                                    [100%] 1 of 1
[15/9d045a] process > kraken_pipeline:run_kraken2 (1123SEQID066-N029) [100%] 1 of 1, failed: 1
[-        ] process > kraken_pipeline:run_bracken                     -
[-        ] process > kraken_pipeline:createAbundanceTables           -
[-        ] process > kraken_pipeline:makeReport                      -
[-        ] process > kraken_pipeline:output_kraken2_read_assignments -
[5f/950773] process > kraken_pipeline:output_results (2)              [100%] 2 of 2
Note: Empty files or those files whose reads have been discarded after filtering based on read length and/or read quality will not appear in the report and will be excluded from subsequent analysis.
Kraken2 pipeline.
Preparing databases.
Using default taxonomy database.
Checking custom kraken2 database exists
Using the bracken dist file within your custom database directory.
[skipping] Stored process > prepare_databases:download_unpack_taxonomy
[skipping] Stored process > prepare_databases:determine_bracken_length
ERROR ~ Consider to use --kraken2_memory_mapping to reduce the use of RAM memory.

 -- Check '.nextflow.log' file for details

ERROR ~ Error executing process > 'kraken_pipeline:run_kraken2 (1123SEQID066-N029)'

Caused by:
  Process requirement exceeds available memory -- req: 712.6 GB; avail: 502.8 GB

Command executed:

  kraken2 --db k2_nt_20231129 seqs.fastq.gz         --threads 24         --report "1123SEQID066-N029.kraken2.report.txt"         --confidence 0 --memory-mapping > "1123SEQID066-N029.kraken2.assignments.tsv"

Command exit status:
  -

Command output:
  (empty)

Work dir:
  /home/ad.bcm.edu/hilaire/wf-meta/work/15/9d045aad227403758ff86e23ffc723

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

 -- Check '.nextflow.log' file for details

Application activity log entry

No response

Were you able to successfully run the latest version of the workflow with the demo data?

yes

Other demo data information

I was able to run a smaller (55 GB) database successfully with the same data.
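
For anyone comparing a database against their host before launching, a rough pre-flight check might look like the sketch below. It assumes a Linux host (it reads /proc/meminfo) and that the database directory holds Kraken2's *.k2d index files; the function names are made up for illustration, and this is not how the workflow itself computes the memory it requests.

```shell
#!/usr/bin/env bash
# Pre-flight sketch (hypothetical helper names): compare the on-disk size of a
# Kraken2 database with the host's total memory.

# Total size in bytes of the Kraken2 index files (*.k2d) in a database directory.
k2_db_bytes() {
    local db_dir=$1 total=0 size f
    for f in "$db_dir"/*.k2d; do
        [ -f "$f" ] || continue      # skip the unexpanded pattern when nothing matches
        size=$(wc -c < "$f")
        total=$((total + size))
    done
    echo "$total"
}

# Total physical memory in bytes, read from /proc/meminfo (Linux only).
mem_total_bytes() {
    local key value rest
    while read -r key value rest; do
        if [ "$key" = "MemTotal:" ]; then
            echo $((value * 1024))   # /proc/meminfo reports the value in kB
            return 0
        fi
    done < /proc/meminfo
    return 1
}

# Exit status 0 when the first size (database) is smaller than the second (memory).
db_fits() {
    [ "$1" -lt "$2" ]
}
```

With these defined, something like `db_fits "$(k2_db_bytes /mnt/scratch/k2_nt_20231129)" "$(mem_total_bytes)" || echo "database is larger than host memory"` gives a quick indication of whether the index could be loaded into RAM at all, or whether the run depends on memory mapping.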

nggvs commented 7 months ago

Hi @peradastra, thank you for using the workflow! This should have been fixed in release 2.9.4. Please let me know if that works for you.
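
For anyone following along, updating the cached workflow and re-running could look roughly like the sketch below. The parameters are the ones shown in the log above; the tag name v2.9.4 is an assumption based on the release number (check the repository's releases for the exact tag), and the `run` wrapper is only there so the script prints the commands by default instead of executing them.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the commands unless DRY_RUN=0 is set.
# RELEASE is an assumed tag name, not confirmed by this thread.
RELEASE="${RELEASE:-v2.9.4}"

# Execute the command when DRY_RUN=0, otherwise just print it.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

rerun_with_release() {
    # Refresh the cached copy of the workflow kept under ~/.nextflow/assets.
    run nextflow pull epi2me-labs/wf-metagenomics
    # Re-run with the parameters from the report, pinned to the chosen release.
    run nextflow run epi2me-labs/wf-metagenomics -r "$RELEASE" \
        --fastq seqid_fastqs \
        --sample_sheet 032824MC110414FAX26329.csv \
        --database /mnt/scratch/k2_nt_20231129/ \
        --kraken2_memory_mapping \
        --include_kraken2_assignments \
        --threads 24
}

rerun_with_release
```

Run as-is, this only prints the two commands; setting DRY_RUN=0 executes them. Pinning with -r matters here because the original run used the cached master revision (6636bc9044), which stays in place until the workflow is pulled again.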

nggvs commented 6 months ago

Hi @peradastra, were you able to run the workflow successfully with the latest version?

nggvs commented 6 months ago

Hi @peradastra, I'm closing the issue due to no response. I hope you have been able to run the workflow successfully. If not, please feel free to open a new issue!