Closed fwa93 closed 1 year ago
Did you see the warning message 'Input directory assumed to be containing one or more directories containing fastq files'? Maybe try /data/fwa_test2/ as the input fastq dir.
Thank you for the help @sarahjeeeze! It worked very well. But, what do you do if you have several directories containing fastq.gz files inside e.g., /data/fwa_test2/ but you only want to analyse one of the directories?
Hi there, I'm also encountering the same problem from time to time. I use a custom database, and sometimes the workflow runs while other times it stalls for hours at this same point.
I use the following command:
nextflow run epi2me-labs/wf-metagenomics --fastq test/fastq_pass --kraken2 --database /dbs/VIRUS/ --threads 20 --outdir /wf-metagenomics/test
The fastq_pass directory contains more folders (barcode01, barcode02, barcode03) with several fastq files in them.
test/fastq_pass/
├── barcode01
│ ├── FAV06017_pass_barcode01_c79548f3_ce9f168a_0.fastq.gz
│ ├── FAV06017_pass_barcode01_c79548f3_ce9f168a_1.fastq.gz
│ ├── FAV06017_pass_barcode01_c79548f3_ce9f168a_10.fastq.gz
│ ├── FAV06017_pass_barcode01_c79548f3_ce9f168a_11.fastq.gz
│ ├── FAV06017_pass_barcode01_c79548f3_ce9f168a_12.fastq.gz
│ ├── FAV06017_pass_barcode01_c79548f3_ce9f168a_13.fastq.gz
│ ├── FAV06017_pass_barcode01_c79548f3_ce9f168a_2.fastq.gz
│ ├── FAV06017_pass_barcode01_c79548f3_ce9f168a_3.fastq.gz
│ ├── FAV06017_pass_barcode01_c79548f3_ce9f168a_4.fastq.gz
│ ├── FAV06017_pass_barcode01_c79548f3_ce9f168a_5.fastq.gz
│ ├── FAV06017_pass_barcode01_c79548f3_ce9f168a_6.fastq.gz
│ ├── FAV06017_pass_barcode01_c79548f3_ce9f168a_7.fastq.gz
│ ├── FAV06017_pass_barcode01_c79548f3_ce9f168a_8.fastq.gz
│ └── FAV06017_pass_barcode01_c79548f3_ce9f168a_9.fastq.gz
├── barcode02
│ ├── FAV06017_pass_barcode02_c79548f3_ce9f168a_0.fastq.gz
│ ├── FAV06017_pass_barcode02_c79548f3_ce9f168a_1.fastq.gz
│ ├── FAV06017_pass_barcode02_c79548f3_ce9f168a_10.fastq.gz
│ ├── FAV06017_pass_barcode02_c79548f3_ce9f168a_11.fastq.gz
│ ├── FAV06017_pass_barcode02_c79548f3_ce9f168a_12.fastq.gz
│ ├── FAV06017_pass_barcode02_c79548f3_ce9f168a_13.fastq.gz
│ ├── FAV06017_pass_barcode02_c79548f3_ce9f168a_14.fastq.gz
│ ├── FAV06017_pass_barcode02_c79548f3_ce9f168a_15.fastq.gz
│ ├── FAV06017_pass_barcode02_c79548f3_ce9f168a_16.fastq.gz
│ ├── FAV06017_pass_barcode02_c79548f3_ce9f168a_17.fastq.gz
│ ├── FAV06017_pass_barcode02_c79548f3_ce9f168a_18.fastq.gz
│ ├── FAV06017_pass_barcode02_c79548f3_ce9f168a_19.fastq.gz
│ ├── FAV06017_pass_barcode02_c79548f3_ce9f168a_2.fastq.gz
│ ├── FAV06017_pass_barcode02_c79548f3_ce9f168a_20.fastq.gz
│ ├── FAV06017_pass_barcode02_c79548f3_ce9f168a_21.fastq.gz
│ ├── FAV06017_pass_barcode02_c79548f3_ce9f168a_3.fastq.gz
│ ├── FAV06017_pass_barcode02_c79548f3_ce9f168a_4.fastq.gz
│ ├── FAV06017_pass_barcode02_c79548f3_ce9f168a_5.fastq.gz
│ ├── FAV06017_pass_barcode02_c79548f3_ce9f168a_6.fastq.gz
│ ├── FAV06017_pass_barcode02_c79548f3_ce9f168a_7.fastq.gz
│ ├── FAV06017_pass_barcode02_c79548f3_ce9f168a_8.fastq.gz
│ └── FAV06017_pass_barcode02_c79548f3_ce9f168a_9.fastq.gz
└── barcode03
  ├── FAV06017_pass_barcode03_c79548f3_ce9f168a_0.fastq.gz
  ├── FAV06017_pass_barcode03_c79548f3_ce9f168a_1.fastq.gz
  ├── FAV06017_pass_barcode03_c79548f3_ce9f168a_10.fastq.gz
  ├── FAV06017_pass_barcode03_c79548f3_ce9f168a_11.fastq.gz
  ├── FAV06017_pass_barcode03_c79548f3_ce9f168a_12.fastq.gz
  ├── FAV06017_pass_barcode03_c79548f3_ce9f168a_13.fastq.gz
  ├── FAV06017_pass_barcode03_c79548f3_ce9f168a_14.fastq.gz
  ├── FAV06017_pass_barcode03_c79548f3_ce9f168a_15.fastq.gz
  ├── FAV06017_pass_barcode03_c79548f3_ce9f168a_16.fastq.gz
  ├── FAV06017_pass_barcode03_c79548f3_ce9f168a_17.fastq.gz
  ├── FAV06017_pass_barcode03_c79548f3_ce9f168a_2.fastq.gz
  ├── FAV06017_pass_barcode03_c79548f3_ce9f168a_3.fastq.gz
  ├── FAV06017_pass_barcode03_c79548f3_ce9f168a_4.fastq.gz
  ├── FAV06017_pass_barcode03_c79548f3_ce9f168a_5.fastq.gz
  ├── FAV06017_pass_barcode03_c79548f3_ce9f168a_6.fastq.gz
  ├── FAV06017_pass_barcode03_c79548f3_ce9f168a_7.fastq.gz
  ├── FAV06017_pass_barcode03_c79548f3_ce9f168a_8.fastq.gz
  └── FAV06017_pass_barcode03_c79548f3_ce9f168a_9.fastq.gz
It won't continue after this point:
--------------------------------------------------------------------------------
This is epi2me-labs/wf-metagenomics v2.0.8-gb19c50e.
--------------------------------------------------------------------------------
Checking inputs.
Checking custom kraken2 database exists
executor > local (24)
[skipped ] process > kraken_pipeline:unpackTaxonomy [100%] 1 of 1, stored: 1 ✔
[skipped ] process > kraken_pipeline:unpackDatabase [100%] 1 of 1, stored: 1 ✔
[15/e0401a] process > kraken_pipeline:determine_bracken_length [100%] 1 of 1 ✔
[- ] process > kraken_pipeline:kraken_server [ 0%] 0 of 1
[54/ea9e08] process > kraken_pipeline:kraken2_client (7) [ 0%] 0 of 54
[- ] process > kraken_pipeline:progressive_stats -
[- ] process > kraken_pipeline:progressive_kraken_reports -
[- ] process > kraken_pipeline:progressive_bracken -
[5d/ad1a3e] process > kraken_pipeline:getVersions [100%] 1 of 1 ✔
[5f/ad28ce] process > kraken_pipeline:getParams [100%] 1 of 1 ✔
[- ] process > kraken_pipeline:makeReport -
[e1/94b369] process > kraken_pipeline:output (2) [100%] 2 of 2
[- ] process > kraken_pipeline:stop_kraken_server -
Input directory assumed to be containing one or more directories containing fastq files.
[skipping] Stored process > kraken_pipeline:unpackDatabase
[skipping] Stored process > kraken_pipeline:unpackTaxonomy
Any suggestions?
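A side note on the layout in the comment above: the three barcode directories hold 14, 22 and 18 files, which adds up to the 54 kraken2_client tasks shown in the log, so the input itself appears to have been picked up. A quick way to check this kind of layout before a run is to count the fastq.gz files per subdirectory. The following is only a sketch; the function name count_fastq_per_dir is made up for illustration and is not part of the workflow.

```shell
#!/bin/sh
# Sketch: count .fastq.gz files in each immediate subdirectory of an input
# directory. count_fastq_per_dir is a hypothetical helper, not a workflow tool.
count_fastq_per_dir() {
    for d in "$1"/*/; do
        # Skip the unexpanded pattern when there are no subdirectories.
        [ -d "$d" ] || continue
        n=$(find "$d" -maxdepth 1 -type f -name '*.fastq.gz' | wc -l | tr -d ' ')
        printf '%s\t%s\n' "$(basename "$d")" "$n"
    done
}

# Usage (path taken from the command in this thread):
#   count_fastq_per_dir test/fastq_pass
```

Each output line is a directory name and its file count, separated by a tab; a directory with a count of 0 would be worth checking before launching the workflow.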
Hi! Checking in to see if you found a solution yet. I am getting the same problem. It keeps running but doesn't progress from "unpackTaxonomy." I'd be grateful for any insight you may have.
Hi, how big is your database directory? If it is larger than 8 GB you will need to update the configuration. You can do this by passing a config file (e.g. large_mem.config) with the -c option and adding:
executor {
    $local {
        cpus = 8
        memory = "8 GB" // update this to match, or ideally be slightly above, the size of your database
    }
}
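As a rough illustration of the advice above, the memory value can be derived from the size of the database directory instead of being typed by hand. This is a minimal sketch under stated assumptions: db_size_gb and make_config are hypothetical helper names, the default DB_DIR is simply the path used earlier in this thread, and the 2 GB of headroom is an arbitrary choice, not a documented requirement.

```shell
#!/bin/sh
# Sketch: write a local-executor config sized from a Kraken2 database directory.
# Helper names, DB_DIR default and the 2 GB headroom are illustrative assumptions.

# Print a directory's size in whole GB, rounded up (du -k reports 1024-byte units).
db_size_gb() {
    kb=$(du -sk "$1" | awk '{print $1}')
    echo $(( (kb + 1048575) / 1048576 ))
}

# Print an executor block with the given CPU count ($1) and memory in GB ($2).
make_config() {
    cat <<EOF
executor {
    \$local {
        cpus = $1
        memory = "$2 GB"
    }
}
EOF
}

# Only write the file when the database directory actually exists.
DB_DIR=${DB_DIR:-/dbs/VIRUS/}
if [ -d "$DB_DIR" ]; then
    make_config 8 $(( $(db_size_gb "$DB_DIR") + 2 )) > large_mem.config
fi
```

The resulting file would then be passed with -c large_mem.config on the nextflow command line, as described above.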
Thanks for the reply and suggestion Sarah. I am attempting to use 2 different databases separately. I am using PlusPF8 and the viral database.
I upped the CPUs and memory to 10.
When I use PlusPF8, the workflow just runs, I get no output, and the log is:
WARN: Access to undefined parameter kraken2bracken
-- Initialise it to a default value eg. params.kraken2bracken = some_value
Input directory assumed to be containing one or more directories containing fastq files.
[f4/087b3f] Submitted process > kraken_pipeline:getVersions
[12/8792fc] Submitted process > kraken_pipeline:getParams
[e1/c309ae] Submitted process > kraken_pipeline:output (1)
Staging foreign file: https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz
Staging foreign file: https://genome-idx.s3.amazonaws.com/kraken/k2_pluspf_8gb_20210517.tar.gz
[99/2b63ed] Submitted process > kraken_pipeline:output (2)
When I use the viral database, it says the run completed, but there is no actual output other than the timeline and the nextflow report. The log is:
Checking custom kraken2 database exists
Input directory assumed to be containing one or more directories containing fastq files.
[6f/a2d01e] Submitted process > kraken_pipeline:getParams
[07/d10465] Submitted process > kraken_pipeline:getVersions
[e1/c4478e] Submitted process > kraken_pipeline:output (1)
Staging foreign file: https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz
[22/a530e1] Submitted process > kraken_pipeline:output (2)
Staging foreign file: https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s/database1000mers.kmer_distrib
[d2/1e63ec] Submitted process > kraken_pipeline:unpackDatabase
[bb/588c44] Submitted process > kraken_pipeline:kraken_server
[64/4a5713] Submitted process > kraken_pipeline:determine_bracken_length
[bb/588c44] NOTE: Process kraken_pipeline:kraken_server terminated with an error exit status (65) -- Error is ignored
[32/f6728b] Submitted process > kraken_pipeline:unpackTaxonomy
@sarahjeeeze I eventually ended up playing around with those settings and that seems to work. My database was about 67 GB so I upped the memory to about double that and it worked. I also increased the CPUs.
@eparisis
Hi, what did you increase the CPUs to? I'm trying to run the PlusPF-8 Database (8GB), should I increase the CPUs to 8 and the memory to 16?
To run the PlusPF-8 DB, try 10 GB of memory first to see if the analysis starts; if not, increase it further. I just doubled it to be sure, but I haven't tested whether allocating that much memory makes it any faster. For CPUs, just use a few cores fewer than the maximum on your machine. I have 16, so I use about 12. The database I ran was PlusPF, which is about 67 GB.
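The rule of thumb for CPUs can be written down as a small calculation. This is just a sketch: pick_threads is a hypothetical helper, and the default reserve of 4 cores only mirrors the "16 cores, use about 12" example above; it is not a workflow recommendation.

```shell
#!/bin/sh
# Sketch: choose a thread count that leaves a few cores free for the system.
# pick_threads and the default reserve of 4 are illustrative assumptions.
pick_threads() {
    total=$1
    reserve=${2:-4}
    threads=$(( total - reserve ))
    # Never go below one thread on small machines.
    if [ "$threads" -lt 1 ]; then
        threads=1
    fi
    echo "$threads"
}

# Number of online cores as reported by the system.
total_cores=$(getconf _NPROCESSORS_ONLN)
echo "Suggested --threads value: $(pick_threads "$total_cores")"
```

With 16 cores this suggests 12, which could then be passed as --threads 12 and used for cpus in the executor config.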
Hi, Thank you for using the workflow.
We have included in the latest release (2.3.0) a flag, --kraken2_memory_mapping, which acts as the --memory-mapping option for kraken2. Kraken 2 will by default load the database into process-local RAM; this flag will avoid doing so.
If the problem has not already been solved, please let us know; otherwise we'll close this ticket on the assumption that things are now resolved.
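For anyone landing here later, a command using the new flag might look like the sketch below. Only --kraken2_memory_mapping comes from the note above; the input and database paths are the ones used earlier in this thread, mm_command is a hypothetical helper, and any other options you already pass (their names may differ between releases) would need to be added back.

```shell
#!/bin/sh
# Sketch: print a wf-metagenomics command line with memory mapping enabled.
# mm_command is a hypothetical helper; paths are placeholders from this thread.
mm_command() {
    fastq_dir=$1
    db_dir=$2
    printf 'nextflow run epi2me-labs/wf-metagenomics --fastq %s --database %s --kraken2_memory_mapping\n' \
        "$fastq_dir" "$db_dir"
}

# Print the command for review before running it by hand.
mm_command test/fastq_pass /dbs/VIRUS/
```

Printing the command first, instead of running it directly, makes it easy to check the options against the README of the release you have installed.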
What happened?
wf-metagenomics stalls at process kraken_pipeline:kraken_server. I use wf-metagenomics v2.0.1 and nextflow 22.04.1. I do not get issues when running wf-metagenomics v1.1.4.
Command
nextflow run epi2me-labs/wf-metagenomics --fastq /data/fwa_test2/fastq_mock/ --kraken2 --threads 8
Operating System
ubuntu 20.04
Workflow Execution
Command line
Workflow Execution - EPI2ME Labs Versions
No response
Workflow Execution - Execution Profile
Docker
Workflow Version
v2.0.1
Relevant log output