Closed jagos01 closed 1 year ago
Hi, It's not ideal, but currently the workflow expects an input directory, so the command should look like:
nextflow run epi2me-labs/wf-metagenomics --fastq /home/data1/Analyzed_data/metactrrun2/guppy_6.1.7 --kraken2 --threads 20
We will amend this shortly in the next release.
Thanks. This worked for the included test data with the default database but failed with the following error when I tried to use a local K2 database:
command: nextflow run epi2me-labs/wf-metagenomics --fastq '/home/scott/.nextflow/assets/epi2me-labs/wf-metagenomics/test_data' --kraken2 --database /home/scott/Data_DB/Kraken2_db/k2_pluspfp_20220607
[af/3f34f9] NOTE: Process kraken_pipeline:kraken2_client (1)
terminated with an error exit status (74) -- Execution is retried (1)
[a0/4bd77d] NOTE: Process kraken_pipeline:kraken2_client (2)
terminated with an error exit status (74) -- Execution is retried (1)
[63/e1b4c2] NOTE: Process kraken_pipeline:kraken2_client (3)
terminated with an error exit status (74) -- Execution is retried (1)
[b7/bead2c] NOTE: Process kraken_pipeline:kraken2_client (4)
terminated with an error exit status (74) -- Execution is retried (1)
Error executing process > 'kraken_pipeline:kraken2_client (1)'
Caused by:
Process kraken_pipeline:kraken2_client (1)
terminated with an error exit status (74)
Command executed:
kraken2_client --port 8080 --sequence "barcode01.2.fastq.gz" > "barcode01.kraken2.assignments.tsv"
kraken2_client --port 8080 --report --sequence "barcode01.2.fastq.gz" > "out.txt"
tail -n +2 "out.txt" > "tmp.txt"
head -n -6 "tmp.txt" > "barcode01.kraken2_report.txt"
Command exit status: 74
Command output: (empty)
Command error:
Connecting to server: localhost:8080.
Extracting sequences from file: barcode01.2.fastq.gz
Sequences extracted successfully.
Uploading sequences...
Sequences uploaded.
Awaiting classification results...
Sequence Stream RPC failed: failed to connect to all addresses
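The "failed to connect to all addresses" message suggests that nothing was accepting connections on localhost:8080 when the client asked for results. A minimal sketch for checking this by hand is below. The function name `port_is_open` is hypothetical and not part of the workflow, and the check only tells you whether a TCP connection can be opened, not whether the Kraken2 server behind it is healthy. If the workflow runs inside Docker, the port may not be visible from the host in the same way.

```python
import socket


def port_is_open(host="localhost", port=8080, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing is listening or the host cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Port 8080 is the one shown in the kraken2_client command above.
    print("server reachable:", port_is_open("localhost", 8080))
```

Running this while the kraken_server process is active would show whether the client has anything to connect to.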
I encountered a different error when I ran a set of my sequences using the default k2 database.
command: nextflow run epi2me-labs/wf-metagenomics --fastq /home/scott/Desktop/test_seqs --kraken2
Error executing process > 'kraken_pipeline:bracken (1)'
Caused by:
Process kraken_pipeline:bracken (1)
terminated with an error exit status (1)
Command executed:
run_bracken.py "database_dir" "reports.1/unclassified.kreport.txt" "1000" "S" "unclassified.bracken_report.txt"
mv "reports.1/unclassified.kreport_bracken_species.txt" .
awk '{ print $3,$7}' "unclassified.bracken_report.txt" | awk 'NR!=1 {print}' > taxacounts.txt
awk '{print $3}' "unclassified.bracken_report.txt" | awk 'NR!=1 {print}' > taxa.txt
taxonkit --data-dir taxonomy_dir lineage -R taxa.txt > lineages.txt
aggregate_lineages_bracken.py -i "lineages.txt" -b "taxacounts.txt" -p "unclassified.kraken2"
file1=cat *.json
echo "{"'"unclassified"'": "$file1"}" >> "unclassified.1.json"
cp "unclassified.1.json" "reports.1/unclassified.json"
Command exit status: 1
Command output: b' >> Checking for Valid Options...\n >> Running Bracken \n >> python src/est_abundance.py -i reports.1/unclassified.kreport.txt -o unclassified.bracken_report.txt -k database_dir/database1000mers.kmer_distrib -l S -t 10\nPROGRAM START TIME: 10-20-2022 16:20:49\n'
Command error: b' >> Checking for Valid Options...\n >> Running Bracken \n >> python src/est_abundance.py -i reports.1/unclassified.kreport.txt -o unclassified.bracken_report.txt -k database_dir/database1000mers.kmer_distrib -l S -t 10\nPROGRAM START TIME: 10-20-2022 16:20:49\n'b'>> Checking report file: reports.1/unclassified.kreport.txt\nError: no reads found. Please check your Kraken report\n'mv: cannot stat 'reports.1/unclassified.kreport_bracken_species.txt': No such file or directory
Any help with these errors is appreciated. Thanks
Hi, Thanks for finding these bugs. I am just trying to recreate your errors so I can fix the problem. For the first one, how big was the k2_pluspfp database directory, and does it contain all the required kraken2 files (hash, opts, taxo) plus the database1000mers file? Is it one from here? https://benlangmead.github.io/aws-indexes/k2. If so I can test it. For the second one, could you try changing the parameter --bracken_level to 'G' or 'F' and see if you get any results?
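The database check asked for above can be sketched as a small script. This is an illustration only: `missing_db_files` is a hypothetical helper, the three `.k2d` names are the standard Kraken2 index files, and the Bracken file name `database1000mers.kmer_distrib` is taken from the logs in this thread (the 1000 is the read length passed to run_bracken.py).

```python
from pathlib import Path

# Standard Kraken2 index files (hash, opts, taxo).
REQUIRED_K2_FILES = ("hash.k2d", "opts.k2d", "taxo.k2d")


def missing_db_files(db_dir, read_length=1000):
    """Return the expected database files that are absent from db_dir."""
    db = Path(db_dir)
    # Bracken needs a k-mer distribution file matching the read length.
    expected = list(REQUIRED_K2_FILES) + [f"database{read_length}mers.kmer_distrib"]
    return [name for name in expected if not (db / name).is_file()]


if __name__ == "__main__":
    # Path copied from the command earlier in this thread.
    db_path = "/home/scott/Data_DB/Kraken2_db/k2_pluspfp_20220607"
    print("missing files:", missing_db_files(db_path))
```

An empty list means all four files were found; it does not prove the files are complete or uncorrupted.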
Hello Sarah, Yes the database was from the site you mentioned. It was the full k2_pluspfp database so I believe 129 GB. I will try changing the bracken_level shortly. Thanks, Scott
Changing the bracken_level to Family or Genus failed to complete and produced the following errors:
bracken_level = 'G'
Error executing process > 'kraken_pipeline:bracken (1)'
Caused by:
Process kraken_pipeline:bracken (1)
terminated with an error exit status (1)
Command executed:
run_bracken.py "database_dir" "reports.1/barcode15.kreport.txt" "1000" "G" "barcode15.bracken_report.txt"
mv "reports.1/barcode15.kreport_bracken_species.txt" .
awk '{ print $3,$7}' "barcode15.bracken_report.txt" | awk 'NR!=1 {print}' > taxacounts.txt
awk '{print $3}' "barcode15.bracken_report.txt" | awk 'NR!=1 {print}' > taxa.txt
taxonkit --data-dir taxonomy_dir lineage -R taxa.txt > lineages.txt
aggregate_lineages_bracken.py -i "lineages.txt" -b "taxacounts.txt" -p "barcode15.kraken2"
file1=cat *.json
echo "{"'"barcode15"'": "$file1"}" >> "barcode15.1.json"
cp "barcode15.1.json" "reports.1/barcode15.json"
Command exit status: 1
Command output: b' >> Checking for Valid Options...\n >> Running Bracken \n >> python src/est_abundance.py -i reports.1/barcode15.kreport.txt -o barcode15.bracken_report.txt -k database_dir/database1000mers.kmer_distrib -l G -t 10\nPROGRAM START TIME: 10-21-2022 20:24:51\n'
Command error: b'>> Checking report file: reports.1/barcode15.kreport.txt\nError: no reads found. Please check your Kraken report\n'b' >> Checking for Valid Options...\n >> Running Bracken \n >> python src/est_abundance.py -i reports.1/barcode15.kreport.txt -o barcode15.bracken_report.txt -k database_dir/database1000mers.kmer_distrib -l G -t 10\nPROGRAM START TIME: 10-21-2022 20:24:51\n'mv: cannot stat 'reports.1/barcode15.kreport_bracken_species.txt': No such file or directory
bracken_level = 'F'
Error executing process > 'kraken_pipeline:bracken (1)'
Caused by:
Process kraken_pipeline:bracken (1)
terminated with an error exit status (1)
Command executed:
run_bracken.py "database_dir" "reports.1/barcode15.kreport.txt" "1000" "F" "barcode15.bracken_report.txt"
mv "reports.1/barcode15.kreport_bracken_species.txt" .
awk '{ print $3,$7}' "barcode15.bracken_report.txt" | awk 'NR!=1 {print}' > taxacounts.txt
awk '{print $3}' "barcode15.bracken_report.txt" | awk 'NR!=1 {print}' > taxa.txt
taxonkit --data-dir taxonomy_dir lineage -R taxa.txt > lineages.txt
aggregate_lineages_bracken.py -i "lineages.txt" -b "taxacounts.txt" -p "barcode15.kraken2"
file1=cat *.json
echo "{"'"barcode15"'": "$file1"}" >> "barcode15.1.json"
cp "barcode15.1.json" "reports.1/barcode15.json"
Command exit status: 1
Command output: b' >> Checking for Valid Options...\n >> Running Bracken \n >> python src/est_abundance.py -i reports.1/barcode15.kreport.txt -o barcode15.bracken_report.txt -k database_dir/database1000mers.kmer_distrib -l F -t 10\nPROGRAM START TIME: 10-21-2022 20:23:16\n'
Command error: b' >> Checking for Valid Options...\n >> Running Bracken \n >> python src/est_abundance.py -i reports.1/barcode15.kreport.txt -o barcode15.bracken_report.txt -k database_dir/database1000mers.kmer_distrib -l F -t 10\nPROGRAM START TIME: 10-21-2022 20:23:16\n'b'>> Checking report file: reports.1/barcode15.kreport.txt\nError: no reads found. Please check your Kraken report\n'mv: cannot stat 'reports.1/barcode15.kreport_bracken_species.txt': No such file or directory
Thanks
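One possible cause of Bracken's "no reads found" error is a Kraken report with no reads assigned at the requested level. A rough way to inspect a report is sketched below. It assumes the standard six-column, tab-separated Kraken2 report (percentage, clade reads, direct reads, rank code, taxid, name); `reads_at_rank` is a hypothetical helper and does not reproduce Bracken's own check.

```python
def reads_at_rank(report_text, rank_code="S"):
    """Sum the clade read counts of all report rows with the given rank code."""
    total = 0
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        # Column 4 is the rank code (U, R, D, P, C, O, F, G, S, ...),
        # column 2 is the number of reads in the clade rooted at this taxon.
        if fields[3].strip() == rank_code:
            total += int(fields[1])
    return total


if __name__ == "__main__":
    # Report path copied from the failing command in this thread.
    with open("reports.1/barcode15.kreport.txt") as handle:
        text = handle.read()
    for code in ("S", "G", "F"):
        print(code, reads_at_rank(text, code))
```

If all three counts are zero, the report probably contains only unclassified reads, and changing --bracken_level would not be expected to help.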
Hi, Thank you for using the workflow. Could you confirm if this issue has been solved? We'll close this ticket on the assumption things are now resolved.
Hello, The workflow was unable to finish with this data set. I instead used Kraken2 to analyze this data. Thanks
Hi, Could you send the error that you're observing, along with the parameters and the versions of the workflow and EPI2ME Labs (if you are using it)? From your previous report I see that you are using the k2_pluspfp database. How much RAM do you have available to run the workflow? It must be slightly larger than the size of the database.
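The memory requirement described above can be checked roughly before launching the workflow. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the 5% headroom is an arbitrary stand-in for "slightly larger", and `total_ram_bytes` relies on `os.sysconf` keys that are available on Linux.

```python
import os
from pathlib import Path


def dir_size_bytes(path):
    """Total size of all regular files under path, in bytes."""
    return sum(p.stat().st_size for p in Path(path).rglob("*") if p.is_file())


def total_ram_bytes():
    """Physical memory of this machine in bytes (Linux-specific sysconf keys)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def has_enough_ram(db_bytes, ram_bytes, headroom=1.05):
    """True if RAM exceeds the database size by the assumed headroom factor."""
    return ram_bytes >= db_bytes * headroom


if __name__ == "__main__":
    # Database path copied from the command earlier in this thread.
    db = dir_size_bytes("/home/scott/Data_DB/Kraken2_db/k2_pluspfp_20220607")
    print("enough RAM:", has_enough_ram(db, total_ram_bytes()))
```

With the figures mentioned in this thread, `has_enough_ram(129 * 10**9, 192 * 10**9)` is true under the assumed headroom.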
The workflow was given 192GB RAM. I ran several datasets today with the latest version (2.2.1) and all completed without issues. Thanks,
What happened?
System - Ubuntu 18.04 with Nextflow 22.10.0 and wf-metagenomics v2.0.0 (b2bd2b89da)
Command - nextflow run epi2me-labs/wf-metagenomics --fastq /home/data1/Analyzed_data/metactrrun2/guppy_6.1.7/demux_trim/BC96_ZymoStd.fastq.gz --kraken2 --threads 20
Output - hangs on process > kraken_pipeline:kraken_server (left running for 24hrs) and kraken2_client never starts.
Checking inputs.
executor > local (7)
[d8/3d42fa] process > kraken_pipeline:unpackTaxonomy [100%] 1 of 1 ✔
[2f/ca0892] process > kraken_pipeline:unpackDatabase [100%] 1 of 1 ✔
[b8/759512] process > kraken_pipeline:kraken_server [ 0%] 0 of 1
[- ] process > kraken_pipeline:combineFilterFastq -
[- ] process > kraken_pipeline:progressiveStats -
[- ] process > kraken_pipeline:kraken2_client -
[- ] process > kraken_pipeline:progressive_kreports -
[- ] process > kraken_pipeline:taxon_kit -
[- ] process > kraken_pipeline:bracken -
[fd/a60b1a] process > kraken_pipeline:getVersions [100%] 1 of 1 ✔
[38/ed71ac] process > kraken_pipeline:getParams [100%] 1 of 1 ✔
[- ] process > kraken_pipeline:makeReport -
[- ] process > kraken_pipeline:mergeclassifiedProgressive -
[- ] process > kraken_pipeline:mergeunclassifiedProgressive -
[- ] process > kraken_pipeline:catAssignmentsprogressive -
[- ] process > kraken_pipeline:stop_kraken_server -
[- ] process > kraken_pipeline:output -
[- ] process > kraken_pipeline:output_dir -
[2f/14c7e9] process > output (1) [100%] 2 of 2 ✔
I have also tried this workflow on a different system running Ubuntu 20.04. The workflow failed regardless of whether it was executed with Docker or Conda.
Operating System
ubuntu 18.04
Workflow Execution
Command line
Workflow Execution - EPI2ME Labs Versions
No response
Workflow Execution - Execution Profile
Docker
Workflow Version
b2bd2b89da
Relevant log output
I have also tried EPI2ME Labs (v3.15) with Labs environment v1.2.5. The workflow was terminated due to the following error:
Checking epi2me-labs/wf-metagenomics ...
epi2me-labs/wf-metagenomics contains uncommitted changes -- cannot pull from repository
N E X T F L O W ~ version 22.04.0
Project
epi2me-labs/wf-metagenomics
contains uncommitted changes -- Cannot switch to revision: v1.1.4