epi2me-labs / wf-metagenomics

Metagenomic classification of long-read sequencing data

Error running real time pipeline #116

Closed: sarah-buddle closed this issue 2 weeks ago

sarah-buddle commented 3 months ago

Operating System

CentOS 7

Other Linux

No response

Workflow Version

v2.10.1-g4c2c583

Workflow Execution

Command line (Cluster)

Other workflow execution

No response

EPI2ME Version

No response

CLI command run

nextflow run epi2me-labs/wf-metagenomics -profile singularity -w ${results}/work \
    --fastq $demux_outdir/barcode21_LIB00220 \
    --classifier kraken2 \
    --exclude_host $human_genome \
    --database $kraken_db_pluspf8 \
    --taxonomy $taxdump \
    --ref2taxid $ref2taxid \
    --out_dir ${results}/epi2me2 \
    --real_time true

Workflow Execution - CLI Execution Profile

singularity

What happened?

I am trying to run the real-time analysis pipeline on our HPC. I can run the exact command shown above without the --real_time parameter and it completes without errors, but when I add the --real_time parameter (and use a fresh output directory), the pipeline fails (see log). I can't quite tell from the documentation whether I need to set any of the other real-time parameters or whether this is a separate issue.

Relevant log output

Core Nextflow options
  revision       : master
  runName        : hungry_poitras
  containerEngine: singularity
  launchDir      : /SAN/breuerlab/BTRU-scratch/sarah/results/ont
  workDir        : /SAN/breuerlab/BTRU-scratch/sarah/results/ont/work
  projectDir     : /home/sbuddle/.nextflow/assets/epi2me-labs/wf-metagenomics
  userName       : sbuddle
  profile        : singularity
  configFiles    : /home/sbuddle/.nextflow/assets/epi2me-labs/wf-metagenomics/nextflow.config

Input Options
  fastq          : /SAN/breuerlab/BTRU-scratch/sarah/results/ont/demultiplex/realtime_test/barcode21_LIB00220
  exclude_host   : /SAN/breuerlab/BTRU-scratch/sarah/data/human_genome/gencode_GRCh38.p14/GRCh38.p14.genome.mmi

Real Time Analysis Options
  real_time      : true

Reference Options
  database       : /SAN/breuerlab/BTRU-scratch2/sarah/software/pluspf-8_140323
  taxonomy       : /SAN/breuerlab/BTRU-scratch2/sarah/software/taxdump/new_taxdump_2023-08-01
  ref2taxid      : /SAN/breuerlab/BTRU-scratch2/sarah/software/accession2taxid/2023-07-31/nucl_gb.accession2taxid
  database_sets  : [ncbi_16s_18s:[reference:https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s/ncbi_targeted_loci_16s_18s.fna, database:https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s/ncbi_targeted_loci_kraken2.tar.gz, ref2taxid:https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s/ref2taxid.targloci.tsv, taxonomy:https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump_archive/taxdmp_2023-01-01.zip], ncbi_16s_18s_28s_ITS:[reference:https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s_28s_ITS/ncbi_16s_18s_28s_ITS.fna, database:https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s_28s_ITS/ncbi_16s_18s_28s_ITS_kraken2.tar.gz, ref2taxid:https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-metagenomics/ncbi_16s_18s_28s_ITS/ref2taxid.ncbi_16s_18s_28s_ITS.tsv, taxonomy:https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump_archive/taxdmp_2023-01-01.zip], SILVA_138_1:[database:null], Standard-8:[database:https://genome-idx.s3.amazonaws.com/kraken/k2_standard_08gb_20231009.tar.gz, taxonomy:https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump_archive/new_taxdump_2023-03-01.zip], PlusPF-8:[database:https://genome-idx.s3.amazonaws.com/kraken/k2_pluspf_08gb_20230314.tar.gz, taxonomy:https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump_archive/new_taxdump_2023-03-01.zip], PlusPFP-8:[database:https://genome-idx.s3.amazonaws.com/kraken/k2_pluspfp_08gb_20230314.tar.gz, taxonomy:https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump_archive/new_taxdump_2023-03-01.zip]]     

Output Options
  out_dir        : /SAN/breuerlab/BTRU-scratch/sarah/results/ont/epi2me2

!! Only displaying parameters that differ from the pipeline defaults !!
--------------------------------------------------------------------------------
If you use epi2me-labs/wf-metagenomics for your analysis please cite:

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x

--------------------------------------------------------------------------------
This is epi2me-labs/wf-metagenomics v2.10.1-g4c2c583.
--------------------------------------------------------------------------------
Checking inputs.
Note: Reference/Database are custom.
Note: Memory available to the workflow must be slightly higher than size of the database custom index.
Note: Or consider to use the --kraken2_memory_mapping.
Note: Taxonomy database is custom.
Searching input for [.fastq, .fastq.gz, .fq, .fq.gz] files.
executor >  local (124)
[42/558cdb] process > fastcat (72)                                                          [100%] 74 of 74
[skipped  ] process > prepare_databases:determine_bracken_length                            [100%] 1 of 1, stored: 1 ✔
[f9/72f1cb] process > real_time_pipeline:run_common:getVersions                             [100%] 1 of 1 ✔
[35/f47785] process > real_time_pipeline:run_common:getParams                               [100%] 1 of 1 ✔
[96/f37051] process > real_time_pipeline:run_common:exclude_host_reads (barcode21_LIB00220) [  8%] 6 of 74
[7d/6d208e] process > real_time_pipeline:run_common:output (23)                             [ 83%] 20 of 24
[ce/1e7afc] process > real_time_pipeline:kraken_server                                      [  0%] 0 of 1
[4b/787a6c] process > real_time_pipeline:kraken2_client (barcode21_LIB00220)                [  0%] 0 of 6
[-        ] process > real_time_pipeline:progressive_stats                                  -
[-        ] process > real_time_pipeline:progressive_kraken_reports                         -
[-        ] process > real_time_pipeline:progressive_bracken                                -
[-        ] process > real_time_pipeline:createAbundanceTables                              -
[-        ] process > real_time_pipeline:makeReport                                         -
[3d/3443f3] process > real_time_pipeline:output_results (2)                                 [100%] 2 of 2
[-        ] process > real_time_pipeline:stop_kraken_server                                 -
Note: Empty files or those files whose reads have been discarded after filtering based on read length and/or read quality will not appear in the report and will be excluded from subsequent analysis.
Kraken2 pipeline.
Preparing databases.
Checking custom taxonomy mapping exists
Checking custom kraken2 database exists
Using the bracken dist file within your custom database directory.
Workflow will run indefinitely as no read_limit is set.
Workflow will stop processing files after null reads.
[... repeated progress snapshots trimmed: exclude_host_reads advanced to 17 of 74 while kraken2_client tasks failed (1 of 6, failed: 1, then 2 of 6, failed: 2) ...]
Execution cancelled -- Finishing pending tasks before exit

ERROR ~ Error executing process > 'real_time_pipeline:kraken2_client (barcode21_LIB00220)'

Caused by:
  Process `real_time_pipeline:kraken2_client (barcode21_LIB00220)` terminated with an error exit status (2)

Command executed:

  if [[ -f stats_unmapped ]]
  then
      stats_file="stats_unmapped" # batch case
  else
  # -L is needed because fastcat_stats is a symlink
      stats_file=$(find -L "stats_unmapped" -name "*.tsv.gz" -exec ls {} +)
  fi
  workflow-glue fastcat_histogram         --sample_id "barcode21_LIB00220"         $stats_file "barcode21_LIB00220.1.json"

  kraken2_client         --port 8080 --host-ip localhost         --report report.txt         --sequence barcode21_LIB00220.unmapped.fastq.gz > "kraken2.assignments.tsv"
  tail -n +1 report.txt > "barcode21_LIB00220.kraken2.report.txt"

Command exit status:
  2

Command output:
  (empty)

Command error:
  [11:53:46 - workflow_glue] Bootstrapping CLI.
  usage: wf-glue fastcat_histogram [-h] [--sample_id SAMPLE_ID] input output
  wf-glue fastcat_histogram: error: the following arguments are required: output

Work dir:
  /SAN/breuerlab/BTRU-scratch/sarah/results/ont/work/d5/0dce94f21b07c0598b3e550794789c

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

 -- Check '.nextflow.log' file for details

Application activity log entry

No response

Were you able to successfully run the latest version of the workflow with the demo data?

yes

Other demo data information

No response

nggvs commented 3 months ago

Hi @sarah-buddle, I cannot reproduce your problem, but it may be worth taking a look at the real-time analysis options. You can start an external server with the database and use that one within the pipeline.
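
For example, something along these lines (a rough sketch only; the host, port and paths are placeholders rather than values from your run):

# Start the server in the background with your kraken2 database:
kraken2_server --db /path/to/kraken2_db --port 8080 &
# Point the workflow at it rather than letting it spawn its own server:
nextflow run epi2me-labs/wf-metagenomics \
    --fastq /path/to/fastq_dir \
    --classifier kraken2 \
    --database /path/to/kraken2_db \
    --real_time true \
    --external_kraken2 true \
    --host localhost --port 8080 \
    --server_threads 1 --kraken_clients 1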

Just in case, could you send me the files within /SAN/breuerlab/BTRU-scratch/sarah/results/ont/work/d5/0dce94f21b07c0598b3e550794789c, in particular .command.sh, .command.err, .command.out and .command.log?

Thank you very much in advance!

sarah-buddle commented 3 months ago

Thanks for your response! I'm now running on a different machine (our GridION) and I'm getting the same error. I've tried setting up an external kraken2 server following https://github.com/epi2me-labs/kraken2-server, and I got the error again. If it's relevant, my barcode directory and file names are different from MinKNOW's standard output. The commands I used were:

mamba activate kraken2-server
kraken2_server --db $kraken_db_pluspf8

nextflow run epi2me-labs/wf-metagenomics -w ${results}/work \
--fastq ${demux_outdir} \
--classifier kraken2 \
--exclude_host $human_genome \
--database $kraken_db_pluspf8 \
--taxonomy $taxdump \
--ref2taxid $ref2taxid \
--include_kraken2_assignments \
--out_dir ${results} \
--real_time True \
--external_kraken2 True \
--server_threads 1 \
--kraken_clients 1

Here are the contents of the files you asked for that are not empty (.command.out is empty).

.command.err

[15:27:18 - workflow_glue] Bootstrapping CLI.
usage: wf-glue fastcat_histogram [-h] [--sample_id SAMPLE_ID] input output
wf-glue fastcat_histogram: error: the following arguments are required: output

.command.log

[15:27:18 - workflow_glue] Bootstrapping CLI.
usage: wf-glue fastcat_histogram [-h] [--sample_id SAMPLE_ID] input output
wf-glue fastcat_histogram: error: the following arguments are required: output

.command.run

#!/bin/bash
# NEXTFLOW TASK: real_time_pipeline:kraken2_client (barcode21_LIB00226)
set -e
set -u
NXF_DEBUG=${NXF_DEBUG:=0}; [[ $NXF_DEBUG > 1 ]] && set -x
NXF_ENTRY=${1:-nxf_main}

nxf_tree() {
    local pid=$1

    declare -a ALL_CHILDREN
    while read P PP;do
        ALL_CHILDREN[$PP]+=" $P"
    done < <(ps -e -o pid= -o ppid=)

    pstat() {
        local x_pid=$1
        local STATUS=$(2> /dev/null < /proc/$1/status grep -E 'Vm|ctxt')

        if [ $? = 0 ]; then
        local  x_vsz=$(echo "$STATUS" | grep VmSize | awk '{print $2}' || echo -n '0')
        local  x_rss=$(echo "$STATUS" | grep VmRSS | awk '{print $2}' || echo -n '0')
        local x_peak=$(echo "$STATUS" | grep -E 'VmPeak|VmHWM' | sed 's/^.*:\s*//' | sed 's/[\sa-zA-Z]*$//' | tr '\n' ' ' || echo -n '0 0')
        local x_pmem=$(awk -v rss=$x_rss -v mem_tot=$mem_tot 'BEGIN {printf "%.0f", rss/mem_tot*100*10}' || echo -n '0')
        local vol_ctxt=$(echo "$STATUS" | grep '\bvoluntary_ctxt_switches' | awk '{print $2}' || echo -n '0')
        local inv_ctxt=$(echo "$STATUS" | grep '\bnonvoluntary_ctxt_switches' | awk '{print $2}' || echo -n '0')
        cpu_stat[x_pid]="$x_pid $x_pmem $x_vsz $x_rss $x_peak $vol_ctxt $inv_ctxt"
        fi
    }

    pwalk() {
        pstat $1
        for i in ${ALL_CHILDREN[$1]:=}; do pwalk $i; done
    }

    pwalk $1
}

nxf_stat() {
    cpu_stat=()
    nxf_tree $1

    declare -a sum=(0 0 0 0 0 0 0 0)
    local pid
    local i
    for pid in "${!cpu_stat[@]}"; do
        local row=(${cpu_stat[pid]})
        [ $NXF_DEBUG = 1 ] && echo "++ stat mem=${row[*]}"
        for i in "${!row[@]}"; do
        if [ $i != 0 ]; then
            sum[i]=$((sum[i]+row[i]))
        fi
        done
    done

    [ $NXF_DEBUG = 1 ] && echo -e "++ stat SUM=${sum[*]}"

    for i in {1..7}; do
        if [ ${sum[i]} -lt ${cpu_peak[i]} ]; then
            sum[i]=${cpu_peak[i]}
        else
            cpu_peak[i]=${sum[i]}
        fi
    done

    [ $NXF_DEBUG = 1 ] && echo -e "++ stat PEAK=${sum[*]}\n"
    nxf_stat_ret=(${sum[*]})
}

nxf_mem_watch() {
    set -o pipefail
    local pid=$1
    local trace_file=.command.trace
    local count=0;
    declare -a cpu_stat=(0 0 0 0 0 0 0 0)
    declare -a cpu_peak=(0 0 0 0 0 0 0 0)
    local mem_tot=$(< /proc/meminfo grep MemTotal | awk '{print $2}')
    local timeout
    local DONE
    local STOP=''

    [ $NXF_DEBUG = 1 ] && nxf_sleep 0.2 && ps fx

    while true; do
        nxf_stat $pid
        if [ $count -lt 10 ]; then timeout=1;
        elif [ $count -lt 120 ]; then timeout=5;
        else timeout=30;
        fi
        read -t $timeout -r DONE || true
        [[ $DONE ]] && break
        if [ ! -e /proc/$pid ]; then
            [ ! $STOP ] && STOP=$(nxf_date)
            [ $(($(nxf_date)-STOP)) -gt 10000 ] && break
        fi
        count=$((count+1))
    done

    echo "%mem=${nxf_stat_ret[1]}"      >> $trace_file
    echo "vmem=${nxf_stat_ret[2]}"      >> $trace_file
    echo "rss=${nxf_stat_ret[3]}"       >> $trace_file
    echo "peak_vmem=${nxf_stat_ret[4]}" >> $trace_file
    echo "peak_rss=${nxf_stat_ret[5]}"  >> $trace_file
    echo "vol_ctxt=${nxf_stat_ret[6]}"  >> $trace_file
    echo "inv_ctxt=${nxf_stat_ret[7]}"  >> $trace_file
}

nxf_write_trace() {
    echo "nextflow.trace/v2"           > $trace_file
    echo "realtime=$wall_time"         >> $trace_file
    echo "%cpu=$ucpu"                  >> $trace_file
    echo "cpu_model=$cpu_model"        >> $trace_file
    echo "rchar=${io_stat1[0]}"        >> $trace_file
    echo "wchar=${io_stat1[1]}"        >> $trace_file
    echo "syscr=${io_stat1[2]}"        >> $trace_file
    echo "syscw=${io_stat1[3]}"        >> $trace_file
    echo "read_bytes=${io_stat1[4]}"   >> $trace_file
    echo "write_bytes=${io_stat1[5]}"  >> $trace_file
}

nxf_trace_mac() {
    local start_millis=$(nxf_date)

    /bin/bash -euo pipefail /realtime_test/epi2me/work/52/e4c46c8ab151500412d3b04aaf1cb9/.command.sh

    local end_millis=$(nxf_date)
    local wall_time=$((end_millis-start_millis))
    local ucpu=''
    local cpu_model=''
    local io_stat1=('' '' '' '' '' '')
    nxf_write_trace
}

nxf_fd() {
    local FD=11
    while [ -e /proc/$$/fd/$FD ]; do FD=$((FD+1)); done
    echo $FD
}

nxf_trace_linux() {
    local pid=$$
    command -v ps &>/dev/null || { >&2 echo "Command 'ps' required by nextflow to collect task metrics cannot be found"; exit 1; }
    local num_cpus=$(< /proc/cpuinfo grep '^processor' -c)
    local cpu_model=$(< /proc/cpuinfo grep '^model name' | head -n 1 | awk 'BEGIN{FS="\t: "} { print $2 }')
    local tot_time0=$(grep '^cpu ' /proc/stat | awk '{sum=$2+$3+$4+$5+$6+$7+$8+$9; printf "%.0f",sum}')
    local cpu_time0=$(2> /dev/null < /proc/$pid/stat awk '{printf "%.0f", ($16+$17)*10 }' || echo -n 'X')
    local io_stat0=($(2> /dev/null < /proc/$pid/io sed 's/^.*:\s*//' | head -n 6 | tr '\n' ' ' || echo -n '0 0 0 0 0 0'))
    local start_millis=$(nxf_date)
    trap 'kill $mem_proc' ERR

    /bin/bash -euo pipefail /realtime_test/epi2me/work/52/e4c46c8ab151500412d3b04aaf1cb9/.command.sh &
    local task=$!

    mem_fd=$(nxf_fd)
    eval "exec $mem_fd> >(nxf_mem_watch $task)"
    local mem_proc=$!

    wait $task

    local end_millis=$(nxf_date)
    local tot_time1=$(grep '^cpu ' /proc/stat | awk '{sum=$2+$3+$4+$5+$6+$7+$8+$9; printf "%.0f",sum}')
    local cpu_time1=$(2> /dev/null < /proc/$pid/stat awk '{printf "%.0f", ($16+$17)*10 }' || echo -n 'X')
    local ucpu=$(awk -v p1=$cpu_time1 -v p0=$cpu_time0 -v t1=$tot_time1 -v t0=$tot_time0 -v n=$num_cpus 'BEGIN { pct=(p1-p0)/(t1-t0)*100*n; printf("%.0f", pct>0 ? pct : 0) }' )

    local io_stat1=($(2> /dev/null < /proc/$pid/io sed 's/^.*:\s*//' | head -n 6 | tr '\n' ' ' || echo -n '0 0 0 0 0 0'))
    local i
    for i in {0..5}; do
        io_stat1[i]=$((io_stat1[i]-io_stat0[i]))
    done

    local wall_time=$((end_millis-start_millis))
    [ $NXF_DEBUG = 1 ] && echo "+++ STATS %CPU=$ucpu TIME=$wall_time I/O=${io_stat1[*]}"

    echo "nextflow.trace/v2"           > $trace_file
    echo "realtime=$wall_time"         >> $trace_file
    echo "%cpu=$ucpu"                  >> $trace_file
    echo "cpu_model=$cpu_model"        >> $trace_file
    echo "rchar=${io_stat1[0]}"        >> $trace_file
    echo "wchar=${io_stat1[1]}"        >> $trace_file
    echo "syscr=${io_stat1[2]}"        >> $trace_file
    echo "syscw=${io_stat1[3]}"        >> $trace_file
    echo "read_bytes=${io_stat1[4]}"   >> $trace_file
    echo "write_bytes=${io_stat1[5]}"  >> $trace_file

    [ -e /proc/$mem_proc ] && eval "echo 'DONE' >&$mem_fd" || true
    wait $mem_proc 2>/dev/null || true
    while [ -e /proc/$mem_proc ]; do nxf_sleep 0.1; done
}

nxf_trace() {
    local trace_file=.command.trace
    touch $trace_file
    if [[ $(uname) = Darwin ]]; then
        nxf_trace_mac
    else
        nxf_trace_linux
    fi
}
nxf_container_env() {
cat << EOF
export PYTHONNOUSERSITE="1"
export JAVA_TOOL_OPTIONS="-Xlog:disable -Xlog:all=warning:stderr"
export PATH="\$PATH:/home/grid/.nextflow/assets/epi2me-labs/wf-metagenomics/bin"
EOF
}

nxf_sleep() {
  sleep $1 2>/dev/null || sleep 1;
}

nxf_date() {
    local ts=$(date +%s%3N);
    if [[ ${#ts} == 10 ]]; then echo ${ts}000
    elif [[ $ts == *%3N ]]; then echo ${ts/\%3N/000}
    elif [[ $ts == *3N ]]; then echo ${ts/3N/000}
    elif [[ ${#ts} == 13 ]]; then echo $ts
    else echo "Unexpected timestamp value: $ts"; exit 1
    fi
}

nxf_env() {
    echo '============= task environment ============='
    env | sort | sed "s/\(.*\)AWS\(.*\)=\(.\{6\}\).*/\1AWS\2=\3xxxxxxxxxxxxx/"
    echo '============= task output =================='
}

nxf_kill() {
    declare -a children
    while read P PP;do
        children[$PP]+=" $P"
    done < <(ps -e -o pid= -o ppid=)

    kill_all() {
        [[ $1 != $$ ]] && kill $1 2>/dev/null || true
        for i in ${children[$1]:=}; do kill_all $i; done
    }

    kill_all $1
}

nxf_mktemp() {
    local base=${1:-/tmp}
    mkdir -p "$base"
    if [[ $(uname) = Darwin ]]; then mktemp -d $base/nxf.XXXXXXXXXX
    else TMPDIR="$base" mktemp -d -t nxf.XXXXXXXXXX
    fi
}

nxf_fs_copy() {
  local source=$1
  local target=$2
  local basedir=$(dirname $1)
  mkdir -p $target/$basedir
  cp -fRL $source $target/$basedir
}

nxf_fs_move() {
  local source=$1
  local target=$2
  local basedir=$(dirname $1)
  mkdir -p $target/$basedir
  mv -f $source $target/$basedir
}

nxf_fs_rsync() {
  rsync -rRl $1 $2
}

nxf_fs_rclone() {
  rclone copyto $1 $2/$1
}

nxf_fs_fcp() {
  fcp $1 $2/$1
}

on_exit() {
    exit_status=${nxf_main_ret:=$?}
    printf -- $exit_status > /realtime_test/epi2me/work/52/e4c46c8ab151500412d3b04aaf1cb9/.exitcode
    set +u
    docker rm $NXF_BOXID &>/dev/null || true
    exit $exit_status
}

on_term() {
    set +e
    docker stop $NXF_BOXID
}

nxf_launch() {
    docker run -i --cpu-shares 1024 --memory 2048m -e "NXF_TASK_WORKDIR" -e "NXF_DEBUG=${NXF_DEBUG:=0}" -v /realtime_test/epi2me/work:/realtime_test/epi2me/work -v /home/grid/.nextflow/assets/epi2me-labs/wf-metagenomics/bin:/home/grid/.nextflow/assets/epi2me-labs/wf-metagenomics/bin -w "$PWD" --user $(id -u):$(id -g) --group-add 100 --network host --name $NXF_BOXID ontresearch/wf-metagenomics:sha44a6dacff5f2001d917b774647bb4cbc1b53bc76 /bin/bash -c "eval $(nxf_container_env); /bin/bash /realtime_test/epi2me/work/52/e4c46c8ab151500412d3b04aaf1cb9/.command.run nxf_trace"
}

nxf_stage() {
    true
    # stage input files
    rm -f barcode21_LIB00226.unmapped.fastq.gz
    rm -f stats_unmapped
    ln -s /realtime_test/epi2me/work/94/a28ec1915b233405d90ad84844710d/barcode21_LIB00226.unmapped.fastq.gz barcode21_LIB00226.unmapped.fastq.gz
    ln -s /realtime_test/epi2me/work/94/a28ec1915b233405d90ad84844710d/stats_unmapped stats_unmapped
}

nxf_unstage() {
    true
    [[ ${nxf_main_ret:=0} != 0 ]] && return
}

nxf_main() {
    trap on_exit EXIT
    trap on_term TERM INT USR2
    trap '' USR1

    [[ "${NXF_CHDIR:-}" ]] && cd "$NXF_CHDIR"
    export NXF_BOXID="nxf-$(dd bs=18 count=1 if=/dev/urandom 2>/dev/null | base64 | tr +/ 0A | tr -d '\r\n')"
    NXF_SCRATCH=''
    [[ $NXF_DEBUG > 0 ]] && nxf_env
    touch /realtime_test/epi2me/work/52/e4c46c8ab151500412d3b04aaf1cb9/.command.begin
    set +u
    set -u
    [[ $NXF_SCRATCH ]] && cd $NXF_SCRATCH
    export NXF_TASK_WORKDIR="$PWD"
    nxf_stage

    set +e
    (set -o pipefail; (nxf_launch | tee .command.out) 3>&1 1>&2 2>&3 | tee .command.err) &
    pid=$!
    wait $pid || nxf_main_ret=$?
    nxf_unstage
}

$NXF_ENTRY

.command.sh

#!/bin/bash -euo pipefail
if [[ -f stats_unmapped ]]
then
    stats_file="stats_unmapped" # batch case
else
# -L is needed because fastcat_stats is a symlink
    stats_file=$(find -L "stats_unmapped" -name "*.tsv.gz" -exec ls {} +)
fi
workflow-glue fastcat_histogram         --sample_id "barcode21_LIB00226"         $stats_file "barcode21_LIB00226.1.json"

kraken2_client         --port 8080 --host-ip localhost         --report report.txt         --sequence barcode21_LIB00226.unmapped.fastq.gz > "kraken2.assignments.tsv"
tail -n +1 report.txt > "barcode21_LIB00226.kraken2.report.txt"

.exitcode

2
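
For what it's worth, the usage error makes me think $stats_file expanded to nothing (the find matched no *.tsv.gz), so the JSON filename was parsed as the positional input and no output argument was left. A minimal sketch of that failure mode, reusing the names from .command.sh above:

# If find matches nothing, stats_file ends up empty:
stats_file=$(find -L "stats_unmapped" -name "*.tsv.gz" -exec ls {} +)
# Empty and unquoted, $stats_file vanishes during word splitting, so the parser
# sees a single positional argument and reports that "output" is missing:
workflow-glue fastcat_histogram --sample_id "barcode21_LIB00226" $stats_file "barcode21_LIB00226.1.json"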
nggvs commented 2 months ago

Hi @sarah-buddle, if you open the server externally, you'd need to provide the port and the host to the workflow, as the defaults of kraken2_server are for the local host.

kraken2_server --db k2_standard_08gb_20240605/ --port xxxx --host-ip xxxx

Before running the Nextflow pipeline, could you make sure the server works with a small test, using just the client command to classify some reads?

kraken2_client -p xxxx -r report.txt -s reads.fastq.gz --host-ip xxxx
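
A rough end-to-end check could look like this (ports and host IPs are placeholders; the sleep just gives the server time to load the database into memory):

kraken2_server --db k2_standard_08gb_20240605/ --port 8080 --host-ip 0.0.0.0 &  # serve on all interfaces
sleep 60                                                                        # wait for the database to load
kraken2_client -p 8080 -r report.txt -s reads.fastq.gz --host-ip xxxx           # classify a small batch
head report.txt                                                                 # a non-empty report means the round trip worked
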
sarah-buddle commented 2 months ago

The above commands run successfully for me, using an external server running on the GridION. However, when I run the Nextflow command on the directory containing the same file, I still get the same error as in my original message. I start the kraken2 server in the background and then run the Nextflow pipeline. I have adjusted the command to include the IP of the external kraken2 server:

nextflow run epi2me-labs/wf-metagenomics -w ${outdir}/work \
    --fastq $input_dir \
    --classifier kraken2 \
    --exclude_host $human_genome \
    --database $kraken_db \
    --taxonomy $taxdump \
    --ref2taxid $ref2taxid \
    --include_kraken2_assignments \
    --out_dir $outdir \
    --real_time True \
    --external_kraken2 True \
    --port 8080 \
    --host $ip \
    --server_threads 1 \
    --kraken_clients 1

nggvs commented 1 month ago

Hi @sarah-buddle, I have been able to reproduce it :) so I'll now investigate the issue to find a proper solution. Thank you very much for your patience!

nggvs commented 1 month ago

Hi @sarah-buddle, could you try running the prerelease branch? The prerelease branch is not stable and may change; I will let you know once the stable release is tagged so that you can switch to it, but I'd like to know whether this solves your problem. Please let me know if you find anything weird.
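
In case it is useful, a branch can be selected with Nextflow's -r option (assuming the branch is literally named prerelease; keep your usual parameters):

nextflow pull epi2me-labs/wf-metagenomics                                    # refresh the cached workflow assets
nextflow run epi2me-labs/wf-metagenomics -r prerelease [your usual options]  # run the prerelease revision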

nggvs commented 3 weeks ago

Hi @sarah-buddle, does this issue still persist with the prerelease branch? Thank you very much in advance!

sarah-buddle commented 3 weeks ago

Sorry for the delayed reply - it's now working with the prerelease branch, thank you!

nggvs commented 2 weeks ago

Thank you for using the workflow! Glad to hear it's solved. I'll let you know once the release is tagged; meanwhile, I'll close the issue as it has been resolved. Please feel free to open a new issue if there is anything else. Thank you!