Error when making reports

yanchunzhang commented 10 months ago

Operating System

CentOS 7

Other Linux

No response

Workflow Version

v1.0.0

Workflow Execution

Command line

EPI2ME Version

No response

CLI command run

nextflow run epi2me-labs/wf-16s --fastq wf-16s-demo/test_data/ -profile standard -without-docker

Workflow Execution - CLI Execution Profile

None

What happened?

The pipeline failed at the last step, 'making report'.

Relevant log output

Error executing process > 'kraken_pipeline:makeReport (1)'

Caused by:
Process kraken_pipeline:makeReport (1) terminated with an error exit status (1)

Command executed:

workflow-glue report "wf-16s-report.html" --workflow_name wf-16s --versions versions --params params.json --read_stats read_stats/* --lineages lineages --abundance_table "abundance_table_genus.tsv" --taxonomic_rank "G" --pipeline "kraken2" --abundance_threshold "1" --n_taxa_barplot "8"

Command exit status:
1

Command output:
(empty)

Command error:
[15:02:34 - workflow_glue] Starting entrypoint.
Traceback (most recent call last):
  File "/hpc/users/zhangy40/.nextflow/assets/epi2me-labs/wf-16s/bin/workflow-glue", line 7, in <module>
    cli()
  File "/hpc/users/zhangy40/.nextflow/assets/epi2me-labs/wf-16s/wf-metagenomics/bin/workflow_glue/__init__.py", line 72, in cli
    args.func(args)
  File "/hpc/users/zhangy40/.nextflow/assets/epi2me-labs/wf-16s/wf-metagenomics/bin/workflow_glue/report.py", line 114, in main
    SeqSummary(args.read_stats)
  File "/hpc/users/zhangy40/schzrnas/softwares/conda/env/nf16s/lib/python3.10/site-packages/ezcharts/components/fastcat.py", line 47, in __init__
    df_all = load_stats(seq_summary, format='fastcat')
  File "/hpc/users/zhangy40/schzrnas/softwares/conda/env/nf16s/lib/python3.10/site-packages/ezcharts/components/fastcat.py", line 337, in load_stats
    if os.path.isdir(stat):
  File "/hpc/users/zhangy40/schzrnas/softwares/conda/env/nf16s/lib/python3.10/genericpath.py", line 42, in isdir
    st = os.stat(s)
TypeError: stat: path should be string, bytes, os.PathLike or integer, not list

Work dir:
/sc/arion/projects/schzrnas/zhangy40/intratumor_bacteria/010324_16s/processed/demulti_trim/work/5b/2ac0627b4dd170ec04b2b94b1186b8

Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line

Application activity log entry

No response
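
The last frame of the traceback above shows os.stat receiving a list where it expects a single path. The following is a minimal standard-library sketch that reproduces that kind of failure and shows one possible way to flatten nested path lists before checking them. It is not the ezcharts implementation: `stat_rejects` and `flatten_paths` are hypothetical helper names, and the file names are placeholders.

```python
import os
from typing import Iterable, List, Union

PathOrPaths = Union[str, os.PathLike, Iterable]


def stat_rejects(value) -> bool:
    """Return True if os.stat raises TypeError for this argument type."""
    try:
        os.stat(value)
    except TypeError:
        return True
    except OSError:
        # A valid path type that simply does not exist on disk.
        return False
    return False


def flatten_paths(paths: PathOrPaths) -> List[str]:
    """Flatten one path, or a nested list of paths, into a flat list."""
    if isinstance(paths, (str, os.PathLike)):
        return [os.fspath(paths)]
    flat: List[str] = []
    for item in paths:
        flat.extend(flatten_paths(item))
    return flat


# Placeholder file names, for illustration only.
print(stat_rejects(["read_stats/a.tsv", "read_stats/b.tsv"]))
print(flatten_paths([["read_stats/a.tsv"], "read_stats/b.tsv"]))
```

Whether the list comes from the shell expanding `read_stats/*` into several arguments or from a version mismatch in the conda environment cannot be told from the log alone; the sketch only illustrates the type error itself.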

osilander commented 8 months ago

It also fails for me during the report-generation step (the abundance tables are generated fine), but as far as I can tell only when using a sample sheet.

executor >  slurm (181)
[c5/827031] process > validate_sample_sheet                          [100%] 1 of 1, cached: 1 ✔
[b6/89f166] process > fastcat (85)                                   [100%] 89 of 89 ✔
[skipped  ] process > prepare_databases:download_unpack_taxonomy     [100%] 1 of 1, stored: 1 ✔
[skipped  ] process > prepare_databases:download_reference_ref2taxid [100%] 1 of 1, stored: 1 ✔
[a3/ace151] process > minimap_pipeline:run_common:getVersions        [100%] 1 of 1, cached: 1 ✔
[b9/13634a] process > minimap_pipeline:run_common:getParams          [100%] 1 of 1, cached: 1 ✔
[25/baee67] process > minimap_pipeline:minimap (119C)                [100%] 89 of 89 ✔
[67/a62a4f] process > minimap_pipeline:createAbundanceTables         [100%] 1 of 1 ✔
[1f/c3f2e4] process > minimap_pipeline:makeReport (1)                [100%] 1 of 1, failed: 1 ✘
[33/e21ba8] process > minimap_pipeline:output_results (3)            [100%] 3 of 3, cached: 2 ✔
ERROR ~ Error executing process > 'minimap_pipeline:makeReport (1)'

Caused by:
  Process `minimap_pipeline:makeReport (1)` terminated with an error exit status (1)

Command executed:

  workflow-glue report         "wf-16s-report.html"         --workflow_name wf-16s         --versions versions         --params params.json         --read_stats read_stats/*         --lineages lineages         --abundance_table "abundance_table_genus.tsv"         --taxonomic_rank "G"         --pipeline "minimap2"         --abundance_threshold "1"        --n_taxa_barplot "9"

Command exit status:
  1

Command output:
  (empty)

Command error:
  [09:04:50 - matplotlib] Matplotlib created a temporary cache directory at /dev/shm/jobs/44149314/matplotlib-446td809 because the default path (/home/osilande/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
  [09:04:51 - matplotlib.font_manager] generated new fontManager
  /home/osilande/.nextflow/assets/epi2me-labs/wf-16s/bin/workflow_glue/__init__.py:30: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
    logger.warn(f"Could not load {name} due to missing module {e.name}")
  [09:04:52 - workflow_glue] Could not load abundance_tables due to missing module anytree
  [09:04:52 - workflow_glue] Starting entrypoint.
  Traceback (most recent call last):
    File "/home/osilande/.nextflow/assets/epi2me-labs/wf-16s/bin/workflow-glue", line 7, in <module>
      cli()
    File "/home/osilande/.nextflow/assets/epi2me-labs/wf-16s/bin/workflow_glue/__init__.py", line 72, in cli
      args.func(args)
    File "/home/osilande/.nextflow/assets/epi2me-labs/wf-16s/bin/workflow_glue/report.py", line 121, in main
      sample_reads = ezc.barplot(data=report_utils.per_sample_stats(
    File "/home/osilande/.nextflow/assets/epi2me-labs/wf-16s/bin/workflow_glue/report_utils/report_utils.py", line 143, in per_sample_stats
      return df_allstats.sort_values(by=['sample_name'])
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/util/_decorators.py", line 331, in wrapper
      return func(*args, **kwargs)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/core/frame.py", line 6923, in sort_values
      indexer = nargsort(
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/core/sorting.py", line 438, in nargsort
      indexer = non_nan_idx[non_nans.argsort(kind=kind)]
  TypeError: '<' not supported between instances of 'str' and 'int'

Work dir:
  /scale_wlg_nobackup/filesets/nobackup/uoa03387/AG1268/work/1f/c3f2e4e42b56b4dc22498f64e5cc6f

Compare with the run below: the same samples but no sample sheet, and thus no validate_sample_sheet step ([c5/827031] process > validate_sample_sheet [100%] 1 of 1, cached: 1 ✔ in the run above).

executor >  slurm (182)
[d1/617b24] process > fastcat (83)                                   [100%] 89 of 89 ✔
[skipped  ] process > prepare_databases:download_unpack_taxonomy     [100%] 1 of 1, stored: 1 ✔
[skipped  ] process > prepare_databases:download_reference_ref2taxid [100%] 1 of 1, stored: 1 ✔
[a3/ace151] process > minimap_pipeline:run_common:getVersions        [100%] 1 of 1, cached: 1 ✔
[b9/13634a] process > minimap_pipeline:run_common:getParams          [100%] 1 of 1, cached: 1 ✔
[86/ace595] process > minimap_pipeline:minimap (barcode65)           [100%] 89 of 89 ✔
[a6/de756e] process > minimap_pipeline:createAbundanceTables         [100%] 1 of 1 ✔
[64/1db8f3] process > minimap_pipeline:makeReport (1)                [100%] 1 of 1 ✔
[23/cbc45d] process > minimap_pipeline:output_results (4)            [100%] 4 of 4, cached: 2 ✔
Completed at: 01-Mar-2024 08:43:15
Duration    : 1h 31m 19s
CPU hours   : 252.8 (0% cached)
Succeeded   : 182
Cached      : 4
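
The TypeError in the failing run is the one Python raises when a sort has to compare a str with an int. One possible explanation, stated here as an assumption, is that the sample sheet mixes aliases such as 119C with purely numeric ones that are read as integers. A minimal standard-library sketch of the failure and of a cast-to-string workaround follows; `sort_sample_names` and the alias values are hypothetical and not part of the workflow.

```python
from typing import Iterable, List


def sort_sample_names(names: Iterable[object]) -> List[str]:
    """Sort sample names after casting each one to str, so mixed types compare."""
    return sorted(str(name) for name in names)


# Hypothetical aliases: one was parsed as an int, the others as strings.
mixed = ["119C", 120, "barcode65"]

try:
    sorted(mixed)
    mixed_sort_failed = False
except TypeError as err:
    # Same class of error as in the traceback above.
    mixed_sort_failed = True
    print("sorting mixed types failed:", err)

print(sort_sample_names(mixed))
```

Casting to str gives a lexicographic order, so numeric aliases sort as text rather than by value.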

nggvs commented 8 months ago

Hi @osilander, could you open a separate issue for your problem? It seems to be different from the initial one, and a separate issue helps me keep track of them. The error code in the latest report is 137, which is usually related to not having enough memory. If you do have enough memory, you can use an extra file to override the default value. I'll give you more info in the issue you open.

Thank you very much for using the pipeline!
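
For context on the 137 mentioned above: by shell convention, an exit status above 128 means the process was terminated by signal number (status - 128), so 137 corresponds to signal 9 (SIGKILL), which is what the Linux out-of-memory killer sends. The sketch below decodes a status that way. `describe_exit_status` is a hypothetical helper, not part of the workflow, and the signal names assume a POSIX system.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Describe a shell-style exit status (hypothetical helper, not part of wf-16s)."""
    if status == 0:
        return "success"
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            return f"exit status {status}"
        return f"terminated by signal {signum} ({name})"
    return f"error exit status {status}"


# The two statuses that appear in this thread.
for code in (1, 137):
    print(code, "->", describe_exit_status(code))
```

An exit status of 1, as in the logs earlier in this thread, is an ordinary error raised by the command itself rather than a kill by the system.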

osilander commented 8 months ago

Should I also open a different issue for the report above?

Jorn-Bethke commented 8 months ago

Hi, I just updated the workflow to v1.1.1 and got the same error once it runs minimap_pipeline:makeReport.

This is epi2me-labs/wf-16s v1.1.1.

Checking inputs.
Searching input for [.fastq, .fastq.gz, .fq, .fq.gz] files.
Minimap2 pipeline.
Preparing databases.
Using default taxonomy database.
Using a default database.
[skipping] Stored process > prepare_databases:download_unpack_taxonomy
[skipping] Stored process > prepare_databases:download_reference_ref2taxid
[c2/066821] Submitted process > fastcat (4)
[bc/681b96] Submitted process > fastcat (7)
[9a/38f988] Submitted process > minimap_pipeline:run_common:getParams
[0a/28d2c6] Submitted process > fastcat (9)
[9f/1c04d4] Submitted process > fastcat (11)
[4e/edc2d7] Submitted process > fastcat (8)
[04/bb9d0f] Submitted process > minimap_pipeline:run_common:getVersions
[05/bd7879] Submitted process > fastcat (10)
[c8/ad8e39] Submitted process > fastcat (2)
[bc/efc61d] Submitted process > fastcat (5)
[4b/118e79] Submitted process > fastcat (6)
[72/1ea865] Submitted process > fastcat (3)
[e7/e7db87] Submitted process > fastcat (1)
[3d/7646b3] Submitted process > fastcat (12)
[df/5cf77f] Submitted process > minimap_pipeline:output_results (1)
[ba/018fe3] Submitted process > fastcat (13)
[fc/dd2b41] Submitted process > fastcat (14)
[ff/a790c2] Submitted process > fastcat (15)
[12/59d0d0] Submitted process > minimap_pipeline:output_results (2)
[26/77df73] Submitted process > fastcat (16)
[9b/7f5b21] Submitted process > minimap_pipeline:minimap (barcode16)
[87/2658e8] Submitted process > minimap_pipeline:minimap (barcode14)
[b9/7d6447] Submitted process > minimap_pipeline:minimap (barcode13)
[c7/43cf95] Submitted process > minimap_pipeline:minimap (barcode09)
[42/106603] Submitted process > minimap_pipeline:minimap (barcode15)
[4b/69dddb] Submitted process > minimap_pipeline:minimap (barcode02)
[1d/8bfa97] Submitted process > minimap_pipeline:minimap (barcode06)
[7a/0732b9] Submitted process > minimap_pipeline:minimap (barcode08)
[7a/bd8042] Submitted process > minimap_pipeline:minimap (barcode03)
[ca/1e2bd5] Submitted process > minimap_pipeline:minimap (barcode01)
[22/c79408] Submitted process > minimap_pipeline:minimap (barcode17)
[bf/0de3c6] Submitted process > minimap_pipeline:minimap (barcode04)
[5b/ec85a1] Submitted process > minimap_pipeline:minimap (barcode18)
[36/ee45ad] Submitted process > minimap_pipeline:minimap (barcode21)
[5f/968d43] Submitted process > minimap_pipeline:minimap (barcode23)
[72/6496f2] Submitted process > minimap_pipeline:minimap (barcode24)
[e6/5704fb] Submitted process > minimap_pipeline:createAbundanceTables
[46/7f2474] Submitted process > minimap_pipeline:output_results (3)
[eb/486250] Submitted process > minimap_pipeline:makeReport (1)
ERROR ~ Error executing process > 'minimap_pipeline:makeReport (1)'

Caused by:
  Process minimap_pipeline:makeReport (1) terminated with an error exit status (1)

Command executed:

  workflow-glue report "wf-16s-report.html" --workflow_name wf-16s --versions versions --params params.json --read_stats read_stats/ --lineages lineages --abundance_table "abundance_table_genus.tsv" --taxonomic_rank "G" --pipeline "minimap2" --abundance_threshold "1" --n_taxa_barplot "9"

Command exit status:
  1

Command output:
  (empty)

Command error:
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/ezcharts/components/fastcat.py", line 374, in load_stats
      df = pd.read_csv(
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/util/_decorators.py", line 211, in wrapper
      return func(*args, **kwargs)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/util/_decorators.py", line 331, in wrapper
      return func(*args, **kwargs)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 950, in read_csv
      return _read(filepath_or_buffer, kwds)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 611, in _read
      return parser.read(nrows)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 1778, in read
      ) = self._engine.read(  # type: ignore[attr-defined]
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 230, in read
      chunks = self._reader.read_low_memory(nrows)
    File "pandas/_libs/parsers.pyx", line 808, in pandas._libs.parsers.TextReader.read_low_memory
    File "pandas/_libs/parsers.pyx", line 890, in pandas._libs.parsers.TextReader._read_rows
    File "pandas/_libs/parsers.pyx", line 1037, in pandas._libs.parsers.TextReader._convert_column_data
    File "pandas/_libs/parsers.pyx", line 1130, in pandas._libs.parsers.TextReader._convert_tokens
  ValueError: invalid literal for int() with base 10: 'input'

  During handling of the above exception, another exception occurred:

  Traceback (most recent call last):
    File "/mnt/c/Users/jornb/epi2melabs/workflows/epi2me-labs/wf-16s/bin/workflow-glue", line 7, in <module>
      cli()
    File "/mnt/c/Users/jornb/epi2melabs/workflows/epi2me-labs/wf-16s/wf-metagenomics/bin/workflow_glue/__init__.py", line 72, in cli
      args.func(args)
    File "/mnt/c/Users/jornb/epi2melabs/workflows/epi2me-labs/wf-16s/wf-metagenomics/bin/workflow_glue/report.py", line 114, in main
      SeqSummary(args.read_stats)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/ezcharts/components/fastcat.py", line 50, in __init__
      df_all = load_stats(seq_summary, format='bamstats')
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/ezcharts/components/fastcat.py", line 374, in load_stats
      df = pd.read_csv(
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/util/_decorators.py", line 211, in wrapper
      return func(*args, **kwargs)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/util/_decorators.py", line 331, in wrapper
      return func(*args, **kwargs)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 950, in read_csv
      return _read(filepath_or_buffer, kwds)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 605, in _read
      parser = TextFileReader(filepath_or_buffer, **kwds)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 1442, in __init__
      self._engine = self._make_engine(f, self.engine)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 1753, in _make_engine
      return mapping[engine](f, **self.options)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 135, in __init__
      self._validate_usecols_names(usecols, self.orig_names)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/pandas/io/parsers/base_parser.py", line 917, in _validate_usecols_names
      raise ValueError(
  ValueError: Usecols do not match columns, columns expected but not found: ['ref_coverage', 'name', 'acc', 'coverage', 'ref']

Work dir:
  /mnt/c/Users/jornb/epi2melabs/instances/wf-16s_01HRCWSZTM8848P11XBF38JCJX/work/eb/486250e189d591c93ced5a7908743d

Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run

 -- Check '/mnt/c/Users/jornb/epi2melabs/instances/wf-16s_01HRCWSZTM8848P11XBF38JCJX/nextflow.log' file for details
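
The final ValueError above is what pandas read_csv raises when names passed through usecols are missing from the file header. A minimal standard-library sketch for checking a stats file header against an expected column list follows. The expected names are copied from the error message; `missing_columns`, the tab separator, and the sample header are assumptions for illustration and are not taken from ezcharts.

```python
import csv
import io
from typing import Iterable, List, TextIO

# Column names copied from the ValueError in the log above.
EXPECTED = ["ref_coverage", "name", "acc", "coverage", "ref"]


def missing_columns(handle: TextIO, expected: Iterable[str], sep: str = "\t") -> List[str]:
    """Return the expected column names that are absent from the header row."""
    # An empty file yields an empty header, so every expected name is reported.
    header = next(csv.reader(handle, delimiter=sep), [])
    present = set(header)
    return [col for col in expected if col not in present]


# Hypothetical header with a different layout from the one being requested.
sample = io.StringIO("read_id\tfilename\tsample_name\tread_length\tmean_quality\n")
print(missing_columns(sample, EXPECTED))
```

Running a check like this on the files in read_stats/ inside the work dir would show whether their header matches the layout the report step expects.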

nggvs commented 8 months ago

Hi @Jorn-Bethke, could you open a new issue for your error, as it seems to be a different one? Thank you very much in advance, and thanks for using the workflow!

Jorn-Bethke commented 8 months ago

Hi @nggvs, thanks for the update to v1.1.2. The problem with generating reports is solved, and everything is working fine!

nggvs commented 7 months ago

Oh, that is good news! Glad to hear that! I'll close the issue in that case. Please feel free to open a new one if you notice anything else going wrong. Thank you for using the workflow!
