PacificBiosciences / HiFi-16S-workflow

Nextflow pipeline to analyze PacBio HiFi full-length 16S data
BSD 3-Clause Clear License

Error running test sample command at pb16S:dada2_denoise step #30

marinachen closed this issue 1 year ago

marinachen commented 1 year ago

Hi,

I'm running this on an HPC cluster with Singularity. I can't tell what caused the termination at the denoise step. I followed the documentation, and everything before the test sample step worked fine for me. Please help! Thank you!

(nextflow) marinachen@hutlab12:pb-16S-nf nextflow run main.nf --input test_data/test_sample.tsv \
>     --metadata test_data/test_metadata.tsv -profile singularity \
>     --outdir results
N E X T F L O W  ~  version 22.10.6
Launching `main.nf` [cheesy_shirley] DSL2 - revision: 4df4423694
Only 1 sample. min_asv_sample and min_asv_totalfreq set to 0.

  Parameters set for pb-16S-nf pipeline for PacBio HiFi 16S
  =========================================================
  Number of samples in samples TSV: 1
  Filter input reads above Q: 20
  Trim primers with cutadapt: Yes
  Forward primer: AGRGTTYGATYMTGGCTCAG
  Reverse primer: AAGTCGTAACAAGGTARCY
  Minimum amplicon length filtered in DADA2: 1000
  Maximum amplicon length filtered in DADA2: 1600
  maxEE parameter for DADA2 filterAndTrim: 2
  minQ parameter for DADA2 filterAndTrim: 0
  Pooling method for DADA2 denoise process: pseudo
  Minimum number of samples required to keep any ASV: 0
  Minimum number of reads required to keep any ASV: 0 
  Taxonomy sequence database for VSEARCH: /net/hutlab12/srv/export/hutlab12_nobackup/share_root/users/mchen/pb-16S-nf/databases/GTDB_ssu_all_r207.qza
  Taxonomy annotation database for VSEARCH: /net/hutlab12/srv/export/hutlab12_nobackup/share_root/users/mchen/pb-16S-nf/databases/GTDB_ssu_all_r207.taxonomy.qza
  Skip Naive Bayes classification: false
  SILVA database for Naive Bayes classifier: /net/hutlab12/srv/export/hutlab12_nobackup/share_root/users/mchen/pb-16S-nf/databases/silva_nr99_v138.1_wSpecies_train_set.fa.gz
  GTDB database for Naive Bayes classifier: /net/hutlab12/srv/export/hutlab12_nobackup/share_root/users/mchen/pb-16S-nf/databases/GTDB_bac120_arc53_ssu_r207_fullTaxo.fa.gz
  RefSeq + RDP database for Naive Bayes classifier: /net/hutlab12/srv/export/hutlab12_nobackup/share_root/users/mchen/pb-16S-nf/databases/RefSeq_16S_6-11-20_RDPv16_fullTaxo.fa.gz
  VSEARCH maxreject: 100
  VSEARCH maxaccept: 100
  VSEARCH perc-identity: 0.97
  QIIME 2 rarefaction curve sampling depth: null
  Number of threads specified for cutadapt: 16
  Number of threads specified for DADA2: 8
  Number of threads specified for VSEARCH: 8
  Script location for HTML report generation: /net/hutlab12/srv/export/hutlab12_nobackup/share_root/users/mchen/pb-16S-nf/scripts/visualize_biom.Rmd
  Container enabled via docker/singularity: true
  Version of Nextflow pipeline: 0.5

executor >  Local (10)
[3b/249a20] process > pb16S:write_log                   [100%] 1 of 1 ✔
[ed/cb222c] process > pb16S:QC_fastq (1)                [100%] 1 of 1 ✔
[01/23d671] process > pb16S:cutadapt (1)                [100%] 1 of 1 ✔
[fa/338989] process > pb16S:QC_fastq_post_trim (1)      [100%] 1 of 1 ✔
[a7/536fc4] process > pb16S:collect_QC                  [100%] 1 of 1 ✔
[1d/11a125] process > pb16S:prepare_qiime2_manifest (1) [100%] 1 of 1 ✔
[b9/7da1e9] process > pb16S:merge_sample_manifest       [100%] 1 of 1 ✔
[10/525b2c] process > pb16S:import_qiime2 (1)           [100%] 1 of 1 ✔
[-        ] process > pb16S:demux_summarize (1)         -
[2e/98738f] process > pb16S:dada2_denoise (1)           [100%] 1 of 1, failed: 1 ✘
[-        ] process > pb16S:mergeASV                    -
[-        ] process > pb16S:filter_dada2                -
[-        ] process > pb16S:dada2_qc                    -
[-        ] process > pb16S:qiime2_phylogeny_diversity  -
[-        ] process > pb16S:dada2_rarefaction           -
[-        ] process > pb16S:class_tax                   -
[-        ] process > pb16S:dada2_assignTax             -
[-        ] process > pb16S:export_biom                 -
[-        ] process > pb16S:barplot_nb                  -
[-        ] process > pb16S:barplot                     -
[-        ] process > pb16S:html_rep                    -
[-        ] process > pb16S:krona_plot                  -
Error executing process > 'pb16S:dada2_denoise (1)'

Caused by:
  Process `pb16S:dada2_denoise (1)` terminated with an error exit status (1)

Command executed:

  # Use custom script that can skip primer trimming
  mkdir -p dada2_custom_script
  cp run_dada_ccs.R dada2_custom_script/run_dada_ccs_original.R
  sed 's/minQ\ =\ 0/minQ=0/g' dada2_custom_script/run_dada_ccs_original.R >     dada2_custom_script/run_dada_ccs.R
  chmod +x dada2_custom_script/run_dada_ccs.R
  export PATH="./dada2_custom_script:$PATH"
  which run_dada_ccs.R
  qiime dada2 denoise-ccs --i-demultiplexed-seqs samples.qza     --o-table dada2-ccs_table.qza     --o-representative-sequences dada2-ccs_rep.qza     --o-denoising-stats dada2-ccs_stats.qza     --p-min-len 1000 --p-max-len 1600     --p-max-ee 2     --p-front 'none'     --p-adapter 'none'     --p-n-threads 8     --p-pooling-method 'pseudo'

Command exit status:
  1

Command output:
  ./dada2_custom_script/run_dada_ccs.R

Command error:
  /n/helmod/apps/lmod/lmod/init/bash: line 15: __lmod_vx: unbound variable
  ./dada2_custom_script/run_dada_ccs.R
  QIIME is caching your current deployment for improved performance. This may take a few moments and should only happen once per deployment.
  QIIME is caching your current deployment for improved performance. This may take a few moments and should only happen once per deployment.
  Traceback (most recent call last):
    File "/opt/conda/envs/qime2-2022.2/bin/qiime", line 11, in <module>
      sys.exit(qiime())
    File "/opt/conda/envs/qime2-2022.2/lib/python3.8/site-packages/click/core.py", line 829, in __call__
      return self.main(*args, **kwargs)
    File "/opt/conda/envs/qime2-2022.2/lib/python3.8/site-packages/click/core.py", line 782, in main
      rv = self.invoke(ctx)
    File "/opt/conda/envs/qime2-2022.2/lib/python3.8/site-packages/click/core.py", line 1254, in invoke
      cmd_name, cmd, args = self.resolve_command(ctx, args)
    File "/opt/conda/envs/qime2-2022.2/lib/python3.8/site-packages/click/core.py", line 1297, in resolve_command
      cmd = self.get_command(ctx, cmd_name)
    File "/opt/conda/envs/qime2-2022.2/lib/python3.8/site-packages/q2cli/commands.py", line 100, in get_command
      plugin = self._plugin_lookup[name]
    File "/opt/conda/envs/qime2-2022.2/lib/python3.8/site-packages/q2cli/commands.py", line 76, in _plugin_lookup
      import q2cli.core.cache
    File "/opt/conda/envs/qime2-2022.2/lib/python3.8/site-packages/q2cli/core/cache.py", line 285, in <module>
      CACHE = DeploymentCache()
    File "/opt/conda/envs/qime2-2022.2/lib/python3.8/site-packages/q2cli/core/cache.py", line 61, in __init__
      self._state = self._get_cached_state(refresh=refresh)
    File "/opt/conda/envs/qime2-2022.2/lib/python3.8/site-packages/q2cli/core/cache.py", line 128, in _get_cached_state
      return json.load(fh, object_hook=decoder)
    File "/opt/conda/envs/qime2-2022.2/lib/python3.8/json/__init__.py", line 293, in load
      return loads(fp.read(),
    File "/opt/conda/envs/qime2-2022.2/lib/python3.8/json/__init__.py", line 370, in loads
      return cls(**kw).decode(s)
    File "/opt/conda/envs/qime2-2022.2/lib/python3.8/json/decoder.py", line 337, in decode
      obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    File "/opt/conda/envs/qime2-2022.2/lib/python3.8/json/decoder.py", line 355, in raw_decode
      raise JSONDecodeError("Expecting value", s, err.value) from None
  json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Work dir:
  /net/hutlab12/srv/export/hutlab12_nobackup/share_root/users/mchen/pb-16S-nf/work/2e/98738f98210e9901351a8d63d1e403

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

proteinosome commented 1 year ago

Hi @marinachen, I've never seen this error before and I cannot reproduce it. This is the test sample, right? Can you try running it again and see whether the error disappears? If it doesn't, can you follow the "Tip" in the last line of the output: go to the work dir /net/hutlab12/srv/export/hutlab12_nobackup/share_root/users/mchen/pb-16S-nf/work/2e/98738f98210e9901351a8d63d1e403, run bash .command.run, and see whether the command is able to complete?

Thanks.
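
The replication step suggested above can be sketched as a small shell helper. This is only an illustration: `inspect_task` is a hypothetical name, not something shipped with Nextflow or this pipeline, and it assumes the task's stderr was captured in `.command.err` inside the work dir, which is where Nextflow normally puts it.

```shell
# Sketch: re-run a failed Nextflow task from its work directory and
# show its exit status plus the tail of its captured stderr.
# inspect_task is a hypothetical helper name used for illustration only.
inspect_task() (
    # The body runs in a subshell, so the caller's directory is unchanged.
    cd "$1" || exit 1

    # .command.run is the wrapper script Nextflow generated for this task.
    status=0
    bash .command.run || status=$?
    echo "exit status: $status"

    # Assumption: stderr of the task is captured in .command.err.
    if [ -f .command.err ]; then
        tail -n 20 .command.err
    fi
    exit "$status"
)

# Usage with the work dir reported in the error message above:
# inspect_task /net/hutlab12/srv/export/hutlab12_nobackup/share_root/users/mchen/pb-16S-nf/work/2e/98738f98210e9901351a8d63d1e403
```

If the task succeeds when run this way, the original failure was probably transient rather than a problem with the inputs.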

marinachen commented 1 year ago

Hi @proteinosome, thank you for the suggestion! Yes, this was the test sample step. Surprisingly, the error did disappear when I re-ran it. I will try running your example ATCC mock community data to see whether this or something else occurs.

I have another question that others have raised in earlier issues, about the dag.overwrite problem (please see below). I'm currently doing what you suggested, deleting or renaming the report_results folder every time before re-running, but I wonder whether there's a better solution now? Thank you!

N E X T F L O W  ~  version 22.10.6
Launching `main.nf` [maniac_brenner] DSL2 - revision: 4df4423694
DAG file already exists: /net/hutlab12/srv/export/hutlab12_nobackup/share_root/users/mchen/pb-16S-nf/report_results/dag.html -- enable `dag.overwrite` in your config file to overwrite existing DAG files

proteinosome commented 1 year ago

Hi @marinachen, good to hear that.

For the DAG issue, the most straightforward fix is to add overwrite = true to the dag scope in the config file nextflow.config. I will probably make this the default in the next release. After the change, the scope looks like this:

dag {
    enabled = true
    file = "report_$params.outdir/dag.html"
    overwrite = true
}
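
If editing nextflow.config directly is not convenient, the same setting can probably be supplied from a separate file through Nextflow's `-c` option. This is a minimal sketch, not something confirmed in this thread. The file name `dag_overwrite.config` is a hypothetical choice, and the sketch assumes the extra file is merged on top of the pipeline's own `dag` scope so that `enabled` and `file` keep their existing values.

```shell
# Sketch: keep the DAG override in its own file instead of editing nextflow.config.
# dag_overwrite.config is a hypothetical file name chosen for this example.
cat > dag_overwrite.config <<'EOF'
dag {
    overwrite = true
}
EOF

# Then pass the extra config when launching the pipeline, for example:
# nextflow run main.nf --input test_data/test_sample.tsv \
#     --metadata test_data/test_metadata.tsv -profile singularity \
#     --outdir results -c dag_overwrite.config
```

Keeping the override in its own file means it is not lost when the pipeline checkout is updated.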