nf-core / fetchngs

Pipeline to fetch metadata and raw FastQ files from public databases
https://nf-co.re/fetchngs
MIT License

TypeError: unsupported operand type(s) for |: 'dict' and 'dict' #181

Closed. VangelisTheodorakis closed this issue 7 months ago

VangelisTheodorakis commented 1 year ago

Somehow the pipeline fails at the very end when I run the following command:

(vangelis-nextflow) [theodora@ouga03 ghga-rnaseq-harmonization]$ nextflow run fetchngs -profile mamba -params-file ./geuvadis-fetchngs-params/params.yaml

And the params file looks like this:

input: '/s/project/ghga-rnaseq-harmonization/geuvadis-fetchngs-params/geuvadis_rnaseq_ids_0_to_100.csv'
outdir: '/s/project/ghga-rnaseq-harmonization/geuvadis-samples-fastq/'
nf_core_pipeline: 'rnaseq'
force_sratools_download: true

It seems that the process is running with a Python version older than 3.9, which does not support the | operator on dictionaries. However, I run the command from a conda env that has Python 3.9. Any ideas?
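
For reference, the failing line merges two dictionaries with the | operator, which was added to dict in Python 3.9 (PEP 584). Below is a minimal sketch of an equivalent merge that also works on older interpreters. The helper name merge_versions and the sample data are hypothetical and not part of the pipeline; this only illustrates the language difference and is not the fix adopted upstream.

```python
def merge_versions(loaded, this_module):
    """Merge two dicts; keys in this_module win, like loaded | this_module."""
    # Dict unpacking works on Python 3.5+, unlike the | operator (3.9+).
    return {**loaded, **this_module}


# Hypothetical sample data standing in for the contents of collated_versions.yml
loaded = {"PROC_A": {"tool": "1.0"}}
this_module = {"PROC_B": {"python": "3.8.3"}}

merged = merge_versions(loaded, this_module)
print(sorted(merged))
```

As with the | operator, the right-hand dictionary takes precedence when both contain the same key.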


ERROR ~ Error executing process > 'NFCORE_FETCHNGS:SRA:CUSTOM_DUMPSOFTWAREVERSIONS (1)'

Caused by:
  Process `NFCORE_FETCHNGS:SRA:CUSTOM_DUMPSOFTWAREVERSIONS (1)` terminated with an error exit status (1)

Command executed [/data/ceph/hdd/project/node_05/ghga-rnaseq-harmonization/fetchngs/./workflows/../modules/nf-core/custom/dumpsoftwareversions/templates/dumpsoftwareversions.py]:

  #!/usr/bin/env python

  """Provide functions to merge multiple versions.yml files."""

  import yaml
  import platform
  from textwrap import dedent

  def _make_versions_html(versions):
      """Generate a tabular HTML output of all versions for MultiQC."""
      html = [
          dedent(
              """\
              <style>
              #nf-core-versions tbody:nth-child(even) {
                  background-color: #f2f2f2;
              }
              </style>
              <table class="table" style="width:100%" id="nf-core-versions">
                  <thead>
                      <tr>
                          <th> Process Name </th>
                          <th> Software </th>
                          <th> Version  </th>
                      </tr>
                  </thead>
              """
          )
      ]
      for process, tmp_versions in sorted(versions.items()):
          html.append("<tbody>")
          for i, (tool, version) in enumerate(sorted(tmp_versions.items())):
              html.append(
                  dedent(
                      f"""\
                      <tr>
                          <td><samp>{process if (i == 0) else ''}</samp></td>
                          <td><samp>{tool}</samp></td>
                          <td><samp>{version}</samp></td>
                      </tr>
                      """
                  )
              )
          html.append("</tbody>")
      html.append("</table>")
      return "\n".join(html)

  def main():
      """Load all version files and generate merged output."""
      versions_this_module = {}
      versions_this_module["NFCORE_FETCHNGS:SRA:CUSTOM_DUMPSOFTWAREVERSIONS"] = {
          "python": platform.python_version(),
          "yaml": yaml.__version__,
      }

      with open("collated_versions.yml") as f:
          versions_by_process = yaml.load(f, Loader=yaml.BaseLoader) | versions_this_module

      # aggregate versions by the module name (derived from fully-qualified process name)
      versions_by_module = {}
      for process, process_versions in versions_by_process.items():
          module = process.split(":")[-1]
          try:
              if versions_by_module[module] != process_versions:
                  raise AssertionError(
                      "We assume that software versions are the same between all modules. "
                      "If you see this error-message it means you discovered an edge-case "
                      "and should open an issue in nf-core/tools. "
                  )
          except KeyError:
              versions_by_module[module] = process_versions

      versions_by_module["Workflow"] = {
          "Nextflow": "23.04.1",
          "nf-core/fetchngs": "1.10.0",
      }

      versions_mqc = {
          "id": "software_versions",
          "section_name": "nf-core/fetchngs Software Versions",
          "section_href": "https://github.com/nf-core/fetchngs",
          "plot_type": "html",
          "description": "are collected at run time from the software output.",
          "data": _make_versions_html(versions_by_module),
      }

      with open("software_versions.yml", "w") as f:
          yaml.dump(versions_by_module, f, default_flow_style=False)
      with open("software_versions_mqc.yml", "w") as f:
          yaml.dump(versions_mqc, f, default_flow_style=False)

      with open("versions.yml", "w") as f:
          yaml.dump(versions_this_module, f, default_flow_style=False)

  if __name__ == "__main__":
      main()

Command exit status:
  1

Command output:
  (empty)

Command error:
  Traceback (most recent call last):
    File ".command.sh", line 101, in <module>
      main()
    File ".command.sh", line 61, in main
      versions_by_process = yaml.load(f, Loader=yaml.BaseLoader) | versions_this_module
  TypeError: unsupported operand type(s) for |: 'dict' and 'dict'

Work dir:
  /data/ceph/hdd/project/node_05/ghga-rnaseq-harmonization/work/27/a1c0626cacdcc33a9ea7c9a00f6b3f

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

 -- Check '.nextflow.log' file for details
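
With a conda or mamba profile, Nextflow typically builds a separate environment for each process, so the Python in the shell the run is launched from is not necessarily the one that executes the template. A small diagnostic sketch like the one below could be run with the interpreter the process uses (for example from the work dir above) to confirm the version. The function name describe_interpreter is hypothetical.

```python
import sys


def describe_interpreter():
    """Return the interpreter path, its version, and whether dict supports |."""
    # dict gained __or__ (the | operator) in Python 3.9.
    supports_union = hasattr(dict, "__or__")
    return sys.executable, tuple(sys.version_info[:3]), supports_union


path, version, supports_union = describe_interpreter()
print(f"python: {path}")
print(f"version: {'.'.join(map(str, version))}")
print(f"dict | dict supported: {supports_union}")
```

If the last line reports False, the interpreter is older than 3.9 and the TypeError above is expected.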

drpatelh commented 7 months ago

The CUSTOM_DUMPSOFTWAREVERSIONS module was removed in v1.11.0 of the pipeline, so this should no longer be an issue.

See https://github.com/nf-core/fetchngs/pull/226