bigbio / quantms

Quantitative mass spectrometry workflow. Currently supports proteomics experiments with complex experimental designs for DDA-LFQ, DDA-Isobaric and DIA-LFQ quantification.
https://quantms.org
MIT License

DIA-NN convert to mzTab failing #333

Closed (closed by ypriverol 9 months ago)

ypriverol commented 10 months ago

Description of the bug

ERROR ~ Error executing process > 'NFCORE_QUANTMS:QUANTMS:DIA:DIANNCONVERT (PXD039023.sdrf)'

Caused by:
  Process `NFCORE_QUANTMS:QUANTMS:DIA:DIANNCONVERT (PXD039023.sdrf)` terminated with an error exit status (1)

Command executed:

  diann_convert.py convert \
      --folder ./ \
      --exp_design PXD039023.sdrf_openms_design.tsv \
      --diann_version ./version/versions.yml \
      --dia_params "20.0;ppm;12.0;ppm;Trypsin;Carbamidomethyl (C);Acetyl (Protein N-term),Oxidation (M)" \
      --charge 4 \
      --missed_cleavages 1 \
      --qvalue_threshold 0.01 \
      2>&1 | tee convert_report.log

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_QUANTMS:QUANTMS:DIA:DIANNCONVERT":
      pyopenms: $(pip show pyopenms | grep "Version" | awk -F ': ' '{print $2}')
  END_VERSIONS

Command exit status:
  1

Command output:
  2023-12-25 13:21:00,912 [mztab_PRH] - Matching PRH to modifications...
  2023-12-25 13:21:00,917 [mztab_PRH] - Matching PRH to protein quantification...
  /hps/nobackup/juan/pride/reanalysis/quantms/bin/diann_convert.py:652: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'null' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
    out_mztab_PRH.fillna("null", inplace=True)
  2023-12-25 13:21:02,002 [mztab_PEH] - Constructing PEH sub-table...
  2023-12-25 13:21:02,002 [mztab_PEH] - report.shape: (2621360, 23),  pr.shape: (2693, 889), len(precursor_list): 3335, index_ref.shape: (879, 6)
  2023-12-25 13:21:02,003 [mztab_PEH] - Finding modifications...
  2023-12-25 13:21:02,016 [mztab_PEH] - Extracting sequence...
  2023-12-25 13:21:02,028 [mztab_PEH] - Checking accession uniqueness...
  2023-12-25 13:21:02,037 [mztab_PEH] - Matching precursor IDs...
  2023-12-25 13:21:02,039 [mztab_PEH] - Getting scores per run
  2023-12-25 13:21:02,857 [mztab_PEH] - Getting peptide abundances per study variable
  2023-12-25 13:21:04,214 [mztab_PEH] - Getting peptide properties...
  2023-12-25 13:21:04,356 [mztab_PEH] - Re-ordering columns...
  /hps/nobackup/juan/pride/reanalysis/quantms/bin/diann_convert.py:799: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'null' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
    out_mztab_PEH.fillna("null", inplace=True)
  2023-12-25 13:21:05,581 [mztab_PSH] - Constructing PSH sub-table
  Warning: OPENMS_DATA_PATH environment variable not found and no share directory was installed. Some functionality might not work as expected.
  Traceback (most recent call last):
    File "/hps/nobackup/juan/pride/reanalysis/quantms/bin/diann_convert.py", line 1331, in <module>
      cli()
    File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
      return self.main(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1078, in main
      rv = self.invoke(ctx)
           ^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
      return _process_result(sub_ctx.command.invoke(sub_ctx))
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
      return ctx.invoke(self.callback, **ctx.params)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.11/site-packages/click/core.py", line 783, in invoke
      return __callback(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
      return f(get_current_context(), *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/hps/nobackup/juan/pride/reanalysis/quantms/bin/diann_convert.py", line 144, in convert
      diann_directory.convert_to_mztab(
    File "/hps/nobackup/juan/pride/reanalysis/quantms/bin/diann_convert.py", line 307, in convert_to_mztab
      PSH = mztab_PSH(report, str(self.base_path), database)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/hps/nobackup/juan/pride/reanalysis/quantms/bin/diann_convert.py", line 845, in mztab_PSH
      file = __find_info(folder, n)
             ^^^^^^^^^^^^^^^^^^^^^^
    File "/hps/nobackup/juan/pride/reanalysis/quantms/bin/diann_convert.py", line 833, in __find_info
      raise ValueError(f"Found multiple {n} info files in {directory}: {files}")
  ValueError: Found multiple 10_batch1 info files in .: [PosixPath('10_batch1_ms_info.tsv'), PosixPath('110_batch1_ms_info.tsv'), PosixPath('210_batch1_ms_info.tsv'), PosixPath('310_batch1_ms_info.tsv')]

Work dir:
  /hps/nobackup/juan/pride/reanalysis/absolute-expression/platelet/PXD039023/work/a4/8801f37a6e135b21be97629fb0e07d

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

 -- Check '.nextflow.log' file for details

-[nf-core/quantms] Sent summary e-mail to yperez@ebi.ac.uk (sendmail)-
-[nf-core/quantms] Pipeline completed with errors-
ERROR ~ Unexpected error [NullPointerException]

 -- Check '.nextflow.log' file for details

ypriverol commented 10 months ago

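The problem seems to be this line: https://github.com/bigbio/quantms/blob/af45e327aeec6aa46d3704d95842c58a2d0052f8/bin/diann_convert.py#L828

The run name `10_batch1` also matches `110_batch1_ms_info.tsv`, `210_batch1_ms_info.tsv` and `310_batch1_ms_info.tsv`, so the info-file lookup finds four candidates instead of one.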
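
A minimal sketch of why the lookup becomes ambiguous for this dataset, and one possible way to tighten it. The function names `find_info_loose` and `find_info_exact` and the glob pattern are assumptions for illustration only; they are not the actual code in `diann_convert.py`. Only the `<run>_ms_info.tsv` naming is taken from the error message above.

```python
from pathlib import Path
import tempfile


def find_info_loose(directory: str, n: str) -> list[Path]:
    # Assumed shape of the failing lookup: a wildcard in front of the run
    # name lets "10_batch1" also match "110_batch1", "210_batch1", ...
    return sorted(Path(directory).glob(f"*{n}_ms_info.tsv"))


def find_info_exact(directory: str, n: str) -> Path:
    # Stricter variant: build the expected file name and require it to exist,
    # so one run name can never be confused with a longer one.
    path = Path(directory) / f"{n}_ms_info.tsv"
    if not path.is_file():
        raise ValueError(f"Could not find {n} info file in {directory}")
    return path


# Recreate the four file names from the error message in a scratch directory.
with tempfile.TemporaryDirectory() as tmp:
    for run in ("10_batch1", "110_batch1", "210_batch1", "310_batch1"):
        (Path(tmp) / f"{run}_ms_info.tsv").touch()

    loose_names = [p.name for p in find_info_loose(tmp, "10_batch1")]
    exact_name = find_info_exact(tmp, "10_batch1").name

print(loose_names)  # all four files match the loose pattern
print(exact_name)   # only the file for run 10_batch1
```

Building the exact file name also avoids surprises when a run name contains characters that are special in glob patterns, such as `[` or `*`.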