bigbio / pmultiqc

A library for QC report based on MultiQC framework
GNU General Public License v3.0

Errors with mzTab when decoy field not available #69

Closed ypriverol closed 2 years ago

ypriverol commented 2 years ago

I got the following error:

Error executing process > 'NFCORE_QUANTMS:QUANTMS:SUMMARYPIPELINE (1)'

Caused by:
  Process `NFCORE_QUANTMS:QUANTMS:SUMMARYPIPELINE (1)` terminated with an error exit status (1)

Command executed:

  multiqc \
      -f \
      --config ./results/multiqc_config.yml \
       \
       \
       \
      --quantification_method spectral_counting \
      ./results \
      -o .

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_QUANTMS:QUANTMS:SUMMARYPIPELINE":
      pmultiqc: $(multiqc --pmultiqc_version | sed -e "s/pmultiqc, version //g")
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  ╰──────────────────────────────────────────────────────────────────────────────╯
  DEBUG:multiqc:Oops! The 'quantms' MultiQC module broke...
  ================================================================================
  Traceback (most recent call last):
    File "/usr/local/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3621, in get_loc
      return self._engine.get_loc(casted_key)
    File "pandas/_libs/index.pyx", line 136, in pandas._libs.index.IndexEngine.get_loc
    File "pandas/_libs/index.pyx", line 163, in pandas._libs.index.IndexEngine.get_loc
    File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
    File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
  KeyError: 'opt_global_cv_MS:1002217_decoy_peptide'

  The above exception was the direct cause of the following exception:

  Traceback (most recent call last):
    File "/usr/local/lib/python3.9/site-packages/multiqc/multiqc.py", line 651, in run
      output = mod()
    File "/usr/local/lib/python3.9/site-packages/pmultiqc/modules/quantms/quantms.py", line 137, in __init__
      self.parse_out_mzTab()
    File "/usr/local/lib/python3.9/site-packages/pmultiqc/modules/quantms/quantms.py", line 1211, in parse_out_mzTab
      psm = psm[psm['opt_global_cv_MS:1002217_decoy_peptide'] != 1]
    File "/usr/local/lib/python3.9/site-packages/pandas/core/frame.py", line 3505, in __getitem__
      indexer = self.columns.get_loc(key)
    File "/usr/local/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3623, in get_loc
      raise KeyError(key) from err
  KeyError: 'opt_global_cv_MS:1002217_decoy_peptide'
  ================================================================================
  |    custom_content | software_versions: Found 1 sample (html)
  INFO:multiqc.modules.custom_content.custom_content:software_versions: Found 1 sample (html)
  |    custom_content | nf-core-quantms-summary: Found 1 sample (html)
  INFO:multiqc.modules.custom_content.custom_content:nf-core-quantms-summary: Found 1 sample (html)
  DEBUG:multiqc:Reordering sections: anchor 'pmultiqc' not found.
  DEBUG:multiqc:Reordering sections: anchor 'pmultiqc' not found for module 'nf-core/quantms Software Versions'.
  DEBUG:multiqc:Reordering sections: anchor 'software_versions' not found for module 'nf-core/quantms Software Versions'.
  DEBUG:multiqc:Reordering sections: anchor 'nf-core-quantms-summary' not found for module 'nf-core/quantms Software Versions'.
  DEBUG:multiqc:Reordering sections: anchor 'pmultiqc' not found for module 'nf-core/quantms Workflow Summary'.
  DEBUG:multiqc:Reordering sections: anchor 'software_versions' not found for module 'nf-core/quantms Workflow Summary'.
  DEBUG:multiqc:Reordering sections: anchor 'nf-core-quantms-summary' not found for module 'nf-core/quantms Workflow Summary'.
  |           multiqc | Compressing plot data
  INFO:multiqc:Compressing plot data
  |           multiqc | Report      : multiqc_report.html
  INFO:multiqc:Report      : multiqc_report.html
  |           multiqc | Data        : multiqc_data
  INFO:multiqc:Data        : multiqc_data
  DEBUG:multiqc:Moving data file from '/hps/scratch/lsf_tmpdir/hl-codon-129-03/tmp0vmnod6_/multiqc_data' to './multiqc_data'
  |           multiqc | Plots       : multiqc_plots
  INFO:multiqc:Plots       : multiqc_plots
  DEBUG:multiqc:Moving plots directory from '/hps/scratch/lsf_tmpdir/hl-codon-129-03/tmp0vmnod6_/multiqc_plots' to './multiqc_plots'
  |           multiqc | MultiQC complete
  INFO:multiqc:MultiQC complete

Work dir:
  /hps/nobackup/juan/pride/reanalysis/phospho-datasets/PXD005173/work/6f/2e777ff6ab47e68944a1e599d4386b

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

This happens because some versions of the mzTab export from OpenMS do not include the field opt_global_cv_MS:1002217_decoy_peptide. I suggest we handle this in the following way.

First, by default, define as a DECOY peptide anything whose protein accession has the prefix DECOY_. Then, if opt_global_cv_MS:1002217_decoy_peptide is not available, parse the protein accession and check whether that prefix is found in any of the proteins.
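
A minimal sketch of what that fallback could look like, not the actual pmultiqc fix. It assumes the PSM table is a pandas DataFrame with the protein accession in a column named accession, and that several accessions in one cell may be separated by "," or ";". The helper names is_decoy_accession and drop_decoy_psms are hypothetical. It also takes one reading of the proposal: use the decoy column when it exists, otherwise fall back to the DECOY_ prefix.

```python
import pandas as pd

DECOY_COLUMN = "opt_global_cv_MS:1002217_decoy_peptide"


def is_decoy_accession(accession, prefix="DECOY_"):
    """Return True if any protein in the accession string starts with the prefix."""
    if not isinstance(accession, str):
        # Missing values (None/NaN) are treated as non-decoy.
        return False
    # Assumption: multiple accessions may be separated by ',' or ';'.
    parts = accession.replace(";", ",").split(",")
    return any(part.strip().startswith(prefix) for part in parts)


def drop_decoy_psms(psm, accession_col="accession", prefix="DECOY_"):
    """Remove decoy PSMs, falling back to the accession prefix if needed."""
    if DECOY_COLUMN in psm.columns:
        # Same filter as the line that raised the KeyError, now guarded.
        return psm[psm[DECOY_COLUMN] != 1]
    # Hypothetical fallback: flag rows whose accession carries the decoy prefix.
    decoy_mask = pd.Series(
        [is_decoy_accession(acc, prefix) for acc in psm[accession_col]],
        index=psm.index,
        dtype=bool,
    )
    return psm[~decoy_mask]
```

In parse_out_mzTab the failing line psm = psm[psm['opt_global_cv_MS:1002217_decoy_peptide'] != 1] would then become something like psm = drop_decoy_psms(psm), assuming the accession column is present under that name.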