bigbio / pmultiqc

A library for QC reports based on the MultiQC framework
GNU General Public License v3.0

plug the package in proteomicsLFQ pipeline #24

Closed ypriverol closed 9 months ago

ypriverol commented 3 years ago

Tasks:

jpfeuffer commented 3 years ago

I tried to incorporate the package but it fails:

Command executed:

  multiqc \
    --exp_design experimental_design.tsv \
    --mzMLs ./mzMLs \
    --raw_ids ./raw_ids \
    ./proteomicslfq \
    -o .

Command exit status:
  1

Command output:
  Searching   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 1/1  

Command error:
  [INFO   ]     custom_code : Running pmultiqc Plugin v0.0.4
  [INFO   ]         multiqc : This is MultiQC v1.10.1
  [INFO   ]         multiqc : Template    : default
  [INFO   ]         multiqc : Searching   : proteomicslfq
  Matplotlib created a temporary config/cache directory at /tmp/matplotlib-jmuf35lv because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
  Parsing out csv file...
  [ERROR  ]         multiqc : Oops! The 'proteomicslfq' MultiQC module broke... 
    Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues 
    If possible, please include a log file that triggers the error - the last file found was:
      None
  ============================================================
  Module proteomicslfq raised an exception: Traceback (most recent call last):
    File "/opt/conda/envs/nf-core-proteomicslfq-1.1.0dev/lib/python3.9/site-packages/multiqc/multiqc.py", line 594, in run
      output = mod()
    File "/opt/conda/envs/nf-core-proteomicslfq-1.1.0dev/lib/python3.9/site-packages/pmultiqc/modules/proteomicslfq/proteomicslfq.py", line 92, in __init__
      self.parse_out_csv()
    File "/opt/conda/envs/nf-core-proteomicslfq-1.1.0dev/lib/python3.9/site-packages/pmultiqc/modules/proteomicslfq/proteomicslfq.py", line 773, in parse_out_csv
      data = pd.read_csv(self.out_csv_path, sep=',', header=0)
  AttributeError: 'MultiqcModule' object has no attribute 'out_csv_path'
  ============================================================
  [WARNING]         multiqc : No analysis results found. Cleaning up..
  [INFO   ]         multiqc : MultiQC complete

https://github.com/nf-core/proteomicslfq/pull/149/checks

veitveit commented 3 years ago

This error occurs because pmultiqc looks for _outmsstats.csv rather than out.csv in the proteomics_lfq folder.
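The AttributeError in the first traceback is consistent with an attribute that is only assigned when a matching file is found. A minimal sketch of a more defensive lookup follows. The names `find_first_existing` and `ModuleSketch` are hypothetical and the accepted file names are passed in by the caller; this is not pmultiqc's actual code.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_first_existing(folder: Path, names: Iterable[str]) -> Optional[Path]:
    """Return the first file in `folder` whose name is in `names`, else None."""
    for name in names:
        candidate = folder / name
        if candidate.is_file():
            return candidate
    return None


class ModuleSketch:
    """Hypothetical stand-in for a MultiQC module; not pmultiqc's real class."""

    def __init__(self, folder: Path, names: Iterable[str]):
        # Always assign the attribute, so later code never hits AttributeError.
        self.out_csv_path = find_first_existing(folder, names)

    def parse_out_csv(self) -> str:
        if self.out_csv_path is None:
            # Fail with a message that names the real problem.
            raise FileNotFoundError("no MSstats CSV found in the results folder")
        return self.out_csv_path.read_text()
```

With this shape, a missing file surfaces as a clear FileNotFoundError instead of an AttributeError deep inside the parser, and accepting several candidate names (for example both the MSstats file name and out.csv) becomes a one-line change at the call site.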

But I still get another, more serious error when applying the tool to the ProteomicsLFQ output from PXD001819 (UPS data set):

[WARNING]         multiqc : MultiQC Version v1.10.1 now available!
[INFO   ]     custom_code : Running pmultiqc Plugin v0.0.5
[INFO   ]         multiqc : This is MultiQC v1.6
[INFO   ]         multiqc : Template    : default
[INFO   ]         multiqc : Searching 'proteomics_lfq/'
Searching 5 files..  [####################################]  100%
[WARNING]     base_module : Depreciation Warning: ProteomicsLFQ - Please use new style for find_log_files()
[WARNING]     base_module : Depreciation Warning: ProteomicsLFQ - Please use new style for find_log_files()
Calculate Heatmap Score
Parsing out csv file...
[ERROR  ]         multiqc : Oops! The 'proteomicslfq' MultiQC module broke... 
  Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues 
  If possible, please include a log file that triggers the error - the last file found was:
    proteomics_lfq/out.mzTab
============================================================
Module proteomicslfq raised an exception: Traceback (most recent call last):
  File "/home/veit/anaconda3/bin/multiqc", line 440, in multiqc
    output = mod()
  File "/home/veit/anaconda3/lib/python3.6/site-packages/pmultiqc/modules/proteomicslfq/proteomicslfq.py", line 102, in __init__
    self.parse_out_csv()
  File "/home/veit/anaconda3/lib/python3.6/site-packages/pmultiqc/modules/proteomicslfq/proteomicslfq.py", line 938, in parse_out_csv
    ProteinNames = data[data[data['Reference'] == i]['PeptideSequence'] == peptides[i]][
TypeError: list indices must be integers or slices, not str
============================================================
[WARNING]         multiqc : No analysis results found. Cleaning up..
[INFO   ]         multiqc : MultiQC complete

daichengxin commented 3 years ago

Thanks, I found

veitveit commented 3 years ago

Great!

The second error occurs when running multiqc with the experimental design file rather than the SDRF file.
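The TypeError in the traceback above means that an object indexed with a string was a plain Python list at that point, not a pandas DataFrame. The log alone does not show which object that was. The sketch below, assuming pandas, shows one way to guard against that and to filter with a single combined mask. The columns `Reference` and `PeptideSequence` come from the traceback; `ProteinName`, the function name and the sample rows are assumptions made for illustration.

```python
import pandas as pd


def proteins_for(data, reference: str, peptide: str) -> list:
    """Return the unique protein names for one (run, peptide) pair."""
    if not isinstance(data, pd.DataFrame):
        # Indexing a list with a column name raises the TypeError seen in the log.
        raise TypeError(f"expected a DataFrame, got {type(data).__name__}")
    # One combined boolean mask, built from the full frame, keeps the
    # mask and the frame aligned on the same index.
    mask = (data["Reference"] == reference) & (data["PeptideSequence"] == peptide)
    return data.loc[mask, "ProteinName"].unique().tolist()


# Made-up rows for illustration only.
df = pd.DataFrame({
    "Reference": ["run1", "run1", "run2"],
    "PeptideSequence": ["PEPTIDEK", "AAAK", "PEPTIDEK"],
    "ProteinName": ["P1", "P2", "P1"],
})
print(proteins_for(df, "run1", "PEPTIDEK"))
```

The explicit type check makes the failure point to the input rather than to the indexing expression, which may help when the experimental design file and the SDRF file take different code paths.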

I also got some additional errors complaining about a missing decoy column. Since this run did not contain decoys, the code that reads this information could be wrapped in an if statement.

Actually, proteomicslfq does not add decoy hits to the database by default. I am now running with decoys.
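The if-statement guard suggested above could look roughly like this. It is a sketch assuming pandas; the column name `is_decoy` and the function `count_decoys` are hypothetical, and the real decoy column in the ProteomicsLFQ output may be named differently.

```python
import pandas as pd

# Hypothetical column name; the real decoy column may differ.
DECOY_COLUMN = "is_decoy"


def count_decoys(psms: pd.DataFrame, decoy_column: str = DECOY_COLUMN):
    """Return the number of decoy hits, or None if the run has no decoy column."""
    if decoy_column not in psms.columns:
        # Run was searched without decoys: skip decoy-based statistics.
        return None
    return int(psms[decoy_column].astype(bool).sum())
```

Returning None rather than 0 lets the caller distinguish "no decoy information" from "decoys searched, none identified", so decoy-dependent report sections can be skipped instead of showing misleading zeros.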

jpfeuffer commented 2 years ago

@ypriverol this should be fixed by now, right?