MultiQC / MultiQC

Aggregate results from bioinformatics analyses across many samples into a single report.
http://multiqc.info
GNU General Public License v3.0

Module VEP broke #1597

Closed: quentin67100 closed this issue 2 years ago

quentin67100 commented 2 years ago

Description of bug

I tried to use MultiQC with Singularity, but it failed to add the VEP stats. I also tried the conda version of MultiQC, and it failed too.

File that triggers the error

Pologne.stat.vep.html.zip

MultiQC Error log

/// MultiQC 🔍 | v1.11

|           multiqc | Search path : /shared/projects/gentaumix/Pologne
|          qualimap | Found 4 BamQC reports
|    MarkDuplicates | Skipping MarkDuplicates sample 'multiqc_data' as missing essential fields
|    MarkDuplicates | Skipping MarkDuplicates sample 'multiqc_data' as missing essential fields
|    MarkDuplicates | Skipping MarkDuplicates sample 'multiqc_data' as missing essential fields
|    MarkDuplicates | Skipping MarkDuplicates sample 'multiqc_data' as missing essential fields
|    MarkDuplicates | Skipping MarkDuplicates sample 'multiqc_data' as missing essential fields
|            picard | Found 4 MarkDuplicates reports
|            picard | Found 8 ValidateSamFile reports
╭────────────────── Oops! The 'vep' MultiQC module broke... ───────────────────╮
│ Please copy this log and report it at                                        │
│ https://github.com/ewels/MultiQC/issues                                      │
│ Please attach a file that triggers the error. The last file found was:       │
│ /shared/projects/gentaumix/Pologne//06_VEP/Pologne.stat.vep.html             │
│                                                                              │
│ Traceback (most recent call last):                                           │
│   File "/usr/lib/python3.8/site-packages/multiqc/multiqc.py", line 624, in r │
│     output = mod()                                                           │
│   File "/usr/lib/python3.8/site-packages/multiqc/modules/vep/vep.py", line 3 │
│     self.parse_vep_html(f)                                                   │
│   File "/usr/lib/python3.8/site-packages/multiqc/modules/vep/vep.py", line 1 │
│     existing = values[1].split("(")[0].replace(" ", "")                      │
│ IndexError: list index out of range                                          │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯
|      fastq_screen | Found 8 reports
|            fastqc | Found 8 reports
|           multiqc | Compressing plot data
|           multiqc | Report      : ../../../../../projects/gentaumix/Pologne/Pologne_MultiQC_global.html
|           multiqc | Data        : ../../../../../projects/gentaumix/Pologne/Pologne_MultiQC_global_data
|           multiqc | MultiQC complete
ewels commented 2 years ago

@maxulysse - are you interested in taking a look at this?

project-defiant commented 2 years ago

I had a similar issue with v1.11 from bioconda, but with the output from annotating an empty VCF file:

================================================================================
Traceback (most recent call last):
  File "/home/szszyszkowski/.local/lib/python3.8/site-packages/multiqc/multiqc.py", line 651, in run
    output = mod()
  File "/home/szszyszkowski/.local/lib/python3.8/site-packages/multiqc/modules/vep/vep.py", line 37, in __init__
    self.parse_vep_html(f)
  File "/home/szszyszkowski/.local/lib/python3.8/site-packages/multiqc/modules/vep/vep.py", line 131, in parse_vep_html
    existing = values[1].split("(")[0].replace(" ", "")
IndexError: list index out of range

The same error occurs when using *.vep.txt files as input. I suspect it is caused by missing values in the [General statistics] section:

...
[General statistics]
Lines of input read
Variants processed
Variants filtered out   0
Novel / existing variants   -
Overlapped genes    0
Overlapped transcripts  0
Overlapped regulatory features  -
...
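
To illustrate the suspected cause, here is a minimal sketch of how a "-" placeholder could trigger the IndexError. The function name `split_novel_existing` is hypothetical, and splitting the cell on "/" is an assumption about what the module does; only the `existing = ...` expression is taken from the traceback.

```python
def split_novel_existing(cell):
    """Split a 'Novel / existing variants' cell into two integer counts.

    Assumption: a populated cell looks like '120 (12.0%) / 880 (88.0%)'
    and is split on '/'. This is a sketch, not the actual vep.py code.
    """
    values = cell.split("/")
    novel = values[0].split("(")[0].replace(" ", "")
    # Same expression as the failing line in the traceback
    existing = values[1].split("(")[0].replace(" ", "")
    return int(novel), int(existing)


# A populated cell parses fine
print(split_novel_existing("120 (12.0%) / 880 (88.0%)"))

# A '-' placeholder contains no '/', so the list has one element
# and values[1] raises IndexError
try:
    split_novel_existing("-")
except IndexError as err:
    print("IndexError:", err)
```

With an empty VCF, VEP writes "-" instead of two counts, so any parser that indexes the second half without checking the length would fail this way.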
maxulysse commented 2 years ago

@ewels I'd be happy to have a look, sorry for the late reply

ewels commented 2 years ago

Hi all,

This should now be fixed in 1977df64c13823751853e3b8e3246b1059abf16d: table cells containing "-" are no longer reported and will be empty in the MultiQC table (the entire column will be missing if every sample has "-" there). The module should no longer crash.
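
For readers who want to see the behaviour described above in isolation, here is a rough sketch of tolerant parsing. It is not the code from the commit: `parse_general_stats` is a hypothetical helper, and tab-separated rows are an assumption about the *.vep.txt layout.

```python
def parse_general_stats(lines):
    """Return {label: value} for rows that carry a real value.

    Assumption: each row is 'label<TAB>value'. Rows with no value,
    or with the '-' placeholder, are skipped instead of raising.
    """
    stats = {}
    for line in lines:
        cells = [c.strip() for c in line.split("\t")]
        if len(cells) < 2 or cells[1] in ("", "-"):
            continue  # nothing to report for this row
        stats[cells[0]] = cells[1]
    return stats


rows = [
    "Lines of input read",
    "Variants filtered out\t0",
    "Novel / existing variants\t-",
    "Overlapped genes\t0",
]
print(parse_general_stats(rows))
```

Skipped rows simply never reach the table, which matches the description of empty cells and missing columns.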

Thanks for reporting and providing an example report 👍🏻

Phil