epi2me-labs / wf-human-variation

Huge file size for structural variant report - crashes app #83

Closed LisaHagenau closed 1 year ago

LisaHagenau commented 1 year ago

Operating System

Other Linux (please specify below)

Other Linux

Ubuntu 20.04.5 LTS

Workflow Version

v1.7.2

Workflow Execution

EPI2ME Desktop application

EPI2ME Version

v5.1.1

CLI command run

No response

Workflow Execution - CLI Execution Profile

None

What happened?

Hello,

The workflow (SNP, SV and methylation calling) ran without trouble and completed successfully, but the app crashed when I tried to open the reports. This happened with both the newest version of the workflow and the previous one (1.7.1). I also updated the EPI2ME app, with the same result. I was able to open the reports in the browser from the output folder, but I noticed that the SV report was very large compared to the other report files (433.7 MB vs 3.1 MB for the SNP report).

I then ran the workflow for SV calling only, with a smaller BAM file as input. The app no longer crashes when I switch to the report tab, but the reports still don't load and the SV report is 4.4 GB.

Do you have an idea of what is causing this?

Relevant log output

N E X T F L O W  ~  version 23.04.2
Launching `/home/nanopore/data-hdd/epi2melabs/workflows/epi2me-labs/wf-human-variation/main.nf` [test-SV] DSL2 - revision: 39803fa308
||||||||||   _____ ____ ___ ____  __  __ _____      _       _
||||||||||  | ____|  _ \_ _|___ \|  \/  | ____|    | | __ _| |__  ___
|||||       |  _| | |_) | |  __) | |\/| |  _| _____| |/ _` | '_ \/ __|
|||||       | |___|  __/| | / __/| |  | | |__|_____| | (_| | |_) \__ \
||||||||||  |_____|_|  |___|_____|_|  |_|_____|    |_|\__,_|_.__/|___/
||||||||||  wf-human-variation v1.7.2
--------------------------------------------------------------------------------
Core Nextflow options
  runName         : test-SV
  containerEngine : docker
  container       : ontresearch/wf-human-variation:sha1b503961726c6e02b6b908297a9797db953b46a3
  launchDir       : /mnt/data-hdd/epi2melabs/instances/wf-human-variation_28755a11-a878-4fab-83f2-ff76340f9d81
  workDir         : /home/nanopore/data-hdd/epi2melabs/instances/wf-human-variation_28755a11-a878-4fab-83f2-ff76340f9d81/work
  projectDir      : /home/nanopore/data-hdd/epi2melabs/workflows/epi2me-labs/wf-human-variation
  userName        : nanopore
  profile         : standard
  configFiles     : /home/nanopore/data-hdd/epi2melabs/workflows/epi2me-labs/wf-human-variation/nextflow.config
Workflow Options
  sv              : true
Main options
  sample_name     : test-SV
  bam             : /home/nanopore/data-hdd/epi2melabs/instances/wf-human-variation_51c6805c-f1db-42b2-b938-d50f49501e21/output/PRO003_Jkt-Ph.pass.bam
  ref             : /home/nanopore/data-hdd/genomes/hg38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz
  basecaller_cfg  : dna_r10.4.1_e8.2_400bps_sup@v4.2.0
  bam_min_coverage: 4
  out_dir         : /home/nanopore/data-hdd/epi2melabs/instances/wf-human-variation_28755a11-a878-4fab-83f2-ff76340f9d81/output
!! Only displaying parameters that differ from the pipeline defaults !!
--------------------------------------------------------------------------------
If you use epi2me-labs/wf-human-variation for your analysis please cite:
* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x
--------------------------------------------------------------------------------
This is epi2me-labs/wf-human-variation v1.7.2.
--------------------------------------------------------------------------------
[c7/ad57a6] Submitted process > publish_artifact (1)
[94/50c973] Submitted process > decompress_ref
[3a/e2b717] Submitted process > sv:runReport:getParams
[1f/3d9531] Submitted process > getParams
[92/38bbb8] Submitted process > sv:runReport:getVersions
[5a/717372] Submitted process > getVersions
[55/4ecdfc] Submitted process > index_ref_fai
[d8/9525c9] Submitted process > cram_cache
[26/ebd68b] Submitted process > bam_ingress:check_for_alignment (1)
[2b/9713b5] Submitted process > getGenome (1)
[54/176098] Submitted process > getAllChromosomesBed (1)
[5a/ccfa11] Submitted process > publish_artifact (2)
[cb/b7bbcd] Submitted process > publish_artifact (3)
[2f/8a840f] Submitted process > publish_artifact (4)
[d0/623e04] Submitted process > configure_jbrowse (1)
[ba/4d4080] Submitted process > publish_artifact (5)
[24/463dbe] Submitted process > readStats (1)
[c3/deb1a5] Submitted process > mosdepth_input (1)
[e4/5c5162] Submitted process > publish_artifact (6)
[e9/5cc29b] Submitted process > publish_artifact (8)
[02/020476] Submitted process > publish_artifact (9)
[b9/a33925] Submitted process > publish_artifact (7)
[47/f98eae] Submitted process > publish_artifact (10)
[89/c62496] Submitted process > get_coverage (1)
[b5/1cea60] Submitted process > sv:variantCall:filterBam (1)
[dd/e41f58] Submitted process > makeAlignmentReport
[51/01811a] Submitted process > sv:variantCall:sniffles2 (1)
[7c/80efc5] Submitted process > publish_artifact (11)
[09/02b40d] Submitted process > sv:variantCall:filterCalls (1)
[19/6dc217] Submitted process > sv:variantCall:sortVCF (1)
[f0/cba47e] Submitted process > sv:variantCall:indexVCF (1)
[f9/6d21e2] Submitted process > sv:annotate_sv_vcf (1)
[f9/7a4765] Submitted process > sv:runReport:report (1)
[f3/6ed299] Submitted process > output_sv (4)
[cb/c39cff] Submitted process > output_sv (3)
[6a/8498c5] Submitted process > output_sv (5)
[da/cedc8c] Submitted process > output_sv (1)
[65/3a9252] Submitted process > output_sv (2)

Application activity log entry

No response

SamStudio8 commented 1 year ago

Hi @LisaHagenau, are you able to inspect the HTML report with a text viewer or terminal to try and diagnose this? I would expect to see a repetitive structure of some sort of data. My gut feeling is there may be a lot of annotations embedded in the report. It might be helpful to try --annotation false and see if the error remains.

LisaHagenau commented 1 year ago

@SamStudio8 I don't think it's the annotations. I have been running the same workflow as above without annotation and it has been stuck at the sv:runReport:report process for over an hour. I couldn't find any annotations in the HTML file, but there is at least one very long line. The maximum line length seems to scale with the file size.

Maximum line length for each report, with the file size in parentheses:

930131 Jkt-Ph.wf-human-snp-report.html (3.1 MB)
429090249 Jkt-Ph.wf-human-sv-report.html (433 MB)
4404170146 test-SV.wf-human-sv-report.html (4.4 GB)

The line starts with var opt_EZChart_dd9ed8406162403cb9c106e12581ab20 = {'title': {'text': 'Deletion size distribution'}, '

The deletion size distribution is actually not visible when I open the report in the browser. I think this is probably the culprit. Here is the 433 MB report, if you want to take a look: https://nextcloud.uni-greifswald.de/index.php/s/2KLibxN7JQB7Bdr
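
For anyone who wants to repeat the line-length check described above, here is a minimal sketch in Python. It assumes only that the report is a text file on disk; the function name `longest_line` and its parameters are made up for illustration and are not part of the workflow or the EPI2ME app. It reads the file in fixed-size chunks, so a single line of several gigabytes never has to be held in memory, and it returns the length of the longest line together with its first few characters.

```python
import sys


def longest_line(path, chunk_size=1 << 20, prefix_len=120):
    """Return (length_in_bytes, prefix) for the longest line in a file.

    The file is scanned in binary chunks, so memory use stays near
    chunk_size even when one line is gigabytes long.
    """
    best_len, best_start = 0, 0   # longest line seen so far
    cur_len, cur_start = 0, 0     # line currently being measured
    offset = 0                    # file offset of the current chunk
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            pos = 0
            while True:
                nl = chunk.find(b"\n", pos)
                if nl == -1:
                    # No more newlines: the line continues in the next chunk.
                    cur_len += len(chunk) - pos
                    break
                cur_len += nl - pos
                if cur_len > best_len:
                    best_len, best_start = cur_len, cur_start
                cur_len = 0
                cur_start = offset + nl + 1
                pos = nl + 1
            offset += len(chunk)
        # Account for a final line that has no trailing newline.
        if cur_len > best_len:
            best_len, best_start = cur_len, cur_start
        # Go back and read only the start of the longest line.
        fh.seek(best_start)
        prefix = fh.read(min(prefix_len, best_len))
    return best_len, prefix.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Usage (hypothetical script name): python longest_line.py report.html
    for name in sys.argv[1:]:
        length, prefix = longest_line(name)
        print(f"{length}\t{name}\t{prefix!r}")
```

Lengths are counted in bytes, so for a report with multi-byte characters the number can differ slightly from a character count.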

SamStudio8 commented 1 year ago

@LisaHagenau Thanks for this!

SamStudio8 commented 1 year ago

@LisaHagenau We've confirmed this as a defect. I've raised a ticket internally to try to have this fixed for our next release.

LisaHagenau commented 1 year ago

@SamStudio8 Great, thank you!

SamStudio8 commented 1 year ago

@LisaHagenau This should be fixed in v1.8.1. Please let us know how you get on.

LisaHagenau commented 1 year ago

@SamStudio8 It's working! Thank you for the quick fix.