Ensembl / ensembl-vep

The Ensembl Variant Effect Predictor predicts the functional effects of genomic variants
https://www.ensembl.org/vep
Apache License 2.0

out-of-memory error annotating one SV with `--custom` bigwig #1414

Closed dennishendriksen closed 1 year ago

dennishendriksen commented 1 year ago

Hi ensembl-vep team,

The VEP process (109.3 Docker image, 109 cache) gets killed due to an out-of-memory error:

Unknown option: no_plugins
Ignoring unsupported option 'no_plugins' found via ENV variable or INI file
Unknown option: no_update
Ignoring unsupported option 'no_update' found via ENV variable or INI file
Unknown option: pluginsdir
Ignoring unsupported option 'pluginsdir' found via ENV variable or INI file
Unknown option: no_htslib
Ignoring unsupported option 'no_htslib' found via ENV variable or INI file
WARNING: variant . on line 1 is too long to annotate: (76296407)

-------------------- EXCEPTION --------------------
MSG:
ERROR: Forked process(es) died: read-through of cross-process communication detected

STACK Bio::EnsEMBL::VEP::Runner::_forked_buffer_to_output /opt/vep/src/ensembl-vep/modules/Bio/EnsEMBL/VEP/Runner.pm:561
STACK Bio::EnsEMBL::VEP::Runner::next_output_line /opt/vep/src/ensembl-vep/modules/Bio/EnsEMBL/VEP/Runner.pm:366
STACK Bio::EnsEMBL::VEP::Runner::run /opt/vep/src/ensembl-vep/modules/Bio/EnsEMBL/VEP/Runner.pm:207
STACK toplevel /opt/vep/src/ensembl-vep/vep:46
Date (localtime)    = Thu May 11 17:07:16 2023
Ensembl API version = 109
---------------------------------------------------
slurmstepd: error: Detected 1 oom-kill event(s) in StepId=72039.batch. Some of your processes may have been killed by the cgroup out-of-memory handler.

It took me quite a while but I've managed to reproduce the issue using a single variant:

##fileformat=VCFv4.2
##contig=<ID=chr2,length=242193529>
##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Type of structural variant">
##INFO=<ID=SVLEN,Number=.,Type=Integer,Description="Difference in length between REF and ALT alleles">
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the variant described in this record">
##ALT=<ID=DUP:TANDEM,Description="Tandem Duplication">
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO
chr2    28983075        .       C       <DUP:TANDEM>    .       .       END=105279483;SVTYPE=DUP;SVLEN=76296408

The VEP command:

  local args=()
  args+=("--input_file" "one_record.vcf.gz")
  args+=("--format" "vcf")
  args+=("--output_file" "${vcfOutputPath}")
  args+=("--vcf")
  args+=("--compress_output" "bgzip")
  args+=("--no_stats")
  args+=("--fasta" "GRCh38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz")
  args+=("--offline")
  args+=("--cache")
  args+=("--dir_cache" "vep/cache")
  args+=("--species" "homo_sapiens")
  args+=("--assembly" "GRCh38")
  args+=("--refseq")
  args+=("--exclude_predicted")
  args+=("--use_given_ref")
  args+=("--symbol")
  args+=("--flag_pick_allele")
  args+=("--sift" "s")
  args+=("--polyphen" "s")
  args+=("--total_length")
  args+=("--shift_3prime" "1")
  args+=("--allele_number")
  args+=("--numbers")
  args+=("--dont_skip")
  args+=("--allow_non_variant")
  args+=("--buffer_size" "1")
  args+=("--fork" "1")
  args+=("--dir_plugins" "vep/plugins")
  args+=("--custom" "GRCh38/hg38.phyloP100way.bw,phyloP,bigwig,exact,0") 

  ${CMD_VEP} "${args[@]}"

The error occurs even when running VEP with 32GB of memory.

After disabling the --custom argument the process finishes successfully.

Annotating with:

args+=("--custom" "GRCh38/clinvar_20230115.vcf.gz,clinVar,vcf,exact,0,CLNSIG,CLNSIGINCL,CLNREVSTAT")

also causes the process to finish successfully.

VEP not being able to annotate variants that are too long seems like correct behavior to me, but eating all available memory and getting killed by the operating system due to an out-of-memory error does not. My suspicion is that the combination of a long SV + --custom + bigwig is what triggers this issue.

Download link for bigwig file.

Could you look into this issue? Currently I have multiple analyses failing with the same error.

Thank you in advance!

nuno-agostinho commented 1 year ago

Hi @dennishendriksen,

Thanks for reporting this issue and giving a reproducible example. We will take a look into this and update you when we have more information.

Best regards, Nuno

davmlaw commented 1 year ago

It's likely that the original poster could do with summary statistics (i.e. min/max/mean) for his SV instead of 76,296,408 floating-point numbers joined with "&", as proposed in #1023. If you allowed summary stats and calculated them as you go (rolling/moving average), this would likely fix this issue as well.
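
To illustrate the "calculate them as you go" idea, here is a minimal sketch in bash/awk. It is not VEP code: `summarise_scores` is a hypothetical helper, and the input is assumed to be one numeric score per line (for example, per-base values already extracted from a bigwig by some other tool). Only the count, sum, min and max are held in memory, so the footprint stays constant however long the variant is.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of VEP): single-pass summary of one
# numeric value per input line. Memory use does not grow with input size.
summarise_scores() {
  awk '
    NF == 0 { next }                      # ignore blank lines
    {
      v = $1 + 0                          # force numeric interpretation
      if (n == 0 || v < min) min = v
      if (n == 0 || v > max) max = v
      sum += v
      n++
    }
    END {
      if (n == 0) { print "n=0"; exit }
      printf "n=%d min=%.3f max=%.3f mean=%.3f\n", n, min, max, sum / n
    }'
}

# Example with made-up scores (not real phyloP values)
printf '%s\n' 0.5 -1.25 2.0 0.75 | summarise_scores
```

With the four made-up scores above this prints `n=4 min=-1.250 max=2.000 mean=0.500`. The same pattern applies to 76 million values, because nothing is accumulated into a joined string.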

nuno-agostinho commented 1 year ago

Hi @davmlaw, thanks for the suggestion. Based on #1023, we are thinking of showing some summary statistics by default for any variant whose length is 50 bp or more (i.e., structural variants).

@dennishendriksen, do you think that showing summary statistics as a default for large variants would be a good compromise or are you interested in getting all the individual values in this specific case?

nuno-agostinho commented 1 year ago

Hi all!

After re-reading, the issue is rather that we are annotating a variant that should be skipped based on --max_sv_size (it returns nothing because of no exact match in the custom annotation, but it is still trying to parse the annotation). I opened #1473 to avoid looking for the custom annotation of variants that should be skipped.
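
Until a release containing that change is installed, the same idea can be approximated outside VEP by dropping oversized records before annotation. The sketch below is an assumption-laden workaround, not the logic of #1473: `filter_oversized_svs` is a hypothetical helper, the span is computed simply as END minus POS from the INFO column, and the `10000000` default is a placeholder that should be checked against the documented default of `--max_sv_size`.

```shell
#!/usr/bin/env bash
# Hypothetical pre-filter (not part of VEP): reads an uncompressed,
# tab-delimited VCF on stdin, writes header lines and records whose span
# (END - POS) is at most the threshold to stdout, and reports the rest
# on stderr.
filter_oversized_svs() {
  local max_span="${1:-10000000}"   # placeholder; compare with --max_sv_size
  awk -F '\t' -v max="$max_span" '
    /^#/ { print; next }            # keep all header lines
    {
      end = $2                      # no END key: treat span as 0
      n = split($8, info, ";")
      for (i = 1; i <= n; i++)
        if (info[i] ~ /^END=/) end = substr(info[i], 5) + 0
      span = end - $2
      if (span > max)
        printf "skipping %s:%s (span %d)\n", $1, $2, span > "/dev/stderr"
      else
        print
    }'
}
```

A possible invocation, assuming `zcat` and `bgzip` are available, is `zcat one_record.vcf.gz | filter_oversized_svs | bgzip > filtered.vcf.gz`. For the record in this report the span is 105279483 - 28983075 = 76296408, so it would be reported on stderr and left out of the filtered file.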

In the meantime, I am still investigating whether there are further improvements we can make to annotating large variants with bigwig files (besides adding summary statistics in #1470).

Thanks, Nuno

nuno-agostinho commented 1 year ago

Hey @dennishendriksen,

I think this is now fixed with #1473, where we avoid annotation of skipped variants. This fix will be available in the next release of VEP.

I will close this issue for now, but do tell me if you are facing other issues. Thanks for your report!

Cheers, Nuno