nf-cmgg / structural

A bioinformatics best-practice analysis pipeline for calling structural variants (SVs), copy number variants (CNVs) and repeat region expansions (RREs) from short DNA reads
https://nf-cmgg.github.io/structural/
MIT License

AnnotSV issue #68

Closed mvheetve closed 6 months ago

mvheetve commented 8 months ago

Description of the bug

I encountered an issue during annotation related to AnnotSV. From what I understand, it is a known bug on 64-bit machines, caused by Tcl's limit on the size of an allocated memory block. The workarounds proposed here require version changes of certain dependencies. If this is at the operating system level, maybe we need to drag the HPC crew into this.
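As a rough way to gauge how close an input is to that limit, the following sketch compares a file's size and its longest line against the 2147483647-byte figure quoted in the error message in the log output. This is only a heuristic under stated assumptions: the thread does not say which value AnnotSV is building when it hits the limit, and the helper names (`longest_line_bytes`, `size_report`) are hypothetical, not part of AnnotSV or the pipeline.

```python
import os

# Limit quoted in the error message: 2**31 - 1 bytes.
TCL_MAX_VALUE_BYTES = 2**31 - 1


def longest_line_bytes(path: str) -> int:
    """Return the length in bytes of the longest line in the file."""
    longest = 0
    with open(path, "rb") as handle:
        for line in handle:
            longest = max(longest, len(line))
    return longest


def size_report(path: str) -> dict:
    """Compare the file size and longest line against the Tcl limit.

    Assumption: file size is only a proxy; the value that overflows
    inside AnnotSV may be larger or smaller than the input file.
    """
    total = os.path.getsize(path)
    return {
        "file_bytes": total,
        "longest_line_bytes": longest_line_bytes(path),
        "file_exceeds_limit": total > TCL_MAX_VALUE_BYTES,
    }
```

Calling `size_report("D2117741.filter.vcf")` from the process work directory would give a first indication of whether the input itself is unusually large.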

All data, including the jobscript, input, nextflow.log, stderr and stdout, can be found at $VSC_DATA_VO/research/ICT/VAL/VAL_batch1. The work directory is $VSC_SCRATCH_VO/gvo00082/vsc43079/VAL_batch1/structural; let me know if you need access.

Regards, M

Command used and terminal output

Input:

nextflow \
    -log ${OUTDIR}/.nextflow.log \
    run CenterForMedicalGeneticsGhent/nf-cmgg-structural \
    -r proper-testing \
    -work-dir ${WORKDIR} \
    --input ${samplesheet} \
    --outdir ${OUTDIR} \
    -profile vsc_ugent,$SLURM_CLUSTERS \
    --genomes_base ${genomes_base} \
    -resume \
    -latest \
    -c /kyukon/data/gent/vo/000/gvo00082/research/ICT/VAL/vep_cache.config \
    --callers manta,delly,smoove,qdnaseq,wisecondorx,expansionhunter \
    --output_callers \
    --igenomes_ignore true \
    --annotate \
    --vep_version 110.0 \
    --vep_cache_version 110 \
    --annotsv_annotations "${genomes_base}/Hsapiens/GRCh38.p14/variation/AnnotSV/AnnotSV-3.3.4.tar.gz" \
    --vcfanno_resources "/kyukon/data/gent/vo/000/gvo00082/WGS/custom_files/[!dbVar]*"

Output: 

WARN: Killing running tasks (5)
Join mismatch for the following entries: 
- key=[id:D2125260, sample:D2125260, family:D2125260, sex:female, family_count:1, variant_type:sv] values= 
- key=[id:D2200653, sample:D2200653, family:D2200653, sex:female, family_count:1, variant_type:sv] values=

Join mismatch for the following entries: 
- key=[id:D2018788, sample:D2018788, family:D2018788, sex:male, family_count:1, variant_type:sv] values= 
- key=[id:D2117741, sample:D2117741, family:D2117741, sex:male, family_count:1, variant_type:sv] values=

executor >  slurm (22)
[b0/2776d4] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 1 of 1, cached: 1 ✔
[1a/e88d7f] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 1 of 1, cached: 1 ✔
[a0/7da88f] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 2 of 2, cached: 2 ✔
[09/845236] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 2 of 2, cached: 2 ✔
[50/84ba0a] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 14 of 14, cached:...
[f5/b00c08] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 1 of 1, cached: 1 ✔
[b3/7ecd9d] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 1 of 1, cached: 1 ✔
[2b/baa2ea] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 12 of 12, cached: 12
[78/5057a2] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 12 of 12, cached: 12
[b7/7c8cb0] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 14 of 14, cached:...
[06/301dc7] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 14 of 14, cached:...
[cb/a7abbd] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 14 of 14, cached:...
[a1/70aaff] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 14 of 14, cached:...
[56/b3429c] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 12 of 12, cached:...
[be/8b46d4] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 12 of 12, cached: 11
[17/206ee6] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 12 of 12, cached: 11
[ff/bef45a] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 12 of 12, cached: 11
[1a/cee952] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 14 of 14, cached:...
[f3/870a88] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 8 of 8, cached: 8 ✔
[4d/e7e8bd] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 6 of 6, cached: 6 ✔
[26/8f6e89] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 14 of 14, cached:...
[c3/46954f] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 14 of 14, cached:...
[f9/a09f10] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 14 of 14, cached:...
[fc/016b86] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 14 of 14, cached:...
[dc/98318e] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 14 of 14, cached:...
[cd/ee0417] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 14 of 14, cached:...
[2c/15e1e1] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 14 of 14, cached:...
[94/ae531c] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 14 of 14, cached:...
[3c/9c7b8f] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 14 of 14, cached:...
[ff/0bc745] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 14 of 14, cached:...
[f6/48b8b2] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 14 of 14, cached:...
[1b/28e2b4] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 26 of 26, cached:...
[87/8e941f] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 33 of 33, cached:...
[29/e3a9dc] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 23 of 23, cached: 23
[75/5924cd] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 25 of 25, cached: 24
[f0/68a580] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 25 of 25, cached: 24
[6f/2faf20] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 23 of 23, cached: 23
[cd/f82150] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 23 of 23, cached: 23
[93/4d3850] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 14 of 14, cached:...
[7a/d90382] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 14 of 14, cached:...
[64/8caadb] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 14 of 14, cached:...
[f7/81d6e7] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... [100%] 14 of 14, cached:...
[-        ] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... -
[-        ] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... -
[-        ] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... -
[-        ] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... -
[-        ] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... -
[-        ] process > CMGG_CMGGSTRUCTURAL:CMGGSTR... -
ERROR ~ Error executing process > 'CMGG_CMGGSTRUCTURAL:CMGGSTRUCTURAL:VCF_ANNOTATE_VEP_ANNOTSV_VCFANNO:ANNOTSV_ANNOTSV (D2117741)'

Caused by:
  Process `CMGG_CMGGSTRUCTURAL:CMGGSTRUCTURAL:VCF_ANNOTATE_VEP_ANNOTSV_VCFANNO:ANNOTSV_ANNOTSV (D2117741)` terminated with an error exit status (134)

Command executed:

  AnnotSV \
      -annotationsDir annotsv_annotations \
       \
       \
       \
       \
      -outputFile D2117741.annot.tsv \
      -SVinputFile D2117741.filter.vcf \
      -vcf 1 -SVminSize 20

  mv *_AnnotSV/* .

  cat <<-END_VERSIONS > versions.yml
  "CMGG_CMGGSTRUCTURAL:CMGGSTRUCTURAL:VCF_ANNOTATE_VEP_ANNOTSV_VCFANNO:ANNOTSV_ANNOTSV":
      annotsv: $(echo $(AnnotSV -help 2>&1 | head -n1 | sed 's/^AnnotSV //'))
  END_VERSIONS

Command exit status:
  134

Command output:
  AnnotSV 3.3.6

  Copyright (C) 2017-2023 GEOFFROY Veronique

  Please feel free to contact me for any suggestions or bug reports
  email: veronique.geoffroy@inserm.fr

  Tcl/Tk version: 8.6

  Application name used:
  /usr/local

  ...downloading the configuration data (December 01 2023 - 14:55)
    ...configuration data by default
    ...configuration data from /usr/local/etc/AnnotSV/configfile
    ...configuration data given in arguments
    ...checking all these configuration data

  ...VCF to BED (December 01 2023 - 14:55)
    ...WARNING: 197997 sample IDs with missing alleles in the GT field (./. or .|.) have been reported in the "Samples_ID" output field

Command error:
  INFO:    Environment variable SINGULARITYENV_TMPDIR is set, but APPTAINERENV_TMPDIR is preferred
  INFO:    Environment variable SINGULARITYENV_NXF_TASK_WORKDIR is set, but APPTAINERENV_NXF_TASK_WORKDIR is preferred
  INFO:    Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
  max size for a Tcl value (2147483647 bytes) exceeded
  .command.sh: line 10:    43 Aborted                 (core dumped) AnnotSV -annotationsDir annotsv_annotations -outputFile D2117741.annot.tsv -SVinputFile D2117741.filter.vcf -vcf 1 -SVminSize 20

Work dir:
  /kyukon/scratch/gent/vo/000/gvo00082/vsc43079/VAL_batch1/structural/work/cf/1b73bfe151eb7e49faf0911b9f9e71

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

 -- Check '/kyukon/data/gent/vo/000/gvo00082/research/ICT/VAL/VAL_batch1/structural/.nextflow.log' file for details
[b8/887295] NOTE: Process `CMGG_CMGGSTRUCTURAL:CMGGSTRUCTURAL:VCF_ANNOTATE_VEP_ANNOTSV_VCFANNO:ANNOTSV_ANNOTSV (D2018788)` terminated with an error exit status (143) -- Execution is retried (3)
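The two exit statuses in the log above can be decoded with the usual shell convention that a status above 128 means the process was terminated by signal number (status - 128). The sketch below assumes a POSIX system such as the Linux cluster used here; the function name `describe_exit_status` is hypothetical.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Translate a shell exit status into a signal name when it encodes one."""
    if status > 128:
        try:
            # Shell convention: 128 + signal number.
            return f"killed by {signal.Signals(status - 128).name}"
        except ValueError:
            return f"exit status {status} (no matching signal)"
    return f"exit status {status}"


for status in (134, 143):
    print(status, describe_exit_status(status))
```

On Linux, 134 corresponds to 128 + 6 (SIGABRT), which matches the "Aborted (core dumped)" line, and 143 corresponds to 128 + 15 (SIGTERM). The latter is consistent with, though not proof of, the D2018788 task being terminated when Nextflow killed the remaining running tasks.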


Relevant files

See `$VSC_DATA_VO/research/ICT/VAL/VAL_batch1`

System information

See jobscript

nvnieuwk commented 8 months ago

I'm guessing that we can fix this by creating a custom Docker container with those dependencies in it. I'll give that a go later today or tomorrow.

nvnieuwk commented 8 months ago

Should be fixed in #69