bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

Questions: use of joint calling with octopus and gnomad genome #3683

Open Titos789 opened 1 year ago

Titos789 commented 1 year ago

Hi,

I have just used bcbio-nextgen for the first time. We would like to use octopus to call variants on 10 WGS samples and get a multi-sample VCF. In the bcbio-nextgen docs we saw that the jointcaller option supports HaplotypeCaller and FreeBayes. Is it possible to call the variants jointly with octopus?

My second question concerns gnomAD genome frequencies. Unless I'm mistaken, bcbio-nextgen uses gnomAD exome by default. What should I add to the config file to also get the gnomAD genome frequencies? I use VEP for annotation.

Thanks in advance,


details:
  - analysis: variant2
    genome_build: hg38
    description:
    metadata:
      batch:
      phenotype:
    algorithm:
      aligner: bwa
      mark_duplicates: true
      recalibrate: gatk
      realign: false
      variantcaller: gatk-haplotype
      jointcaller: gatk-haplotype-joint
      effects: vep
      effects_transcripts: all
      vcfanno: [gemini,eog,dbscsnv,dbnsfp]
      tools_on:
        - vep_splicesite_annotations
    files:

naumenko-sa commented 1 year ago

Hi @Titos789 !

A bit of terminology clarification: in bcbio, jointcaller is used to joint-genotype large cohorts (>30, >100 samples): https://bcbio-nextgen.readthedocs.io/en/latest/contents/germline_variants.html#workflow3-population-calling

In your case of 10 samples, you can use batch (cohort) calling, where all samples are fed to the caller together. The octopus documentation calls this joint calling: https://luntergroup.github.io/octopus/docs/guides/models/population

So you can just try running it without jointcaller, but with a batch specified. See this example config and the matching workflow description:
https://raw.githubusercontent.com/bcbio/bcbio-nextgen/master/config/examples/NA12878-trio-wgs-validate.yaml
https://bcbio-nextgen.readthedocs.io/en/latest/contents/germline_variants.html#workflow4-whole-genome-trio-50x-hg38
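For illustration, a minimal sketch of what batch calling with octopus could look like in the sample config, based on the config posted above. The sample names, the batch name, and the file names are hypothetical placeholders, and the algorithm section is trimmed to the essentials; this is not a validated configuration. The idea is that every sample carries the same `batch` value and that `jointcaller` is left out:

```yaml
# Sketch only: sample names, batch name, and file paths are hypothetical.
details:
  - analysis: variant2
    genome_build: hg38
    description: sample01            # hypothetical sample name
    metadata:
      batch: wgs_cohort              # same batch value on every sample
    algorithm:
      aligner: bwa
      mark_duplicates: true
      variantcaller: octopus         # no jointcaller entry
      effects: vep
    files: [sample01_R1.fastq.gz, sample01_R2.fastq.gz]
  - analysis: variant2
    genome_build: hg38
    description: sample02            # hypothetical sample name
    metadata:
      batch: wgs_cohort
    algorithm:
      aligner: bwa
      mark_duplicates: true
      variantcaller: octopus
      effects: vep
    files: [sample02_R1.fastq.gz, sample02_R2.fastq.gz]
```

The remaining samples would follow the same pattern with the same batch value, so that they are all passed to the caller together.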

Please report back on how Octopus works for you; it was updated in 1.2.9.

To use gnomAD AF you'd need to install it with: bcbio_nextgen.py upgrade -u skip --genomes hg38 --datatarget gnomad

https://bcbio-nextgen.readthedocs.io/en/latest/contents/installation.html#customizing-data-installation

SN