bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

bcbio and heterogeneity #1229

Closed apastore closed 8 years ago

apastore commented 8 years ago

Sorry, I am curious to know which parameter must be set in the project.yaml in order to run PhyloWGS and BubbleTree?

thanks!

chapmanb commented 8 years ago

Alessandro; There is a hetcaller target which specifies these. However, it's currently undocumented as they're not completely stable and validated. PhyloWGS still needs work on the Battenberg caller as input. BubbleTree should work with CNVkit as an input:

svcaller: cnvkit
hetcaller: bubbletree

but we're not yet sure how well tuned the outputs are because of the lack of good validation sets. Sorry about all the caveats, but I hope this helps with the current status.
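For readers who want to check a configuration against the requirement described here, the following is a minimal sketch. It assumes only what this comment states: that `hetcaller: bubbletree` needs `cnvkit` among the `svcaller` entries. The helper names `as_list` and `check_hetcaller` are hypothetical and are not part of bcbio; the `algorithm` section is passed in as a plain Python dict.

```python
def as_list(value):
    """Normalize an option that may be a string, a list, None, or False."""
    if not value:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def check_hetcaller(algorithm):
    """Return a list of problems found in one sample's `algorithm` section.

    Assumption (taken from this thread only): BubbleTree uses CNVkit
    output, so `cnvkit` must be listed under `svcaller`.
    """
    problems = []
    hetcallers = as_list(algorithm.get("hetcaller"))
    svcallers = as_list(algorithm.get("svcaller"))
    if "bubbletree" in hetcallers and "cnvkit" not in svcallers:
        problems.append(
            "hetcaller 'bubbletree' is set but 'cnvkit' is not in svcaller"
        )
    return problems


# The two-line configuration from the comment, written as a dict.
print(check_hetcaller({"svcaller": "cnvkit", "hetcaller": "bubbletree"}))
# The same hetcaller without CNVkit among the structural variant callers.
print(check_hetcaller({"svcaller": ["lumpy", "manta"], "hetcaller": "bubbletree"}))
```

Accepting both a single string and a list for `svcaller` mirrors the two ways the option is written in this thread.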

apastore commented 8 years ago

Thanks a lot, I was just curious.

(Sent by email in reply to Brad Chapman's comment of Feb 17, 2016, 7:48 PM.)

choosehappy commented 6 years ago

I was wondering if any updates have been made on this?

I see that the code is still in place, but when I enabled the hetcaller: bubbletree option, no output was produced in the end (and there is no mention of "bubble" anywhere in the command log).

chapmanb commented 6 years ago

Thanks much for trying this out. bubbletree should still be supported if you have CNVkit calling in your processing as well. Would you be able to share your configuration where it gets skipped? What kind of inputs do you have: tumor-only or tumor/normal paired?

Practically, we're working on improving heterogeneity calling and recently added TitanCNA support, which we'll match up with the existing PhyloWGS support soon. These are for tumor/normal paired samples only, though, and the next step is to integrate PureCN to handle the tumor-only cases.

Thanks again for testing these.

choosehappy commented 6 years ago

These are tumor/normal paired samples

Below you can find the YAML file. As you can see, I've launched ALL the missiles :)

(internally we're doing some testing)

details:
- algorithm:
    aligner: bwa
    effects: vep
    ensemble:
      numpass: 2
    hetcaller: bubbletree
    hlacaller: optitype
    indelcaller: false
    kraken: minikraken
    mark_duplicates: true
    realign: true
    recalibrate: true
    svcaller:
    - cnvkit
    - lumpy
    - manta
    svprioritize: cancer/civic
    tools_off: gemini
    tools_on:
    - lumpy_usecnv
    - bnd-genotype
    - svplots
    - qualimap
    variantcaller:
    - vardict
    - mutect
    - freebayes
    - varscan
  analysis: variant2
  description: N1
  files:
  - /data/CNV.valid/N1.R1.fastq.gz
  - /data/CNV.valid/N1.R2.fastq.gz
  genome_build: hg38
  metadata:
    batch: b1
    phenotype: normal
- algorithm:
    aligner: bwa
    effects: vep
    ensemble:
      numpass: 2
    hetcaller: bubbletree
    hlacaller: optitype
    indelcaller: false
    kraken: minikraken
    mark_duplicates: true
    realign: true
    recalibrate: true
    svcaller:
    - cnvkit
    - lumpy
    - manta
    svprioritize: cancer/civic
    tools_off: gemini
    tools_on:
    - lumpy_usecnv
    - bnd-genotype
    - svplots
    - qualimap
    variantcaller:
    - vardict
    - mutect
    - freebayes
    - varscan
  analysis: variant2
  description: T1
  files:
  - /data/CNV.valid/T1.R1.fastq.gz
  - /data/CNV.valid/T1.R2.fastq.gz
  genome_build: hg38
  metadata:
    batch: b1
    phenotype: tumor
fc_name: P1
upload:
  dir: ../final

I'm not sure where it's supposed to be run, but the command log has no mention of bubbletree:

root@sib-pc25:/export/big/ajanowcz/CNV.valid/P1/final/2017-11-29_P1# cat bcbio-nextgen-commands.log | grep bub
root@sib-pc25:/export/big/ajanowcz/CNV.valid/P1/final/2017-11-29_P1#

Also, one of the outputs should be a PDF, but no relevant ones seem to have been produced:

root@sib-pc25:/export/big/ajanowcz/CNV.valid/P1# find . | grep pdf
./work/align/N1/hla/OptiType-HLA-A_B_C/2017_11_28_01_39_39/2017_11_28_01_39_39_coverage_plot.pdf
./work/align/T1/hla/OptiType-HLA-A_B_C/2017_11_28_01_49_27/2017_11_28_01_49_27_coverage_plot.pdf
./work/structural/T1/bins/T1-normalized-diagram.pdf
./work/structural/T1/bins/T1-normalized-scatter.pdf
./work/structural/T1/bins/T1-normalized-scatter_global.pdf

Is there something else I can show you to help?

chapmanb commented 6 years ago

Thanks for all the details. You're right that this should run the bubbletree analysis, and after reading the code and matching it to what you have, I'm not sure why it isn't. What version of bcbio are you using? If it's not the latest, would it be possible to test with the latest to see if it still skips running, or if that allows it to go? If it runs you should see a heterogeneity/T1/bubbletree directory in your work directory. Thanks much for the help debugging.
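The directory check described here can be scripted. This is a minimal sketch that assumes only the layout named in this comment (`heterogeneity/<sample>/bubbletree` under the work directory); the function names `bubbletree_dir` and `bubbletree_ran` are hypothetical helpers, not bcbio functions, and the `work` path is a placeholder.

```python
from pathlib import Path


def bubbletree_dir(work_dir, sample):
    """Build the path where BubbleTree output is expected for a sample.

    Assumption: the layout is <work_dir>/heterogeneity/<sample>/bubbletree,
    as described in this thread.
    """
    return Path(work_dir) / "heterogeneity" / sample / "bubbletree"


def bubbletree_ran(work_dir, sample):
    """Return True if the expected BubbleTree directory exists."""
    return bubbletree_dir(work_dir, sample).is_dir()


# Placeholder work directory and the tumor sample name from this thread.
print(bubbletree_dir("work", "T1"))
print(bubbletree_ran("work", "T1"))
```

A missing directory only shows that the step produced no output there; it does not say why the step was skipped.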

choosehappy commented 6 years ago

this was using a very recent version, the one you're helping to debug over in #2171.

is this version informative enough:

bcbio_nextgen.py --version
1.0.7a

In this case, I removed the svplots option so that the run would complete, and I can report that no bubbletree files or directories were created anywhere:

root@51853b033dc8:/data/CNV.valid/P1# find . | grep bubble
root@51853b033dc8:/data/CNV.valid/P1# 

nor was it in the command logs or the final log:

root@51853b033dc8:/data/CNV.valid/P1/final/2017-12-05_P1# cat bcbio-nextgen-commands.log | grep bub
root@51853b033dc8:/data/CNV.valid/P1/final/2017-12-05_P1#
root@51853b033dc8:/data/CNV.valid/P1/final/2017-12-05_P1# cat bcbio-nextgen.log | grep bub
root@51853b033dc8:/data/CNV.valid/P1/final/2017-12-05_P1#

any ideas?

chapmanb commented 6 years ago

Thanks much for following up and checking on the version. I managed to reproduce this and realized the issue was due to the lumpy_usecnv option in your configuration. Apologies for missing this earlier. We didn't do a great job of handling this since it shifts around when CNVkit gets called, and I pushed a fix to resolve that issue. If you update to the latest development version and re-run, it should hopefully cleanly kick off BubbleTree. Please let us know if you run into any other issues at all and thanks again.
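For anyone still on a version from before this fix, the combination identified here can be detected ahead of a run. The sketch below is based solely on the description in this thread: a `hetcaller` setting together with `lumpy_usecnv` under `tools_on`. The function name `may_skip_hetcaller` is hypothetical, and the thread does not state which release contains the fix, so no version check is attempted.

```python
def as_list(value):
    """Normalize an option that may be a string, a list, None, or False."""
    if not value:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def may_skip_hetcaller(algorithm):
    """Return True if the config matches the combination reported here.

    Assumption (from this thread only): with `lumpy_usecnv` in `tools_on`,
    affected versions moved the CNVkit step and never started the hetcaller.
    """
    has_hetcaller = bool(as_list(algorithm.get("hetcaller")))
    uses_cnv_for_lumpy = "lumpy_usecnv" in as_list(algorithm.get("tools_on"))
    return has_hetcaller and uses_cnv_for_lumpy


# Relevant subset of the tumor sample's algorithm section from this thread.
tumor_algorithm = {
    "hetcaller": "bubbletree",
    "svcaller": ["cnvkit", "lumpy", "manta"],
    "tools_on": ["lumpy_usecnv", "bnd-genotype", "svplots", "qualimap"],
}
print(may_skip_hetcaller(tumor_algorithm))
```

If the check reports a match, the options described in this thread are to update bcbio or to remove `lumpy_usecnv` from `tools_on`.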