Closed by mjafin 10 years ago
Can you check whether they're in the per-sample directories? I found them there.
Miika; I'm not sure. The unit tests seem to do the right thing, putting them in the sample directories as Luca mentions:
├── c-tumor
│   ├── c-tumor-freebayes.vcf.gz
│   ├── c-tumor-freebayes.vcf.gz.tbi
│   ├── c-tumor-mutect.vcf.gz
│   ├── c-tumor-mutect.vcf.gz.tbi
│   ├── c-tumor-varscan.vcf.gz
│   ├── c-tumor-varscan.vcf.gz.tbi
│   └── qc
└── c-tumor2
    ├── c-tumor2-freebayes.vcf.gz
    ├── c-tumor2-freebayes.vcf.gz.tbi
    ├── c-tumor2-mutect.vcf.gz
    ├── c-tumor2-mutect.vcf.gz.tbi
    ├── c-tumor2-varscan.vcf.gz
    ├── c-tumor2-varscan.vcf.gz.tbi
    └── qc
Maybe posting your sample YAML file will help with identifying the differences with the test data so I can reproduce and fix. Thanks much.
Sorry about the slight delay in answering this, here's the config file:
details:
- algorithm:
    aligner: bwa
    background: /ngs/reference_data/genomes/Hsapiens/hg19/variation/refseq_exome_10bp_hg19_300_1kg_normal_panel.hg19.vcf
    coverage_depth: high
    coverage_interval: exome
    mark_duplicates: false
    platform: illumina
    quality_format: Standard
    realign: gatk
    recalibrate: false
    svcaller:
    - cn.mops
    variant_regions: /ngs/public_data/ERP002442/ERP002442-targeted_nonoverlap_hg19.bed
    variantcaller:
    - mutect
    - freebayes
  analysis: variant2
  description: 10-497-N
  files:
  - /ngs/public_data/ERP002442/ERR256785_1.fastq.gz
  - /ngs/public_data/ERP002442/ERR256785_2.fastq.gz
  genome_build: hg19
  metadata:
    batch: 10-497-
    phenotype: normal
- algorithm:
    aligner: bwa
    background: /ngs/reference_data/genomes/Hsapiens/hg19/variation/refseq_exome_10bp_hg19_300_1kg_normal_panel.hg19.vcf
    coverage_depth: high
    coverage_interval: exome
    mark_duplicates: false
    platform: illumina
    quality_format: Standard
    realign: gatk
    recalibrate: false
    svcaller:
    - cn.mops
    variant_regions: /ngs/public_data/ERP002442/ERP002442-targeted_nonoverlap_hg19.bed
    variantcaller:
    - mutect
    - freebayes
  analysis: variant2
  description: 10-497-T
  files:
  - /ngs/public_data/ERP002442/ERR256786_1.fastq.gz
  - /ngs/public_data/ERP002442/ERR256786_2.fastq.gz
  genome_build: hg19
  metadata:
    batch: 10-497-
    phenotype: tumor
fc_date: '2014-02-18'
fc_name: tumor-paired
upload:
  dir: ../final
Here's the output folder structure:
.
├── 10-497-N
│   ├── 10-497-N-ready.bam
│   ├── 10-497-N-ready.bam.bai
│   └── qc
│       ├── bamtools
│       │   ├── bamtools_stats.txt
│       │   └── tx
│       └── fastqc
│           ├── fastqc_data.txt
│           ├── fastqc_report.html
│           ├── Icons
│           │   ├── error.png
│           │   ├── fastqc_icon.png
│           │   ├── tick.png
│           │   └── warning.png
│           ├── Images
│           │   ├── duplication_levels.png
│           │   ├── kmer_profiles.png
│           │   ├── per_base_gc_content.png
│           │   ├── per_base_n_content.png
│           │   ├── per_base_quality.png
│           │   ├── per_base_sequence_content.png
│           │   ├── per_sequence_gc_content.png
│           │   ├── per_sequence_quality.png
│           │   └── sequence_length_distribution.png
│           └── summary.txt
├── 10-497-T
│   ├── 10-497-T-freebayes.vcf.gz
│   ├── 10-497-T-freebayes.vcf.gz.tbi
│   ├── 10-497-T-ready.bam
│   ├── 10-497-T-ready.bam.bai
│   └── qc
│       ├── bamtools
│       │   ├── bamtools_stats.txt
│       │   └── tx
│       └── fastqc
│           ├── fastqc_data.txt
│           ├── fastqc_report.html
│           ├── Icons
│           │   ├── error.png
│           │   ├── fastqc_icon.png
│           │   ├── tick.png
│           │   └── warning.png
│           ├── Images
│           │   ├── duplication_levels.png
│           │   ├── kmer_profiles.png
│           │   ├── per_base_gc_content.png
│           │   ├── per_base_n_content.png
│           │   ├── per_base_quality.png
│           │   ├── per_base_sequence_content.png
│           │   ├── per_sequence_gc_content.png
│           │   ├── per_sequence_quality.png
│           │   └── sequence_length_distribution.png
│           └── summary.txt
└── 2014-02-18_tumor-paired
    ├── programs.txt
    └── project-summary.yaml
Ah, actually, it looks like something must've gone wrong with mutect+SID variant calling, as the mutect/ folder doesn't have the VCF files in there.
I'll rerun the data.
Well, I started a run from scratch and here's what it does before finishing (no mention of mutect):
2014-03-10 15:57:33.475 [IPClusterStop] Stopping cluster [pid=28740] with [signal=2]
[2014-03-10 15:57] ukapdlnx115: Timing: finished
[2014-03-10 15:57] ukapdlnx115: Storing directory in local filesystem: /scratch/ukapd/klrl262/ERP002442/tumor-paired/final/10-497-T/qc
[2014-03-10 15:57] ukapdlnx115: Storing in local filesystem: /scratch/ukapd/klrl262/ERP002442/tumor-paired/final/10-497-T/10-497-T-ready.bam
[2014-03-10 15:57] ukapdlnx115: Storing in local filesystem: /scratch/ukapd/klrl262/ERP002442/tumor-paired/final/10-497-T/10-497-T-ready.bam.bai
[2014-03-10 15:57] ukapdlnx115: Storing in local filesystem: /scratch/ukapd/klrl262/ERP002442/tumor-paired/final/10-497-T/10-497-T-freebayes.vcf.gz
[2014-03-10 15:57] ukapdlnx115: Storing in local filesystem: /scratch/ukapd/klrl262/ERP002442/tumor-paired/final/10-497-T/10-497-T-freebayes.vcf.gz.tbi
[2014-03-10 15:57] ukapdlnx115: Storing in local filesystem: /scratch/ukapd/klrl262/ERP002442/tumor-paired/final/2014-02-18_tumor-paired/programs.txt
[2014-03-10 15:57] ukapdlnx115: Storing in local filesystem: /scratch/ukapd/klrl262/ERP002442/tumor-paired/final/2014-02-18_tumor-paired/project-summary.yaml
[2014-03-10 15:57] ukapdlnx115: Storing directory in local filesystem: /scratch/ukapd/klrl262/ERP002442/tumor-paired/final/10-497-N/qc
[2014-03-10 15:57] ukapdlnx115: Storing in local filesystem: /scratch/ukapd/klrl262/ERP002442/tumor-paired/final/10-497-N/10-497-N-ready.bam
[2014-03-10 15:57] ukapdlnx115: Storing in local filesystem: /scratch/ukapd/klrl262/ERP002442/tumor-paired/final/10-497-N/10-497-N-ready.bam.bai
These guys are found in the mutect work folder:
[klrl262@ukapdlnx115: /scratch/ukapd/klrl262/ERP002442/tumor-paired/work ]$ l mutect/
total 1672
drwxr-xr-x 96 klrl262 modeller 8192 Mar 10 15:40 .
drwxr-xr-x 16 klrl262 modeller 4096 Mar 10 15:57 ..
-rw-r--r-- 1 klrl262 modeller 217231 Mar 10 15:40 2_2014-02-18_tumor-paired-sort-variants-effects.vcf.gz
-rw-r--r-- 1 klrl262 modeller 16311 Mar 10 15:40 2_2014-02-18_tumor-paired-sort-variants-effects.vcf.gz.tbi
-rw-r--r-- 1 klrl262 modeller 35554 Mar 10 15:38 2_2014-02-18_tumor-paired-sort-variants.vcf-files.txt
-rw-r--r-- 1 klrl262 modeller 130266 Mar 10 15:38 2_2014-02-18_tumor-paired-sort-variants.vcf.gz
-rw-r--r-- 1 klrl262 modeller 15933 Mar 10 15:38 2_2014-02-18_tumor-paired-sort-variants.vcf.gz.tbi
These are from the FreeBayes work folder:
-rw-r--r-- 1 klrl262 modeller 457679 Mar 10 15:40 2_2014-02-18_tumor-paired-sort-variants-filter-effects.vcf.gz
-rw-r--r-- 1 klrl262 modeller 14328 Mar 10 15:40 2_2014-02-18_tumor-paired-sort-variants-filter-effects.vcf.gz.tbi
-rw-r--r-- 1 klrl262 modeller 393424 Mar 10 15:40 2_2014-02-18_tumor-paired-sort-variants-filter.vcf.gz
-rw-r--r-- 1 klrl262 modeller 36304 Mar 10 15:38 2_2014-02-18_tumor-paired-sort-variants.vcf-files.txt
-rw-r--r-- 1 klrl262 modeller 1161712 Mar 10 15:38 2_2014-02-18_tumor-paired-sort-variants.vcf.gz
-rw-r--r-- 1 klrl262 modeller 18804 Mar 10 15:38 2_2014-02-18_tumor-paired-sort-variants.vcf.gz.tbi
Miika; Ugh, this one is driving me crazy because I can't reproduce at all with a small test set. Could you try running the multiple caller test in the test suite and see if it behaves correctly for you:
./run_tests.sh cancermulti
tree test_automated_output/upload
I'm not sure why your example is acting differently and just trying to isolate if it's some kind of system problem or something else. Sorry about the issue and hope this sheds some light.
Thanks Brad, I'll run that first thing tomorrow. I'm locked out of my account atm so can't work from home (the wife is delighted).
I was running this on the two ERP002442 samples
The test runs fine. So weird. I'll look into this further by adding some debugging markers in the code that generates the final folder.
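Before instrumenting the upload code, a quick check of the final folder against the configured callers can narrow things down. The following is a minimal sketch, not part of bcbio: the function name `missing_caller_vcfs` is hypothetical, and it assumes the `<sample>/<sample>-<caller>.vcf.gz` naming seen in the directory listings in this thread.

```python
from pathlib import Path


def missing_caller_vcfs(final_dir, sample, callers):
    """List expected per-caller VCF files that are absent from final_dir/sample.

    Hypothetical helper: assumes files are named <sample>-<caller>.vcf.gz
    with a matching .tbi index, as in the listings in this thread.
    """
    sample_dir = Path(final_dir) / sample
    missing = []
    for caller in callers:
        for suffix in (".vcf.gz", ".vcf.gz.tbi"):
            expected = sample_dir / f"{sample}-{caller}{suffix}"
            if not expected.exists():
                missing.append(expected.name)
    return missing
```

Calling `missing_caller_vcfs("../final", "10-497-T", ["mutect", "freebayes"])` on the run above would be expected to report the two mutect files, since only the FreeBayes output was copied.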
Miika; Strange, I also can't reproduce with the cancer test dataset, using https://bcbio-nextgen.readthedocs.org/en/latest/contents/testing.html#cancer-tumor-normal with the only change adjusting variantcaller: [freebayes, mutect]. On a fresh run, the output directory looks like:
$ tree -L 2 ../final
../final
├── 2014-01-06_cancer
│   ├── batch1-freebayes.db
│   ├── batch1-mutect.db
│   ├── programs.txt
│   └── project-summary.yaml
├── ERR256785
│   ├── ERR256785-ready.bam
│   ├── ERR256785-ready.bam.bai
│   └── qc
└── ERR256786
    ├── ERR256786-freebayes.vcf.gz
    ├── ERR256786-freebayes.vcf.gz.tbi
    ├── ERR256786-mutect.vcf.gz
    ├── ERR256786-mutect.vcf.gz.tbi
    ├── ERR256786-ready.bam
    ├── ERR256786-ready.bam.bai
    └── qc
Sorry about the issues and all the back and forth. There must be something I'm missing to reproduce but I can't see it right now.
OK, so I did some further testing. If I specify mutect, freebayes and cn.mops, I only get the freebayes VCF in the final folder. However, if I drop cn.mops, both mutect and freebayes get handled correctly. I don't plan on using cn.mops at this point in time, but something must be going wrong with it.
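To repeat this comparison without hand-editing the sample YAML, the svcaller entries can be stripped from an already-parsed config. This is only a sketch under assumptions: `without_svcaller` is a hypothetical name, and the config is assumed to be loaded into plain dicts and lists (for example with a YAML library) in the `details` / `algorithm` layout posted earlier in this thread.

```python
import copy


def without_svcaller(config):
    """Return a copy of a parsed sample config with svcaller removed.

    Hypothetical helper: expects the layout from this thread, i.e. a
    top-level "details" list whose items each hold an "algorithm" dict.
    The input config is left unmodified.
    """
    trimmed = copy.deepcopy(config)
    for sample in trimmed.get("details", []):
        # pop with a default so samples without svcaller are left alone
        sample.get("algorithm", {}).pop("svcaller", None)
    return trimmed
```

Writing the trimmed config back out and rerunning in a fresh work directory gives the "without cn.mops" case, while the original file stays available for the failing case.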
Miika; To add a final point of confusion to this thread, I can't replicate this even if I have cn.mops added as well. So now I'm officially confused. Since cn.mops is still experimental and needs lots of validation I'll leave this for now and we can return to it if we see the issue in the future. Sorry again about the issues and my inability to reproduce.
Hi Brad, I combined FreeBayes and MuTect in paired variant calling, and the run was fine but only the FreeBayes vcf ended up being copied into the final folder. Any ideas what might be wrong?