bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

Gemini by sample not cohort #794

Closed. dwaggott closed this issue 9 years ago.

dwaggott commented 9 years ago

My gemini directory (joint calling, n=24) has VCFs and DBs for each sample. In past runs it contained only the multisample VCF and DB. Each of these per-sample DBs was also put into the final directory. Is that expected? The pipeline was restarted, so there could have been a failure somewhere.

For example:

-rw-r--r-- 1 dwaggott euan  21K Mar 18 15:14 1_17-freebayes-joint-multiallelic-decompose-effects.vcf.gz.tbi
-rw-r--r-- 1 dwaggott euan 190K Mar 18 15:14 1_17-freebayes-joint-multiallelic-decompose-effects.vcf.gz
-rw-r--r-- 1 dwaggott euan  53M Mar 18 15:15 1_17-freebayes-joint-nomultiallelic.vcf.gz
-rw-r--r-- 1 dwaggott euan 1.2M Mar 18 15:15 1_17-freebayes-joint-nomultiallelic.vcf.gz.tbi
-rw-r--r-- 1 dwaggott euan   98 Mar 18 15:15 1_17-freebayes-joint-nomultiallelic.ped
-rw-r--r-- 1 dwaggott euan  787 Mar 18 15:15 1_17-freebayes-joint-nomultiallelic.vcf.gz.gbi
-rw-r--r-- 1 dwaggott euan 2.0G Mar 18 15:17 1_18-freebayes-joint.db
lrwxrwxrwx 1 dwaggott euan   53 Mar 18 15:17 1_29-freebayes-joint.vcf.gz.tbi -> ../freebayes/1_29-effects-ploidyfix-filter.vcf.gz.tbi
lrwxrwxrwx 1 dwaggott euan   49 Mar 18 15:17 1_29-freebayes-joint.vcf.gz -> ../freebayes/1_29-effects-ploidyfix-filter.vcf.gz
-rw-r--r-- 1 dwaggott euan 1.1M Mar 18 15:18 1_29-freebayes-joint-biallelic.vcf.gz.tbi
-rw-r--r-- 1 dwaggott euan  49M Mar 18 15:18 1_29-freebayes-joint-biallelic.vcf.gz
-rw-r--r-- 1 dwaggott euan  17K Mar 18 15:19 1_29-freebayes-joint-multiallelic.vcf.gz.tbi
-rw-r--r-- 1 dwaggott euan 151K Mar 18 15:19 1_29-freebayes-joint-multiallelic.vcf.gz
-rw-r--r-- 1 dwaggott euan 186K Mar 18 15:19 1_29-freebayes-joint-multiallelic-decompose.vcf.gz
-rw-r--r-- 1 dwaggott euan  19K Mar 18 15:19 1_29-freebayes-joint-multiallelic-decompose.vcf.gz.tbi
-rw-r--r-- 1 dwaggott euan 2.0G Mar 18 15:19 1_14-freebayes-joint.db
lrwxrwxrwx 1 dwaggott euan   53 Mar 18 15:19 1_09-freebayes-joint.vcf.gz.tbi -> ../freebayes/1_09-effects-ploidyfix-filter.vcf.gz.tbi
lrwxrwxrwx 1 dwaggott euan   49 Mar 18 15:19 1_09-freebayes-joint.vcf.gz -> ../freebayes/1_09-effects-ploidyfix-filter.vcf.gz
-rw-r--r-- 1 dwaggott euan 2.2G Mar 18 15:19 1_11-freebayes-joint.db
lrwxrwxrwx 1 dwaggott euan   53 Mar 18 15:20 1_25-freebayes-joint.vcf.gz.tbi -> ../freebayes/1_25-effects-ploidyfix-filter.vcf.gz.tbi
lrwxrwxrwx 1 dwaggott euan   49 Mar 18 15:20 1_25-freebayes-joint.vcf.gz -> ../freebayes/1_25-effects-ploidyfix-filter.vcf.gz
-rw-r--r-- 1 dwaggott euan  18K Mar 18 15:20 1_29-freebayes-joint-multiallelic-decompose-effects.vcf.gz.tbi
-rw-r--r-- 1 dwaggott euan 164K Mar 18 15:20 1_29-freebayes-joint-multiallelic-decompose-effects.vcf.gz
-rw-r--r-- 1 dwaggott euan  56M Mar 18 15:20 1_09-freebayes-joint-biallelic.vcf.gz
-rw-r--r-- 1 dwaggott euan 1.2M Mar 18 15:20 1_09-freebayes-joint-biallelic.vcf.gz.tbi
-rw-r--r-- 1 dwaggott euan  47M Mar 18 15:20 1_29-freebayes-joint-nomultiallelic.vcf.gz
-rw-r--r-- 1 dwaggott euan 1.1M Mar 18 15:20 1_29-freebayes-joint-nomultiallelic.vcf.gz.tbi
-rw-r--r-- 1 dwaggott euan   98 Mar 18 15:20 1_29-freebayes-joint-nomultiallelic.ped
-rw-r--r-- 1 dwaggott euan 1.2M Mar 18 15:21 1_25-freebayes-joint-biallelic.vcf.gz.tbi
-rw-r--r-- 1 dwaggott euan  63M Mar 18 15:21 1_25-freebayes-joint-biallelic.vcf.gz
-rw-r--r-- 1 dwaggott euan  675 Mar 18 15:21 1_29-freebayes-joint-nomultiallelic.vcf.gz.gbi
-rw-r--r-- 1 dwaggott euan  19K Mar 18 15:21 1_09-freebayes-joint-multiallelic.vcf.gz.tbi
-rw-r--r-- 1 dwaggott euan 177K Mar 18 15:21 1_09-freebayes-joint-multiallelic.vcf.gz
-rw-r--r-- 1 dwaggott euan  21K Mar 18 15:21 1_09-freebayes-joint-multiallelic-decompose.vcf.gz.tbi
-rw-r--r-- 1 dwaggott euan 219K Mar 18 15:21 1_09-freebayes-joint-multiallelic-decompose.vcf.gz
-rw-r--r-- 1 dwaggott euan  20K Mar 18 15:21 1_25-freebayes-joint-multiallelic.vcf.gz.tbi
-rw-r--r-- 1 dwaggott euan 189K Mar 18 15:21 1_25-freebayes-joint-multiallelic.vcf.gz
-rw-r--r-- 1 dwaggott euan  23K Mar 18 15:21 1_25-freebayes-joint-multiallelic-decompose.vcf.gz.tbi
-rw-r--r-- 1 dwaggott euan 235K Mar 18 15:21 1_25-freebayes-joint-multiallelic-decompose.vcf.gz
-rw-r--r-- 1 dwaggott euan 1.8G Mar 18 15:21 1_32-freebayes-joint.db
-rw-r--r-- 1 dwaggott euan  21K Mar 18 15:22 1_09-freebayes-joint-multiallelic-decompose-effects.vcf.gz.tbi
-rw-r--r-- 1 dwaggott euan 192K Mar 18 15:22 1_09-freebayes-joint-multiallelic-decompose-effects.vcf.gz
-rw-r--r-- 1 dwaggott euan  23K Mar 18 15:22 1_25-freebayes-joint-multiallelic-decompose-effects.vcf.gz.tbi
-rw-r--r-- 1 dwaggott euan 207K Mar 18 15:22 1_25-freebayes-joint-multiallelic-decompose-effects.vcf.gz
-rw-r--r-- 1 dwaggott euan  54M Mar 18 15:23 1_09-freebayes-joint-nomultiallelic.vcf.gz
-rw-r--r-- 1 dwaggott euan 1.2M Mar 18 15:23 1_09-freebayes-joint-nomultiallelic.vcf.gz.tbi
-rw-r--r-- 1 dwaggott euan   98 Mar 18 15:23 1_09-freebayes-joint-nomultiallelic.ped
-rw-r--r-- 1 dwaggott euan  801 Mar 18 15:23 1_09-freebayes-joint-nomultiallelic.vcf.gz.gbi
-rw-r--r-- 1 dwaggott euan  60M Mar 18 15:23 1_25-freebayes-joint-nomultiallelic.vcf.gz
-rw-r--r-- 1 dwaggott euan 1.2M Mar 18 15:23 1_25-freebayes-joint-nomultiallelic.vcf.gz.tbi
-rw-r--r-- 1 dwaggott euan   98 Mar 18 15:23 1_25-freebayes-joint-nomultiallelic.ped
-rw-r--r-- 1 dwaggott euan  885 Mar 18 15:23 1_25-freebayes-joint-nomultiallelic.vcf.gz.gbi
-rw-r--r-- 1 dwaggott euan 1.8G Mar 18 15:32 2_01-freebayes-joint.db
-rw-r--r-- 1 dwaggott euan 1.8G Mar 18 15:34 1_17-freebayes-joint.db
-rw-r--r-- 1 dwaggott euan 3.6G Mar 18 15:37 2_03-freebayes-joint.db
-rw-r--r-- 1 dwaggott euan 1.6G Mar 18 15:38 1_29-freebayes-joint.db

chapmanb commented 9 years ago

Daryl; Sorry about the issue. I tried to reproduce this with the test case (./run_tests.sh joint) but couldn't. Is it possible you have joint calling specified but don't have the samples in a shared batch within the metadata? I'll try to think more on possible issues but that's the first that comes to mind. Hope this explains it.

dwaggott commented 9 years ago

Ugh, I sure am good at doing everything wrong.

I put batch in the sample.csv file but forgot to add it to the template config yaml. It needs to be in both to end up in the final sample yaml, correct?

dwaggott commented 9 years ago

Where does batch come into play for the joint pipeline? If it didn't make it into the final YAML, could I restart the pipeline at the GEMINI step?

chapmanb commented 9 years ago

Daryl; Putting the batch in the sample.csv should be all you need to do, and it'll end up in the final sample YAML. I couldn't reproduce an issue with it being left out running the unit test (./run_tests.sh template) but please let me know if I'm missing anything.
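
As a quick sanity check before generating the sample YAML, something like the following could confirm that every sample in the CSV carries a batch value. This is only a sketch: the column names (`samplename`, `description`, `batch`) are assumed to follow the usual template CSV layout, and the file names and the `cohort1` batch name are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def group_samples_by_batch(csv_text):
    """Map each batch name to the sample descriptions assigned to it.

    Samples with an empty or missing batch value are grouped under None.
    """
    batches = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Treat a missing column and an empty cell the same way.
        batch = (row.get("batch") or "").strip() or None
        batches[batch].append(row["description"])
    return dict(batches)


# Hypothetical CSV content; the third sample has no batch assigned.
example = """samplename,description,batch
1_09.bam,1_09,cohort1
1_25.bam,1_25,cohort1
1_29.bam,1_29,
"""

for batch, samples in group_samples_by_batch(example).items():
    label = batch if batch is not None else "(no batch)"
    print(f"{label}: {', '.join(samples)}")
```

Any sample listed under "(no batch)", or a cohort split across several batch names, would be a likely reason for per-sample rather than cohort-level GEMINI output.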

Regarding re-running, you'll need to re-do the joint and GEMINI analysis, so remove both of those directories and checkpoints/multicores2.done and you should be able to re-run. Hope this fixes it for you.
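
The cleanup described above could be scripted roughly as follows. This is a sketch under assumptions, not part of bcbio: the function name is hypothetical, the directory names `joint` and `gemini` are assumed to sit directly under the work directory, and the function defaults to a dry run so nothing is deleted until the listed paths have been checked.

```python
import shutil
from pathlib import Path


def reset_joint_and_gemini(work_dir, dry_run=True):
    """Remove joint/GEMINI outputs and the checkpoint so they get regenerated.

    Returns the list of paths that were (or, in a dry run, would be) removed.
    """
    work = Path(work_dir)
    targets = [
        work / "joint",   # assumed directory name for joint calling output
        work / "gemini",  # assumed directory name for GEMINI output
        work / "checkpoints" / "multicores2.done",
    ]
    removed = []
    for path in targets:
        if not path.exists():
            continue
        removed.append(path)
        if dry_run:
            continue
        # The checkpoint may be a file or a directory; handle both.
        if path.is_dir():
            shutil.rmtree(path)
        else:
            path.unlink()
    return removed
```

Calling `reset_joint_and_gemini("/path/to/work")` first lists what would go; passing `dry_run=False` performs the removal, after which the pipeline can be restarted with the corrected sample YAML.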