Open imtiyazhariyani opened 6 years ago
@imtiyazhariyani, let me check this out. It's not immediately apparent what's going on.
Fixed - there were a few issues with the cohort calling rule.
- cohort_calling:
    local:
        # Should be {$self->combine_dir}
        - indir: "{$self->{combine_dir}"
        - outdir: "{$self->{combine_dir}"
        # Should be {$self-> ... }
        - INPUT: "{self->combine_dir}/ALL_SAMPLES_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
Should be
- cohort_calling:
    local:
        - indir: "{$self->combine_dir}"
        - outdir: "{$self->combine_dir}"
        - INPUT: "{$self->indir}/ALL_SAMPLES_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
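Both mistakes (the extra opening brace and the missing `$`) are easy to miss by eye. As a rough sketch, not part of BioX-Workflow, a few lines of Python can flag double-quoted values that mention `self->` outside a well-formed `{$self->name}` reference. The function name `find_suspect_templates` and the shortened file name `ALL_SAMPLES.vcf` are made up for illustration, and the check assumes that simple `{$self->name}` references are the only form used in quoted values.

```python
import re

# Double-quoted YAML values, e.g. the "..." part of: - indir: "..."
QUOTED = re.compile(r'"([^"]*)"')
# A well-formed reference such as {$self->combine_dir}
WELL_FORMED = re.compile(r"\{\$self->\w+\}")


def find_suspect_templates(text):
    """Return (line_number, value) pairs for quoted values whose
    self-> references are not all of the form {$self->name}."""
    suspects = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line.strip().startswith("#"):
            continue  # skip YAML comment lines
        for value in QUOTED.findall(line):
            # Strip every well-formed reference; anything that still
            # mentions self-> is malformed in some way.
            leftover = WELL_FORMED.sub("", value)
            if "self->" in leftover:
                suspects.append((number, value))
    return suspects


# Hypothetical three-line sample: two broken values, one correct.
sample = """\
- indir: "{$self->{combine_dir}"
- INPUT: "{self->combine_dir}/ALL_SAMPLES.vcf"
- outdir: "{$self->combine_dir}"
"""

for number, value in find_suspect_templates(sample):
    print(f"line {number}: {value}")
```

Only double-quoted values are inspected, so the Perl inside `process:` blocks, which legitimately uses `$self->` without braces, is left alone.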
Here's the whole thing.
---
global:
    # Initial Directory Setup
    - indir: "data/analysis_imtiyaz"
    - outdir: "data/analysis_imtiyaz/cohort"
    # indir/outdir is a chained variable
    - root_in_dir: "data/analysis"
    - root_out_dir: "data/analysis/cohort"
    # Find Samples
    - sample_glob: "data/analysis_imtiyaz/Sample*/gatk/*_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
    - by_sample_outdir: '1'
    # Analysis Dirs
    - combine_dir: "data/analysis_imtiyaz"
    # Reference Data
    - bwa_mem_reference: "/scratch/gencore/160713_SN7001341_0131_AC8YL9ACXX/Unaligned/Project_Boissinot_lab/data/analysis/reference/leptopelis_transcriptrinity"
    - reference: "{$self->bwa_mem_reference}.fa"
    # HPC Directives
    - HPC:
        - account: 'ieh211'
        - partition: 'serial'
        - module: 'gencore gencore_dev gencore_variant_detection/1.0'
        - cpus_per_task: 1
        - commands_per_node: 1
rules:
    - stash_samples:
        local:
            - override_process: 1
            - create_outdir: 0
        process: |-
            {
            use File::Glob;
            use Cwd;
            my @glob = glob(cwd().'/'. $self->sample_glob);
            $self->stash->{sample_files} = \@glob;
            ($SILENTLY);
            }
    - combine_gvcf:
        local:
            - override_process: 1
            - indir: "{$self->combine_dir}"
            - outdir: "{$self->combine_dir}"
            - OUTPUT: "{$self->outdir}/ALL_SAMPLES_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
            - HPC:
                - deps: 'stash_samples'
                - walltime: '48:00:00'
                - mem: '40GB'
            - process_mustache: |
                gatk -Xmx80G -T CombineGVCFs \
                -R {{{reference}}} \
                {{#stash.sample_files}}
                --variant {{{.}}} \
                {{/stash.sample_files}}
                -o {{{outdir}}}/ALL_SAMPLES_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf
        process: |-
            {
            $OUT .= $self->render_mustache($self->process_mustache);
            }
    - cohort_calling:
        local:
            - indir: "{$self->combine_dir}"
            - outdir: "{$self->combine_dir}"
            - INPUT: "{$self->indir}/ALL_SAMPLES_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
            - OUTPUT: "{$self->outdir}/ALL_SAMPLES_cohort.haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
            - process_mustache: |
                gatk -T GenotypeGVCFs \
                -R {{{reference}}} \
                -stand_call_conf '30' \
                -o {{{outdir}}}/ALL_SAMPLES_cohort.haplotype.realigned.withrg.csorted.cleaned.aligned.vcf
            - HPC:
                - deps: 'combine_gvcf'
                - walltime: '48:00:00'
                - mem: '40GB'
        process: |-
            {
            $OUT .= $self->render_mustache($self->process_mustache);
            }
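For anyone unsure what the `{{#stash.sample_files}} ... {{/stash.sample_files}}` section in `combine_gvcf` does: a Mustache section over a list repeats its body once per item, so each stashed sample file becomes its own `--variant` line. The sketch below imitates that expansion by hand in plain Python. It is not the renderer BioX-Workflow uses, and the function name `render_combine_command` and the sample paths are invented for illustration.

```python
def render_combine_command(reference, sample_files, outdir):
    """Hand-written imitation of the combine_gvcf template:
    one '--variant <file>' line per entry in sample_files."""
    lines = [
        "gatk -Xmx80G -T CombineGVCFs \\",
        f"-R {reference} \\",
    ]
    # The Mustache section repeats its body for every list item.
    lines += [f"--variant {path} \\" for path in sample_files]
    lines.append(
        f"-o {outdir}/ALL_SAMPLES_haplotype.realigned.withrg"
        ".csorted.cleaned.aligned.vcf"
    )
    return "\n".join(lines)


# Hypothetical inputs; in the workflow the file list comes from the
# stash_samples rule and the other values from the global variables.
command = render_combine_command(
    reference="reference.fa",
    sample_files=[
        "Sample_A/gatk/A_haplotype.vcf",
        "Sample_B/gatk/B_haplotype.vcf",
    ],
    outdir="data/analysis_imtiyaz",
)
print(command)
```

If the glob in `stash_samples` matches nothing, the list is empty and the rendered command has no `--variant` lines at all, which is worth checking before submitting.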
Oops that was fairly straightforward! It works now, thank you @jerowe
I tried running the GATK Cohort workflow from BioSAILS without running it per chromosome. The workflow executes and produces an sbatch script, but the script contains no rules. The samples are found. However, the following message is displayed when I run the biox command:
"Path::Tiny paths require defined, positive-length parts at /scratch/gencore/.local/easybuild/software/gencore_dev/1.0/lib/perl5/site_perl/5.22.0/BioX/Workflow/Command/run/Rules/Directives/Types/Path.pm line 183."
Below is the yml script:
---
global:
# Initial Directory Setup
# it gets changed within a rule
rules:
stash_samples:
local:
combine_gvcf:
local:
cohort_calling:
local: