biosails / BioX-Workflow-Command

GATK Cohort Calling Stash Not Working #47

Open imtiyazhariyani opened 6 years ago

imtiyazhariyani commented 6 years ago

Tried running the GATK Cohort workflow from BioSAILS without running it per chromosome. The workflow executes and produces an sbatch script, but with no rules. Samples are found. However, the following line is displayed when I run the biox command:

"Path::Tiny paths require defined, positive-length parts at /scratch/gencore/.local/easybuild/software/gencore_dev/1.0/lib/perl5/site_perl/5.22.0/BioX/Workflow/Command/run/Rules/Directives/Types/Path.pm line 183."

Below is the yml script:

---
global:
    # Initial Directory Setup
    - indir:      "data/analysis_imtiyaz"
    - outdir:     "data/analysis_imtiyaz/cohort"
    # indir/outdir is a chained variable
    # it gets changed within a rule
    - root_in_dir: "data/analysis"
    - root_out_dir: "data/analysis/cohort"
    # Find Samples
    - sample_glob: "data/analysis_imtiyaz/Sample*/gatk/*_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
    - by_sample_outdir: '1'
    # Analysis Dirs
    - combine_dir: "data/analysis_imtiyaz"
    # Reference Data
    - bwa_mem_reference: "/scratch/gencore/160713_SN7001341_0131_AC8YL9ACXX/Unaligned/Project_Boissinot_lab/data/analysis/reference/leptopelis_transcriptrinity"
    - reference: "{$self->bwa_mem_reference}.fa"
    # HPC Directives
    - HPC:
       - account: 'ieh211'
       - partition: 'serial'
       - module:  'gencore gencore_dev gencore_variant_detection/1.0'
       - cpus_per_task: 1
       - commands_per_node: 1
rules:

jerowe commented 6 years ago

@imtiyazhariyani, let me check this out. It's not immediately apparent what's going on.

jerowe commented 6 years ago

Fixed. There were a few syntax issues in the variable references of the cohort calling rule (an unclosed brace and a missing `$`). The original rule was:

- cohort_calling:
    local:
             # Should be {$self->combine_dir}
            - indir: "{$self->{combine_dir}"
            - outdir: "{$self->{combine_dir}"
            # Should be {$self-> ... }
            - INPUT: "{self->combine_dir}/ALL_SAMPLES_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"

Should be

      - cohort_calling:
          local:
              - indir: "{$self->combine_dir}"
              - outdir: "{$self->combine_dir}"
              - INPUT: "{$self->indir}/ALL_SAMPLES_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
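
Typos like these are easy to miss by eye, so it can help to scan directive values before running biox. The following is only a minimal sketch in Python and is not part of BioX-Workflow-Command: the function name `check_value` and the two checks are assumptions made for illustration, and it does not try to parse the tool's full interpolation syntax. It flags unbalanced braces and a `self->` that is missing its leading `$`, which are the two mistakes seen above.

```python
import re

# Matches "self->" that is NOT preceded by "$" (e.g. "{self->combine_dir}").
MISSING_SIGIL = re.compile(r"(?<!\$)\bself->")


def check_value(value: str) -> list[str]:
    """Return a list of problems found in one directive value (hypothetical helper)."""
    problems = []
    # "{$self->{combine_dir}" opens two braces but closes only one.
    if value.count("{") != value.count("}"):
        problems.append("unbalanced braces")
    # "{self->combine_dir}" lacks the "$" in front of "self".
    if MISSING_SIGIL.search(value):
        problems.append("'self->' without leading '$'")
    return problems


if __name__ == "__main__":
    # Values taken from the broken and corrected cohort_calling rule.
    candidates = [
        "{$self->{combine_dir}",
        "{self->combine_dir}/ALL_SAMPLES_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf",
        "{$self->combine_dir}",
    ]
    for value in candidates:
        print(value, "->", check_value(value) or "ok")
```

This should only be applied to plain directive values such as `indir`, `outdir` and `INPUT`. Mustache templates and `process` blocks use braces differently and would need their own checks.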

jerowe commented 6 years ago

Here's the whole thing.

---
global:
    # Initial Directory Setup
    - indir: "data/analysis_imtiyaz"
    - outdir: "data/analysis_imtiyaz/cohort"
    # indir/outdir is a chained variable
    - root_in_dir: "data/analysis"
    - root_out_dir: "data/analysis/cohort"
    # Find Samples
    - sample_glob: "data/analysis_imtiyaz/Sample*/gatk/*_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
    - by_sample_outdir: '1'
    # Analysis Dirs
    - combine_dir: "data/analysis_imtiyaz"
    # Reference Data
    - bwa_mem_reference: "/scratch/gencore/160713_SN7001341_0131_AC8YL9ACXX/Unaligned/Project_Boissinot_lab/data/analysis/reference/leptopelis_transcriptrinity"
    - reference: "{$self->bwa_mem_reference}.fa"
    # HPC Directives
    - HPC:
      - account: 'ieh211'
      - partition: 'serial'
      - module:  'gencore gencore_dev gencore_variant_detection/1.0'
      - cpus_per_task: 1
      - commands_per_node: 1
rules:
      - stash_samples:
          local:
              - override_process: 1
              - create_outdir: 0
          process: |-
              {
                use File::Glob;
                use Cwd;
                my @glob = glob(cwd().'/'. $self->sample_glob);
                $self->stash->{sample_files} = \@glob;
                ($SILENTLY);
              }
      - combine_gvcf:
          local:
              - override_process: 1
              - indir: "{$self->combine_dir}"
              - outdir: "{$self->combine_dir}"
              - OUTPUT: "{$self->outdir}/ALL_SAMPLES_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
              - HPC:
                   - deps: 'stash_samples'
                   - walltime: '48:00:00'
                   - mem: '40GB'
              - process_mustache: |
                 gatk -Xmx80G -T CombineGVCFs \
                 -R {{{reference}}} \
                 {{#stash.sample_files}}
                   --variant {{{.}}} \
                 {{/stash.sample_files}}
                 -o {{{outdir}}}/ALL_SAMPLES_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf
          process: |-
              {
                $OUT .= $self->render_mustache($self->process_mustache);
              }
      - cohort_calling:
          local:
              - indir: "{$self->combine_dir}"
              - outdir: "{$self->combine_dir}"
              - INPUT: "{$self->indir}/ALL_SAMPLES_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
              - OUTPUT: "{$self->outdir}/ALL_SAMPLES_cohort.haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
              - process_mustache: |
                  gatk -T GenotypeGVCFs \
                    -R {{{reference}}} \
                    -stand_call_conf '30' \
                    -o {{{outdir}}}/ALL_SAMPLES_cohort.haplotype.realigned.withrg.csorted.cleaned.aligned.vcf
              - HPC:
                - deps: 'combine_gvcf'
                - walltime: '48:00:00'
                - mem: '40GB'
          process: |-
            {
              $OUT .= $self->render_mustache($self->process_mustache);
            }
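
To make the `combine_gvcf` step easier to picture: the `{{#stash.sample_files}}` section repeats its body once for each path that `stash_samples` collected, so the rendered command ends up with one `--variant` line per sample. The Python below is a hand-rolled approximation of that expansion, not a call into any Mustache library or into BioX itself. The function name and the example paths are hypothetical, and the exact whitespace of the real rendered script may differ.

```python
def render_combine_command(reference: str, sample_files: list[str], outdir: str) -> str:
    """Approximate the rendered CombineGVCFs command (illustrative helper only)."""
    lines = [
        "gatk -Xmx80G -T CombineGVCFs \\",
        f"-R {reference} \\",
    ]
    # The mustache section body is emitted once per stashed sample file.
    lines += [f"  --variant {path} \\" for path in sample_files]
    lines.append(
        f"-o {outdir}/ALL_SAMPLES_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
    )
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample paths standing in for whatever sample_glob matches.
    samples = [
        "data/analysis_imtiyaz/Sample_A/gatk/A_haplotype.vcf",
        "data/analysis_imtiyaz/Sample_B/gatk/B_haplotype.vcf",
    ]
    print(render_combine_command("reference.fa", samples, "data/analysis_imtiyaz"))
```

If the glob in `stash_samples` matches nothing, the section renders zero `--variant` lines, so it is worth checking the generated script for them.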

imtiyazhariyani commented 6 years ago

Oops that was fairly straightforward! It works now, thank you @jerowe