biosails / BioX-Workflow-Command

Directories are not being created when running biox run #19

Closed nizardrou closed 7 years ago

nizardrou commented 7 years ago

I recently added jellyfish to the QC workflow, but when I run biox run, it does not create the directory structure (data/analysis/Sample_*).

The workflow is at /scratch/gencore/170321_SN7001341_0225_AC9V19ACXX/Unaligned/Project_Stephane_Boissinot/de_novo_genome_large_multi-libraries.yml

jerowe commented 7 years ago

You didn't specify an outdir in your local rule, so biox does not create it. I added the outdir, and I will add a warning for when outdir is not specified.

jerowe commented 7 years ago

Ok, they are still not being created. This is a really weird bug.

---
global:
  - indir: data/processed
  - outdir: data/analysis
  - analysis_dir: data/analysis
  - root: data/analysis
  - trimmomatic_dir: 'data/processed/{$sample}/trimmomatic'
  - trimmomatic: 'data/processed/{$sample}/trimmomatic'
  - raw_fastqc_dir: 'data/processed/{$sample}'
  - raw_fastqc: 'data/processed/{$sample}'
  - sample_rule: (Sample.*)$
  - by_sample_outdir: 1
  - find_by_dir: 1
  - wait: 0
  - READ1: '{$self->raw_fastqc_dir}/{$sample}_read1.fastq.gz'
  - READ2: '{$self->raw_fastqc_dir}/{$sample}_read2.fastq.gz'
  - TR1: '{$self->trimmomatic_dir}/{$sample}_read1_trimmomatic'
  - TR2: '{$self->trimmomatic_dir}/{$sample}_read2_trimmomatic'
  - jellyfish_dir: 'data/JANALYSIS/{$sample}/jellyfish'
  - HPC:
      - account: gencore
rules:
  - pre_assembly_jellyfishCount:
      local:
        - create_outdir: 1
        - outdir: '{$self->jellyfish_dir}'
        - INPUT:
            - '{$self->TR1}_1PE.fastq.gz'
            - '{$self->TR2}_2PE.fastq.gz'
        - OUTPUT: '{$self->jellyfish_dir}/{$sample}.jf'
        - HPC:
            - cpus_per_task: 24
            - walltime: 24:00:00
            - mem: 98GB
            - partition: serial
            - module: gencore gencore_dev anaconda/2-4.1.1
      process: |
        #TASK tags={$sample}
        mkdir -p {$self->outdir} && \
        source activate /scratch/nd48/software/my_analysis/ && \
        jellyfish count \
            -m 21 \
            -s 100M \
            -t 24 \
            -C \
            -o {$self->OUTPUT} \
            <(zcat {$self->INPUT->[0]}) <(zcat {$self->INPUT->[1]})
  - pre_assembly_jellyfishHisto:
      local:
        - outdir: '{$self->jellyfish_dir}'
        - INPUT: '{$self->jellyfish_dir}/{$sample}.jf'
        - OUTPUT: '{$self->jellyfish_dir}/{$sample}.histo'
        - HPC:
            - cpus_per_task: 24
            - walltime: 04:00:00
            - mem: 30GB
            - partition: serial
            - module: gencore gencore_dev anaconda/2-4.1.1
            - deps: pre_assembly_jellyfishCount
      process: |
        #TASK tags={$sample}
        source activate /scratch/nd48/software/my_analysis/ && \
        jellyfish histo -t 24 {$self->INPUT} > {$self->OUTPUT}

nizardrou commented 7 years ago

So....

If I use a regex in the sample_rule it works, but the --samples flag does not.

jerowe commented 7 years ago

I'm creating a quick test environment in 'test'.

[gencore@login-0-4 test]$ mkdir -p data/processed/Sample_163
[gencore@login-0-4 test]$ mkdir -p data/processed/Sample_01
[gencore@login-0-4 test]$ mkdir -p data/processed/Sample_160
[gencore@login-0-4 test]$ mkdir -p data/processed/Sample_161
[gencore@login-0-4 test]$ mkdir -p data/processed/Sample_162

  - sample_rule: (Sample.*)$
  - jellyfish_dir: 'data/JANALYSIS/{$sample}/jellyfish'

Running without --samples works as is:

[gencore@login-0-4 test]$ tree data
data
├── JANALYSIS
│   ├── Sample_01
│   │   └── jellyfish
│   ├── Sample_160
│   │   └── jellyfish
│   ├── Sample_161
│   │   └── jellyfish
│   ├── Sample_162
│   │   └── jellyfish
│   └── Sample_163
│       └── jellyfish

But you're right, when supplying --samples the directories are not created.
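
To confirm the symptom after a run, a minimal shell sketch like the one below can list which expected per-sample directories are missing. The function name `missing_outdirs` and its arguments are hypothetical and not part of biox; the path layout assumes the `jellyfish_dir` pattern `data/JANALYSIS/{$sample}/jellyfish` from the workflow above.

```shell
# Hypothetical helper, not part of biox: print every expected
# per-sample output directory that does not exist yet.
# Usage: missing_outdirs ROOT SUBDIR SAMPLE...
missing_outdirs() {
    mo_root=$1
    mo_subdir=$2
    shift 2
    for mo_sample in "$@"; do
        # Mirrors the assumed layout ROOT/<sample>/SUBDIR
        if [ ! -d "$mo_root/$mo_sample/$mo_subdir" ]; then
            printf '%s\n' "$mo_root/$mo_sample/$mo_subdir"
        fi
    done
}
```

For example, `missing_outdirs data/JANALYSIS jellyfish Sample_01 Sample_160` prints one line per missing directory and nothing when all of them exist.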

jerowe commented 7 years ago

I have identified the bug and added this workflow as a test case. It will be fixed in the next release.

jerowe commented 7 years ago

This is fixed, and the new version is on dalma. Closing this out.