pallassgj / bpipe

Automatically exported from code.google.com/p/bpipe

struggling with input/output files #70

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Hi,
I am trying to construct a pipeline for genetic variant calling with Bpipe:

### START FILE

NOVOALIGN="/home/exome/bin/novoalign"
REF_HUMAN="/home/pedrosoi/repository/ref_genomes/GRCh37"
SAMTOOLS="/home/exome/bin/samtools"
JAVA="/home/exome/bin/java"
PICARD_FOLDER = "/apps/picardtools/1.74/"

novoalign = {
    from("fastq.gz"){
        transform("stats","bam"){
            exec = "echo $NOVOALIGN -c 7 -d $REF_HUMAN -f $input.gz $input.gz --Q2Off -F STDFQ -i 200 30 -o SAM -o SoftClip -k -a -g 65 -x 7 2> $output.stats | $SAMTOOLS view -bS - > $output.bam "
            forward output.bam
        }
    }
}

sort_bam = {
        filter("sorted"){
            exec= "echo $JAVA -Xmx20g -jar $PICARD_FOLDER/SortSam.jar INPUT=$input OUTPUT=$output SORT_ORDER=coordinate TMP_DIR=tmp1 VALIDATION_STRINGENCY=SILENT "
        }
}

dedup = {
    produce("novoalign.bam","metrics.out") {
    exec = "echo $JAVA -Xmx20g -jar $PICARD_FOLDER/MarkDuplicates.jar INPUT=$input.bam OUTPUT=$output VALIDATION_STRINGENCY=SILENT METRICS_FILE=$output TMP_DIR=tmp2 REMOVE_DUPLICATES=TRUE"
    }
}

Bpipe.run { novoalign + sort_bam + dedup}

######## END FILE

When I run it with:

$ bpipe test alignment.pipe GHCA0001_1_all.fastq.gz GHCA0001_2_all.fastq.gz 
====================================================================================================
|                              Starting Pipeline at 2013-02-11 21:31                               |
====================================================================================================

========================================= Stage novoalign ==========================================
Pipeline failed! (2) 

Expected output file GHCA0001_1_all.fastq.bam could not be found

======================================== Pipeline Finished =========================================
21:31:34 MSG:  Finished at Mon Feb 11 21:31:34 GMT 2013

---------

In general, I am struggling to get inputs and outputs to work properly. Perhaps if 
you help me with this example, I can start to work it out.
Thanks in advance.

What version of the product are you using? On what operating system?
bpipe-0.9.8_beta_3, centos

Original issue reported on code.google.com by intipedr...@gmail.com on 11 Feb 2013 at 9:33

GoogleCodeExporter commented 9 years ago
I have a similar issue. When the input is fastq.gz, Bpipe determines the output 
file name by stripping only the .gz and replacing it with .bam, .sai, etc. 
So the expected output file is .fastq.bam instead of just .bam, and the next 
stage cannot find the file.

Is there a way around this?

Original comment by sgrimes1...@gmail.com on 20 Mar 2013 at 4:31
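
The naming behavior described in this thread can be illustrated with a short Python sketch. This is not Bpipe's implementation; `replace_last_extension` and `replace_compound_extension` are hypothetical helper names, used only to show why replacing just the final extension turns `GHCA0001_1_all.fastq.gz` into `GHCA0001_1_all.fastq.bam`, and what the result would look like if the whole `.fastq.gz` suffix were treated as one extension.

```python
def replace_last_extension(name: str, new_ext: str) -> str:
    """Replace only the final extension, as the thread reports happening."""
    stem, dot, _old_ext = name.rpartition(".")
    base = stem if dot else name  # no dot at all: just append the new extension
    return base + "." + new_ext


def replace_compound_extension(name: str, compound_ext: str, new_ext: str) -> str:
    """Hypothetical alternative: treat e.g. 'fastq.gz' as a single extension."""
    suffix = "." + compound_ext
    if name.endswith(suffix):
        return name[: -len(suffix)] + "." + new_ext
    # Fall back to replacing only the last extension.
    return replace_last_extension(name, new_ext)


if __name__ == "__main__":
    fastq = "GHCA0001_1_all.fastq.gz"
    print(replace_last_extension(fastq, "bam"))                  # GHCA0001_1_all.fastq.bam
    print(replace_compound_extension(fastq, "fastq.gz", "bam"))  # GHCA0001_1_all.bam
```

The first result matches the file name in the error message above; the second is the name the commenters expected.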