ssadedin / bpipe

Bpipe - a tool for running and managing bioinformatics pipelines
http://docs.bpipe.org/

Skip command start for "my command" due to probe mode #284

Open L-of-IOS opened 1 year ago

L-of-IOS commented 1 year ago

I can't find any information about "probe mode", and it seems to be ruining my pipeline. I hope someone can suggest a fix or tell me where I might be going wrong.

Here is the pipeline I'm trying to use:

cutadapt="~/.local/bin/cutadapt"

QCbycutadapt = {
    def parts = input.split("/")
    def sample = parts[parts.length - 1]
    def file1 = input+"/"+sample+"_1.clean.fq.gz"
    def file2 = input+"/"+sample+"_2.clean.fq.gz"

    produce(input+"/"+sample+"_1.clean.fq.qc.gz", input+"/"+sample+"_2.clean.fq.qc.gz") {
        exec """
            $cutadapt -q 20 -m 20 -o $output1 -p $output2 $file1 $file2
        """
    }
}

run { QCbycutadapt }

bpipe.PipelineContext [1] INFO |8:33:50 Skip check dependencies due to probe mode
bpipe.PipelineContext [1] INFO |8:33:50 Skip command start for ~/.local/bin/cutadapt -q 20 -m 20 -o K01_1.clean.fq.qc.gz -p K01_2.clean.fq.qc.gz 00.CleanData/K01/K01_1.clean.fq.gz 00.CleanData/K01/K01_2.clean.fq.gz due to probe mode

ssadedin commented 1 year ago

Hi @L-of-IOS - the "probe mode" is a purely internal part of how bpipe works. When it is given an "exec" statement, it first "pretend" executes it before really running it. This is needed because some information is resolved lazily and is only known at the last minute, when the command is about to run. The "pretend" execution resolves all the final settings so that they can be validated, etc.
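To make the idea concrete, here is a rough sketch in plain Groovy of a "resolve first, run second" pattern. It is only an illustration of the concept described above, not Bpipe's actual implementation: the `Runner` class, its `probeMode` flag, and the `resolved` list are all hypothetical names made up for this example, and the command is never really launched.

```groovy
// Hypothetical illustration only: none of these names come from Bpipe itself.
class Runner {
    boolean probeMode = false
    List<String> resolved = []      // every command text that has been resolved
    List<String> executed = []      // commands that would really have been run

    void exec(Closure<String> command) {
        String cmd = command()      // lazily resolve the final command text
        resolved << cmd
        if (probeMode) {
            // Probe pass: the command is resolved but not started
            println "Skip command start for ${cmd} due to probe mode"
            return
        }
        // Real pass: a real tool would launch the command here
        executed << cmd
        println "Running: ${cmd}"
    }
}

def runner = new Runner()
def stage = { runner.exec { "cutadapt -q 20 -m 20 -o out.fq.gz in.fq.gz" } }

runner.probeMode = true
stage()                             // first pass: resolve only
runner.probeMode = false
stage()                             // second pass: actually "run"
```

In a scheme like this, a "skip ... due to probe mode" message by itself would only indicate the first pass, not a failure.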

If you're having a problem with the stage, I'd suspect it's due to something else. I can see a few odd aspects, though it might just be how you've chosen to do things. For example,

produce(input+"/"+sample+"_1.clean.fq.qc.gz", input+"/"+sample+"_2.clean.fq.qc.gz")

is using the input file name to construct the output file name, and treating input as a directory. If you want to put things into a subdirectory, it would be more conventional to set the output.dir variable, which you can do based on the input file name if you want, e.g.:

output.dir = file(input).name
produce(sample+"_1.clean.fq.qc.gz",sample+"_2.clean.fq.qc.gz") {
...
}
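
Putting that suggestion together with the original stage, the whole pipeline might look something like the sketch below. It is untested and rests on assumptions taken from the post: that the pipeline input is a sample directory such as 00.CleanData/K01, and that the reads inside it follow the `<sample>_1.clean.fq.gz` / `<sample>_2.clean.fq.gz` naming. Deriving `sample` from `file(input).name` is an adaptation of the snippet above rather than something confirmed in this thread.

```groovy
cutadapt = "~/.local/bin/cutadapt"

QCbycutadapt = {
    // Assumption: input is a sample directory, e.g. 00.CleanData/K01,
    // so its last path component is the sample name (K01)
    def sample = file(input).name
    def file1 = input + "/" + sample + "_1.clean.fq.gz"
    def file2 = input + "/" + sample + "_2.clean.fq.gz"

    // Send this stage's outputs to a directory named after the sample
    output.dir = sample

    // Output names are given without a directory; output.dir supplies it
    produce(sample + "_1.clean.fq.qc.gz", sample + "_2.clean.fq.qc.gz") {
        exec """
            $cutadapt -q 20 -m 20 -o $output1 -p $output2 $file1 $file2
        """
    }
}

run { QCbycutadapt }
```

Assuming the script is saved as pipeline.groovy (a hypothetical file name), it would be started with something like `bpipe run pipeline.groovy 00.CleanData/K01`.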