thasso / pyjip

JIP Pipeline System
http://pyjip.readthedocs.org

BWA pipeline example code is broken? #58

Closed d3d1y closed 9 years ago

d3d1y commented 9 years ago

I'm just starting to use jip, and I really think it's exactly what I need. I've been working through the very helpful examples and have had no problems with dry runs of individual modules. But when I try to dry run the example pipeline, I get this error:


  File "/media/twoTBone/work/pileup.py", line 162, in pipeline
    align = p.run('bwa_align', input=self.input, reference=ref, output="${out}.sai")
  File "/usr/local/lib/python2.7/dist-packages/jip/pipelines.py", line 301, in run
    node.set(k, v, allow_stream=False)
  ...
  File "/usr/local/lib/python2.7/dist-packages/jip/options.py", line 1064, in get_default_output
    raise LookupError("No default output option found")
LookupError: No default output option found

I don't understand why jip thinks there is no default output option for the bwa_align module. I'm using the module exactly as it is in the example. The error only occurs when I run the module through the pipeline; a dry run of the module on its own works fine:


    Options:
        -r, --reference   The genomic reference index
        -i, --input       The input reads
        -o, --output      The output file
                          [default: stdout]

work$ JIP_MODULES=pileup.py jip run --dry bwa_align -i reads.txt -r ref.txt -o out.txt


#######################################################################################
|                                  Job - bwa_align                                    |
+--------------------------------+----------------------------------------------------+
|              Name              |                       Value                        |
+================================+====================================================+
| reference                      | ref.txt                                            |
| input                          | reads.txt                                          |
| output                         | out.txt                                            |
+--------------------------------+----------------------------------------------------+
#####################################################################################################################################################
|                                                                    Job states                                                                     |
+--------------------------------+--------+----------------------------------------------------+----------------------------------------------------+
|              Name              | State  |                       Inputs                       |                      Outputs                       |
+================================+========+====================================================+====================================================+
| bwa_align                      | Hold   | ref.txt, reads.txt                                 | out.txt                                            |
+--------------------------------+--------+----------------------------------------------------+----------------------------------------------------+

It's really discouraging that I can't even work through the example. I hope someone can tell me what I'm doing wrong.

d3d1y commented 9 years ago

I know that nothing much has been happening on this project for a while now, but I wanted to follow up to say that I have gotten jip working nicely using the bash script approach. I just submitted a bioinformatics pipeline job to my own mini-cluster. The jobs are clicking along, and I'm getting nice stdout/stderr files. Thanks, Thasso!

thasso commented 9 years ago

Hm, that is interesting. Can you send the exact command that you are using to reproduce the issue? I am trying on both master and develop branches with this:

jip pileup.jip -r reference.fa -i reads.txt -o out.bcf -- --dry

and it works as expected (I am running this inside the examples/bwa folder).

Btw, it's good to hear that it works in bash mode :)

d3d1y commented 9 years ago

Oh. How embarrassing. You are right. It works fine when I'm in the right folder. This is great!

I have been having a lot of fun using jip for pipeline development (and fun is not a word that I would normally use for this activity). I can't tell you how wonderful --dry --show is for me. I have a fair amount of experience in bash and python, but still, being able to see exactly what calls are going to be made is ridiculously helpful. So is that lovely dependency diagram, which shows what the pipeline is going to try to do before I submit it to my little cluster and have it blow up in my face.

I'm still prototyping this pipeline, so I've been happy to use bash for now for trying out different tools and architectures, but I'm looking forward to moving it over to python once I get it dialled in. I do have a question, though: what is the best way to kill jobs? I have noticed that if I call jip delete -j before the pipeline has completely failed, I have to hunt down the PIDs of dependent processes and kill them one by one. So I think I'm doing it wrong?

thasso commented 9 years ago

No problem, happy that it works :)

As for the job delete: that sounds like a bug to me. Do you call jip cancel first on the job(s) and then delete, or only delete? In any case, I will close this issue and would appreciate it if you could create another one for the delete problem with a bit more context. I was mostly using JIP on a Slurm cluster, so it might also be related to the grid engine.
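
On the manual PID hunting mentioned above: independent of JIP, a common POSIX way to avoid killing dependent processes one by one is to start the job as the leader of its own process group and signal the whole group at once. The following is only a minimal sketch of that general technique. It is not how jip cancel or jip delete are implemented, and start_in_own_group and kill_group are hypothetical helper names made up for illustration.

```python
import os
import signal
import subprocess
import sys


def start_in_own_group(cmd):
    """Start cmd in a new session, so it leads its own process group.

    Children it spawns inherit that group unless they change it themselves.
    """
    return subprocess.Popen(cmd, start_new_session=True)


def kill_group(proc, grace=2.0):
    """Send SIGTERM to the whole group; escalate to SIGKILL after `grace` seconds.

    Returns the exit status of the group leader (negative signal number
    if it was terminated by a signal).
    """
    pgid = proc.pid  # a session leader's process group id equals its pid
    try:
        os.killpg(pgid, signal.SIGTERM)
    except ProcessLookupError:
        pass  # the group is already gone
    try:
        proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        os.killpg(pgid, signal.SIGKILL)
        proc.wait()
    return proc.returncode


if __name__ == "__main__":
    # Stand-in for a pipeline step: a child process that just sleeps.
    job = start_in_own_group([sys.executable, "-c", "import time; time.sleep(30)"])
    print("exit status:", kill_group(job))
```

This only works on POSIX systems (os.killpg is not available on Windows), and it only reaches processes that stay in the original group. Jobs handed to a grid engine or scheduler run outside that group and still need to be cancelled through the scheduler.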