rescalante-lilly / ruffus

Automatically exported from code.google.com/p/ruffus
MIT License

@files decorator with task argument requires output parameter #46

Closed: GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Write task1 with @files(None,"output.txt")
2. Write task2 with @files(task1)
3. Run pipeline

What is the expected output? What do you see instead?
task2 produces no output and expects only the output filename from task1 as its input.
Instead, the script complains that task1() takes exactly 2 arguments and 0 were given.

What version of the product are you using? On what operating system?

Python 2.7.1+ (r271:86832, Apr 11 2011, 18:13:53) 
[GCC 4.5.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import ruffus
>>> ruffus.__version__
'2.2'

--(1412:Mon,01 Aug 11)-- uname -a
Linux servername 2.6.38-8-generic #42-Ubuntu SMP Mon Apr 11 03:31:24 UTC 2011 
x86_64 x86_64 x86_64 GNU/Linux

Please provide any additional information below.
Here's my test code:

    @files(None,"task1.out")
    def task1(infile,outfile) :
        print "task1"
        open(outfile,'w')

    @files(task1)
    def task2(infile) :
        print infile

I get this:

RethrownJobError: 

    Exceptions generating parameters for

    'def task2(...):'

    Original exception:

    Exception #1
    exceptions.TypeError(task1() takes exactly 2 arguments (0 given)):
    for __main__.task2.

    Traceback (most recent call last):
      File "/mit/labadorf/.local/lib/python2.7/site-packages/ruffus-2.2-py2.7.egg/ruffus/task.py", line 921, in signal
        for param, descriptive_param in self.param_generator_func(runtime_data):
      File "/mit/labadorf/.local/lib/python2.7/site-packages/ruffus-2.2-py2.7.egg/ruffus/file_name_parameters.py", line 475, in iterator
        for params in generator():
    TypeError: task1() takes exactly 2 arguments (0 given)

If I change task2 as follows:

    @files(task1,"dummy")
    def task2(infile,dummy) :
        print infile

it works fine.  task2 doesn't actually have any output, so it would be nice not 
to have to put in these dummy arguments.

Excellent package - I'm really looking forward to using it.

Original issue reported on code.google.com by alabad...@gmail.com on 1 Aug 2011 at 6:15

GoogleCodeExporter commented 9 years ago
Oh, also, if I try:

    @files(task1,None)
    def task2(infile) :
        print infile

I get this instead:

error_task_files:

    Input or output file parameters should contain at least one or more file names strings.[<function task1 at 0x205e758>, None] for

    'def task2(...):'

Original comment by alabad...@gmail.com on 1 Aug 2011 at 6:18

GoogleCodeExporter commented 9 years ago
Not sure what you are trying to do but the attached code is what I would expect.

AFAIK, the @files decorator does not take task arguments.

@files takes either input/output parameters (usually file names; see
http://www.ruffus.org.uk/decorators/files.html) or a generator which yields
the appropriate parameters (see
http://www.ruffus.org.uk/decorators/files_ex.html).

The latter explains your error message.
To have "task" dependencies, use @transform, @split, @merge etc.
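
To illustrate the generator explanation, here is a minimal sketch in plain Python that does not use ruffus at all. It assumes, based only on the traceback in the report (`for params in generator():`), that a single callable argument is called with no arguments to produce parameters. The helper names `params_from_single_argument` and `describe_failure` are hypothetical and are not part of the ruffus API.

```python
def task1(infile, outfile):
    """Stand-in for the reporter's task1; it requires two arguments."""
    open(outfile, "w").close()


def good_generator():
    """A parameter generator takes no arguments and yields [input, output]."""
    yield [None, "task1.out"]


def params_from_single_argument(arg):
    """Hypothetical helper: treat a lone callable as a parameter generator."""
    if callable(arg):
        # Assumption: the callable is invoked with zero arguments,
        # as the traceback in the report suggests.
        return list(arg())
    return [arg]


def describe_failure(arg):
    """Return the TypeError message raised while generating parameters."""
    try:
        params_from_single_argument(arg)
    except TypeError as exc:
        return str(exc)
    return None


# task1 needs two arguments, so calling it with none raises TypeError.
print(describe_failure(task1))
# A real generator yields parameters without complaint.
print(params_from_single_argument(good_generator))
```

Under that assumption, passing `task1` alone fails before any job runs, because `task1` is called as if it were a generator, while a zero-argument generator works as intended.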

Please see the tutorial or manual at www.ruffus.org.uk or post to the 
ruffus_discuss newsgroup (http://groups.google.com/group/ruffus_discuss) for 
help.

Leo

    from ruffus import *

    @files(None,"task1.out")
    def task1(infile,outfile) :
        print infile
        open(outfile,'w')

    # empty suffix because we are not generating an output
    @transform(task1, suffix(""), None)
    def task2(infile, outfile):
        print infile

    # See what is going to happen
    import sys
    pipeline_printout(sys.stdout, [task2], verbose = 3)

    # run pipeline
    pipeline_run([task2])

Original comment by bunbu...@gmail.com on 5 Oct 2011 at 12:50

GoogleCodeExporter commented 9 years ago
Thanks for the response.  The documentation I followed for supplying tasks to 
the @files decorator is here:

http://www.ruffus.org.uk/tutorials/manual/tasks_and_globs_in_inputs.html

The first topic describes how to implicitly chain task output/input files 
together. Is this documentation out of date?  Passing tasks to the @files 
decorator does work as described.

In any case, my issue was only about two minor inconveniences. First, when a 
task is provided as the input to a step and there is no output, I have to 
provide a meaningless placeholder string as the output parameter to @files and 
also add a placeholder formal argument to the task function. I admittedly have 
not fully explored the @transform decorator, so it may be that I should be 
using that instead of @files. Second, if a task is provided as the first 
argument and None as the second, which I think should be legal based on the 
documentation, the package complains that at least one of the input/output 
filename arguments must be a string.  Very minor things.

I've been using ruffus quite a lot since this post, and I really like the package.  
Thanks!

Original comment by alabad...@gmail.com on 5 Oct 2011 at 3:48

GoogleCodeExporter commented 9 years ago
I apologise. You are indeed right.

Nevertheless... Much of the point of @files is that ruffus will check whether
your task or your jobs are up to date by looking at the modification times.

If there is no output, then task2 will always run, using the inputs from task1.
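
As a rough illustration of that reasoning, the sketch below compares modification times in plain Python. It is not ruffus's actual implementation; the function `needs_update` and its rules are assumptions made for this example only.

```python
import os
import tempfile


def needs_update(inputs, outputs):
    """Hypothetical check: return True when the job should run again."""
    # With no outputs there is nothing to compare against, so always run.
    if not outputs:
        return True
    # A missing output always forces a rerun.
    if any(not os.path.exists(path) for path in outputs):
        return True
    if not inputs:
        return False
    # Rerun only if some input is newer than the oldest output.
    newest_input = max(os.path.getmtime(path) for path in inputs)
    oldest_output = min(os.path.getmtime(path) for path in outputs)
    return newest_input > oldest_output


# Demo with explicit timestamps so the result does not depend on timing.
workdir = tempfile.mkdtemp()
task1_out = os.path.join(workdir, "task1.out")
task2_out = os.path.join(workdir, "task2.out")
for path, mtime in ((task1_out, 1000), (task2_out, 2000)):
    open(path, "w").close()
    os.utime(path, (mtime, mtime))

print(needs_update([task1_out], [task2_out]))  # output is newer than input
print(needs_update([task1_out], []))           # no output declared
```

With an output file declared, the check can skip the job once the output is newer than its input; with no output, there is no timestamp to test, so the job runs every time.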

In that case, you might consider:

    @parallel(task1, None)

Original comment by bunbu...@gmail.com on 5 Oct 2011 at 11:30