cgat-developers / ruffus

CGAT-ruffus is a lightweight python module for running computational pipelines
MIT License

ruffus sometimes throws exceptions in RethrownJobError #65

Closed jbarlow83 closed 8 years ago

jbarlow83 commented 8 years ago

It appears that in some error paths the arguments of a RethrownJobError are set to a list of five strings, rather than a list of tuples of five strings, as expected. That causes the exception below:

--- Logging error ---
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/logging/__init__.py", line 980, in emit
    msg = self.format(record)
  File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/logging/__init__.py", line 830, in format
    return fmt.format(record)
  File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/logging/__init__.py", line 567, in format
    record.message = record.getMessage()
  File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/logging/__init__.py", line 328, in getMessage
    msg = str(self.msg)
  File "/Users/jb/Documents/src/OCRmyPDF-dev/venv-3.5/lib/python3.5/site-packages/ruffus/ruffus_exceptions.py", line 127, in __str__
    message += self.get_nth_exception_str (ii)
  File "/Users/jb/Documents/src/OCRmyPDF-dev/venv-3.5/lib/python3.5/site-packages/ruffus/ruffus_exceptions.py", line 116, in get_nth_exception_str
    task_name, job_name, exception_name, exception_value, exception_stack = self.args[nn]
ValueError: too many values to unpack (expected 5)

Here len(self.args) == 5 and self.args = ['task_name', 'job_name', ...], so that self.args[nn] == self.args[0] == 'task_name'. Unpacking that single string into five names is what raises the ValueError.
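The failure mode described above can be reproduced outside ruffus. This is a minimal sketch: the five strings are hypothetical stand-ins for a real job record, and the variable names are illustrative, not taken from the ruffus source.

```python
# Hypothetical stand-in for one job's error record (five strings).
job_exception = ("task_name", "job_name", "AttributeError", "message", "stack trace")

# Expected shape: a sequence of 5-tuples, one per failed job.
expected_args = (job_exception,)
task, job, exc_name, exc_value, exc_stack = expected_args[0]  # unpacks cleanly

# Shape reported in this issue: the five strings sit directly in args.
flattened_args = tuple(job_exception)
try:
    # flattened_args[0] is the string "task_name", so Python tries to
    # unpack its nine characters into five names.
    task, job, exc_name, exc_value, exc_stack = flattened_args[0]
    error = None
except ValueError as err:
    error = str(err)

print(error)
```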

bunbun commented 8 years ago

Thanks for flagging this. This is indeed a serious problem. The last thing you need in the middle of throwing is the library itself messing up and losing all your errors. Is this a reproducible problem? I am afraid that just eyeballing the code, I can't immediately see the offending bug. I see that I use a tuple of 5 arguments in some places and a list of 5 arguments in others (my bad), and RethrownJobError.append tries to concatenate two tuples rather than extend a list. However, I can't see where I am missing a set of parentheses. Some help needed :(

jbarlow83 commented 8 years ago

I also looked at the code before submitting and couldn't see anything obvious. I did some runtime tests and figured it out: Exception.args is a property, not a plain variable, and its setter forces anything assigned to Exception.args to be a tuple.

In [1]: ex = Exception()

In [2]: ex.args
Out[2]: ()

In [3]: ex.args = ['list', 'of', 'things']

In [4]: ex.args
Out[4]: ('list', 'of', 'things')

In [5]: Exception.args
Out[5]: <attribute 'args' of 'BaseException' objects>

You can't do self.args = tuple(list(job_exceptions)) because tuple() will iterate through the list. The syntax self.args = (list(job_exceptions),) does work, along with self.args[0].append(job_exception) to append to the list.
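As a rough illustration of the workaround just described, the sketch below uses a plain Exception rather than RethrownJobError, and the job records are hypothetical:

```python
# Hypothetical job records; the real ones are built inside ruffus.
job_exceptions = [
    ("task_a", "job_1", "AttributeError", "message 1", "stack 1"),
]

ex = Exception()

# Assigning a list directly: the args setter converts it to a tuple
# of the list's elements, so the list itself is not kept.
ex.args = job_exceptions
assert isinstance(ex.args, tuple)
assert ex.args == tuple(job_exceptions)

# Wrapping the list in a one-element tuple keeps the list object intact...
ex.args = (list(job_exceptions),)
assert isinstance(ex.args[0], list)

# ...so it can still be appended to in place.
ex.args[0].append(("task_b", "job_2", "ValueError", "message 2", "stack 2"))

records = ex.args[0]
print(len(records))
```

The cost of this layout is that every reader of args has to remember the extra level of indexing (args[0]).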

But perhaps it's best to create a different variable to track the list of exceptions, one that isn't managed by a base class.
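A minimal sketch of that alternative, assuming a hypothetical class JobErrors with an attribute named job_exceptions (neither name is from ruffus, and this is not the patch that was eventually released):

```python
class JobErrors(Exception):
    """Hypothetical stand-in for RethrownJobError; not ruffus's actual code."""

    def __init__(self, job_exceptions=()):
        super().__init__()
        # A plain list attribute, untouched by BaseException's args setter.
        self.job_exceptions = [tuple(record) for record in job_exceptions]

    def append(self, job_exception):
        # Normalise each record to a tuple so lists and tuples both work.
        self.job_exceptions.append(tuple(job_exception))

    def __str__(self):
        lines = []
        for task, job, exc_name, exc_value, exc_stack in self.job_exceptions:
            lines.append("{} in {} ({}): {}".format(exc_name, job, task, exc_value))
        return "\n".join(lines)


errors = JobErrors()
# A list of five strings is stored as one record, not flattened.
errors.append(["task_a", "job_1", "AttributeError", "no such attribute", "stack"])
print(str(errors))
```

Because the records live in an ordinary attribute, mixing lists and tuples at the call sites no longer changes the shape of what __str__ iterates over.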

jbarlow83 commented 8 years ago

I can consistently reproduce it in my program (https://github.com/jbarlow83/OCRmyPDF) with a certain file as input that causes an unrelated exception. I'm not sure why this rather ordinary looking AttributeError causes trouble for ruffus while other exceptions in my test suite don't.

chriscohoat commented 8 years ago

@jbarlow83 I'm consistently getting the RethrownJobError in the latest version (4.0.7) on debian:stretch. I can't seem to get any OCR to function. Is there anything I can pitch in on in fixing this?

jbarlow83 commented 8 years ago

@chriscohoat A possible workaround is here: https://github.com/jbarlow83/OCRmyPDF/issues/61. If that doesn't do it in your case I will investigate further.

bunbun commented 8 years ago

Sorry about that. I will try to get a patched release out Monday or Tuesday. Given that someone else has done all the heavy lifting of tracking down the bug (thanks!), this should be relatively straightforward.

Thanks, Leo

bunbun commented 8 years ago

I have a patched release where RethrownJobError no longer inherits its implementation details from Exception. However, I am very loath to release any new code without a unit test. If you have a better idea of what caused the error, would it be possible to create a minimal test case? I am still having serious difficulty understanding what triggers this bug: Python 3.4? Threading problems? Something else?

Leo

Dr. Leo Goodstadt University of Oxford United Kingdom

jbarlow83 commented 8 years ago

see #67