industrydive / fileflow

Airflow plugin to transfer arbitrary files between operators
http://fileflow.readthedocs.io/en/latest/
Apache License 2.0
78 stars 21 forks source link

DivePythonOperator AirflowException('`python_callable` param must be callable' #15

Open wongwill86 opened 7 years ago

wongwill86 commented 7 years ago

Hello, I was trying out the tutorial to learn more about fileflow and airflow and received this error when running the tutorial code:

Traceback (most recent call last):
  File "<stdin>", line 7, in <module>
  File "/usr/local/lib/python2.7/dist-packages/fileflow/operators/dive_python_operator.py", line 25, in __init__
    super(DivePythonOperator, self).__init__(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/fileflow/operators/dive_operator.py", line 23, in __init__
    super(DiveOperator, self).__init__(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/airflow/utils/decorators.py", line 86, in wrapper
    result = func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/airflow/operators/python_operator.py", line 68, in __init__
    raise AirflowException('`python_callable` param must be callable')

Looks like this line was added 3 months ago into airflow :

+        if not callable(python_callable):
 +            raise AirflowException('`python_callable` param must be callable')

Which seems to conflict with the DivePythonOperator https://github.com/industrydive/fileflow/blob/master/fileflow/operators/dive_python_operator.py#L23

Would love to suggest an edit but i'm not familiar enough with both libraries yet.

MiriamSexton commented 7 years ago

Hi,

That's interesting. Thanks for the heads up!

I'll investigate this a bit and try to have at least a branch that does a hacky work-around up by tonight.

wongwill86 commented 7 years ago

fwiw, if i hack kwargs['python_callable'] = lambda: None in DivePythonOperator::__init__ it circumvents the check altogether and looks like it runs ok. But i'm not sure if this is the "correct" thing to do there. Thanks!

lauralorenz commented 7 years ago

Just ran into this while testing an upgrade to our airflow installation to 1.8.2. In fileflow we hack the python_callable to None since our DivePythonOperator class doesn't use it (https://github.com/industrydive/fileflow/blob/master/fileflow/operators/dive_python_operator.py#L23), but airflow 1.8.1+ introduced an assertion in the superclass' PythonOperator.__init__ that will blow this up: https://github.com/apache/incubator-airflow/blob/32a26d84b679a54add43092d0bdb77350dcbaeaf/airflow/operators/python_operator.py#L64-L65. Since DivePythonOperator doesn't use it, we would probably do something similar to what @wongwill86 did and override python_callable to be literally anything that has a __call__ method (i.e. lambda: None) to bypass the irrelevant assertion.

lauralorenz commented 6 years ago

May be resolved in https://github.com/industrydive/fileflow/pull/17