Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
Bug: parameter doesn't preserve default None value after yield #3226
When a task has a parameter that defaults to None, the value becomes the string "None" if the task is yielded from run(), but remains None if the task is returned from the requires() method.
When I run the code below with luigi --module <module> YieldTask --local-scheduler, the file yield.txt contains <class 'str'>.
When I run the same code with luigi --module <module> ReturnTask --local-scheduler, the file return.txt contains <class 'NoneType'>.
import luigi


class ParamTask(luigi.Task):
    none_param = luigi.Parameter(default=None)
    name_param = luigi.Parameter()

    def run(self):
        # Record the runtime type of none_param so the two cases can be compared.
        with self.output().open('w') as f:
            f.write(str(type(self.none_param)))

    def output(self):
        return luigi.LocalTarget(f'{self.name_param}.txt')


class YieldTask(luigi.Task):
    def run(self):
        # Dynamic dependency: ParamTask is yielded from run().
        yield ParamTask(name_param='yield')


class ReturnTask(luigi.Task):
    def requires(self):
        # Static dependency: ParamTask is returned from requires().
        return ParamTask(name_param='return')
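
One possible explanation for the difference, which this report does not confirm, is that a task yielded from run() has its parameters converted to strings and parsed back before it is executed, while a task returned from requires() is used as the original object. The sketch below illustrates that idea in plain Python. It does not use Luigi's code, and all four function names are hypothetical. It compares a plain str() round trip with a None-aware one that encodes None as the empty string.

```python
from typing import Optional


def serialize_plain(value: object) -> str:
    """Hypothetical serializer: turn any parameter value into text with str()."""
    return str(value)


def parse_plain(text: str) -> str:
    """Hypothetical parser for a string parameter: return the text unchanged."""
    return text


def serialize_optional(value: Optional[str]) -> str:
    """Hypothetical None-aware serializer: encode None as the empty string."""
    return "" if value is None else value


def parse_optional(text: str) -> Optional[str]:
    """Hypothetical None-aware parser: decode the empty string back to None."""
    return text or None


# A plain str() round trip cannot tell None apart from the string "None".
plain_result = parse_plain(serialize_plain(None))
print(type(plain_result))  # <class 'str'>

# A None-aware round trip preserves None.
optional_result = parse_optional(serialize_optional(None))
print(type(optional_result))  # <class 'NoneType'>
```

The None-aware variant cannot distinguish None from an empty string, which is the trade-off of this encoding. Luigi provides luigi.OptionalParameter for parameters that may be None. Whether switching none_param to it avoids the behavior described here has not been verified in this report.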
luigi version: 3.2.0