spotify / luigi

Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
Apache License 2.0

Bug: parameter doesn't preserve default None value after yield #3226

Open natalevichmv opened 1 year ago

When a task has a parameter that defaults to None, the value is converted to the string "None" if the task is yielded from run(), but stays None if the task is returned from requires().

Running the following code with luigi --module <module> YieldTask --local-scheduler writes <class 'str'> to yield.txt. Running it with luigi --module <module> ReturnTask --local-scheduler writes <class 'NoneType'> to return.txt.

import luigi

class ParamTask(luigi.Task):
    # Defaults to None; this is the value whose type changes.
    none_param = luigi.Parameter(default=None)
    # Only used to name the output file.
    name_param = luigi.Parameter()

    def run(self):
        with self.output().open('w') as f:
            f.write(str(type(self.none_param)))

    def output(self):
        return luigi.LocalTarget(f'{self.name_param}.txt')

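class YieldTask(luigi.Task):
    # Dynamic dependency: none_param arrives in ParamTask as the string "None".
    def run(self):
        yield ParamTask(name_param='yield')

class ReturnTask(luigi.Task):
    # Static dependency: none_param arrives in ParamTask as None.
    def requires(self):
        return ParamTask(name_param='return')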

luigi version: 3.2.0
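
One possible explanation, which I have not confirmed against the luigi source, is that the parameters of a yielded task go through a string round trip before the task is rebuilt, while a task returned from requires() is used as the same in-memory object. The sketch below only illustrates that assumption in plain Python. PlainParam, NoneAwareParam and round_trip are hypothetical stand-ins, not luigi classes or functions.

```python
# Hypothetical stand-ins for illustration only; these are NOT luigi's classes.

class PlainParam:
    """Assumed behavior: serialize with str(), parse returns the text unchanged."""

    def serialize(self, value):
        return str(value)

    def parse(self, text):
        return text


class NoneAwareParam(PlainParam):
    """Variant that maps None to an empty string and back again."""

    def serialize(self, value):
        return "" if value is None else str(value)

    def parse(self, text):
        return None if text == "" else text


def round_trip(param, value):
    """Serialize a value to a string, then parse it back."""
    return param.parse(param.serialize(value))


plain_result = round_trip(PlainParam(), None)      # None does not survive
aware_result = round_trip(NoneAwareParam(), None)  # None survives

print(type(plain_result))
print(type(aware_result))
```

If this is what happens, a parameter type that treats None specially during serialization would avoid the problem. luigi ships an OptionalParameter class that may be worth trying as a workaround, though I have not verified that it behaves this way for yielded tasks.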