m3dev / gokart

Gokart solves reproducibility, task dependencies, constraints of good code, and ease of use for Machine Learning Pipeline.
https://gokart.readthedocs.io/en/latest/
MIT License
305 stars 57 forks source link

Redundant warning messages about redis_host, redis_port. #166

Closed skokado closed 3 years ago

skokado commented 3 years ago

Hi, thank you for this project.

I faced displaying warning messages when running tasks after updating gokart recently.

tasks.py (almost do nothing task)

import gokart
import pandas as pd

class MyTask(gokart.TaskOnKart):
    def output(self):
        return self.make_target('output.csv')

    def run(self):
        df = pd.DataFrame()
        self.dump(df)
$ python3 -m luigi --local-scheduler --module tasks MyTask --log-level=INFO
.../site-packages/luigi/parameter.py:279: UserWarning: Parameter "redis_host" with value "None" is not of type string.
  warnings.warn('Parameter "{}" with value "{}" is not of type string.'.format(param_name, param_value))
.../site-packages/luigi/parameter.py:279: UserWarning: Parameter "redis_port" with value "None" is not of type string.
  warnings.warn('Parameter "{}" with value "{}" is not of type string.'.format(param_name, param_value))
INFO: Informed scheduler that task   MyTask__99914b932b   has status   PENDING
INFO: Done scheduling tasks
INFO: Running Worker with 1 processes
...
This progress looks :) because there were no failed tasks or missing dependencies

===== Luigi Execution Summary =====

This is caused by using luigi.Paremeter class for redis_host andredis_port, and default value is None.

I think, it is better to use luigi.OptionalParamter when default value is None instead of luigi.Parameter.

In the implementation of the displaying warning, luigi.OptionalParamter has a check for the value being None, but luigi.Paramter does not.

https://github.com/spotify/luigi/blob/7b7ff901480d9a02402037e0cb54e4508b5a94a2/luigi/parameter.py#L338-L343

https://github.com/spotify/luigi/blob/7b7ff901480d9a02402037e0cb54e4508b5a94a2/luigi/parameter.py#L275-L279

skokado commented 3 years ago

I have confirmed that the warning doesn't display by using OptionalParameter as below.

import luigi
import gokart
import pandas as pd

class BaseTaskOnKart(gokart.TaskOnKart):
    # override
    redis_host = luigi.OptionalParameter(default=None, description='Task lock check is deactivated, when None.', significant=False)
    redis_port = luigi.OptionalParameter(default=None, description='Task lock check is deactivated, when None.', significant=False)

class MyTask(BaseTaskOnKart):

    def output(self):
        return self.make_target('output.csv')

    def run(self):
        df = pd.DataFrame()
        self.dump(df)
mski-iksm commented 3 years ago

@skokado Thank you for your suggestion.

We will fix this issue in the next release!