m3dev / gokart

Gokart solves reproducibility, task dependencies, constraints of good code, and ease of use for machine learning pipelines.
https://gokart.readthedocs.io/en/latest/
MIT License
305 stars 57 forks

self.clone(cls, rerun=True) doesn't work #219

Closed. Hi-king closed this issue 3 years ago

Hi-king commented 3 years ago
import gokart
import luigi
import pandas as pd


class A(gokart.TaskOnKart):
    a = luigi.Parameter(default="a")
    def run(self):
        print(self.__class__)
        print(f"rerun: {self.rerun}")
        print(f"a: {self.a}")
        print(f"fail_on_empty_dump: {self.fail_on_empty_dump}")
        self.dump(pd.DataFrame([1]))

class B(A):
    pass

class C(A):
    pass

class D(A):
    def requires(self):
        return [
            A(rerun=False),
            B(),
            self.clone(C),
        ]

gokart.build(D(rerun=True, a="b", fail_on_empty_dump=True), verbose=True)

Output:
<class '__main__.C'>
rerun: False
a: b
fail_on_empty_dump: True
<class '__main__.A'>
rerun: False
a: a
fail_on_empty_dump: False
<class '__main__.B'>
rerun: False
a: a
fail_on_empty_dump: False
<class '__main__.D'>
rerun: True
a: b
fail_on_empty_dump: True
Hi-king commented 3 years ago

https://github.com/m3dev/gokart/blob/master/gokart/task.py#L136

I know rerun is handled as a special case in .clone, but I think kwargs passed explicitly (such as rerun=True) shouldn't be ignored.
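
To make the request concrete, here is a minimal sketch of the two behaviours. It does not use gokart or luigi: `Task`, `SPECIAL_PARAMS`, `clone_ignoring_kwargs` and `clone_honoring_kwargs` are hypothetical stand-ins, and the assumption that clone skips a fixed set of special parameters is inferred from the output above, not copied from gokart's code.

```python
# Stand-in model only; none of these names come from gokart or luigi.
SPECIAL_PARAMS = {"rerun"}  # assumed: params that clone does not inherit


class Task:
    # Parameter names and defaults, mirroring the tasks in the report.
    defaults = {"a": "a", "rerun": False, "fail_on_empty_dump": False}

    def __init__(self, **kwargs):
        unknown = set(kwargs) - set(self.defaults)
        if unknown:
            raise TypeError(f"unknown params: {sorted(unknown)}")
        self.params = {**self.defaults, **kwargs}

    def clone_ignoring_kwargs(self, cls, **kwargs):
        # Models the reported behaviour: special params are skipped
        # before kwargs are consulted, so rerun=True is silently dropped.
        new = {}
        for name in cls.defaults:
            if name in SPECIAL_PARAMS:
                continue
            if name in kwargs:
                new[name] = kwargs[name]
            elif name in self.params:
                new[name] = self.params[name]
        return cls(**new)

    def clone_honoring_kwargs(self, cls, **kwargs):
        # Models the requested behaviour: explicit kwargs always win;
        # special params are only excluded from inheritance.
        new = {}
        for name in cls.defaults:
            if name in kwargs:
                new[name] = kwargs[name]
            elif name not in SPECIAL_PARAMS and name in self.params:
                new[name] = self.params[name]
        return cls(**new)


parent = Task(rerun=True, a="b", fail_on_empty_dump=True)
ignored = parent.clone_ignoring_kwargs(Task, rerun=True)
honored = parent.clone_honoring_kwargs(Task, rerun=True)
inherited = parent.clone_honoring_kwargs(Task)
print(ignored.params["rerun"], honored.params["rerun"], inherited.params["rerun"])
```

In this model the second variant still refuses to inherit rerun from the parent, so only an explicit rerun=True changes the result.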