m3dev / gokart

Gokart solves reproducibility, task dependencies, constraints of good code, and ease of use for machine learning pipelines.
https://gokart.readthedocs.io/en/latest/
MIT License
305 stars, 57 forks

Add options to check task definition to create an output #205

Closed. hiro-o918 closed this 3 years ago.

hiro-o918 commented 3 years ago

What

Add a feature to create a hash of a task. The hash is computed by applying cloudpickle.dumps to the task's class object.
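To illustrate the idea of hashing a task's definition, here is a minimal sketch using only the standard library. It is not the code in this PR: instead of cloudpickle.dumps, it hashes the compiled bytecode, constants, and names of each method, and `definition_hash`, `TaskV1`, and `TaskV2` are hypothetical names. Methods containing nested functions would need extra handling.

```python
import hashlib
import types


def definition_hash(cls: type) -> str:
    """Return a hex digest that changes when the code of cls's methods changes.

    Assumption: this is a stdlib stand-in for hashing cloudpickle.dumps(cls).
    """
    digest = hashlib.sha256()
    # Sort by attribute name so the digest does not depend on definition order.
    for name, attr in sorted(vars(cls).items()):
        if isinstance(attr, types.FunctionType):
            code = attr.__code__
            digest.update(name.encode())
            digest.update(code.co_code)                    # raw bytecode
            digest.update(repr(code.co_consts).encode())   # literal constants
            digest.update(repr(code.co_names).encode())    # referenced names
    return digest.hexdigest()


class TaskV1:
    def run(self, x):
        return x * 2


class TaskV2:  # same interface, different implementation
    def run(self, x):
        return x * 3


hash_v1 = definition_hash(TaskV1)
hash_v2 = definition_hash(TaskV2)
print(hash_v1 != hash_v2)  # the implementations differ, so the hashes differ
```

Hashing the same class twice yields the same digest, while any edit to a method body yields a new one.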

Why

The current implementation assumes that an output is determined by its input parameters and requirements. This assumption breaks down when the implementation of a task changes.

As a result, some issues can occur.

Adding the task definition to the hash resolves these issues and reruns tasks safely.
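The following sketch shows one way the problem can surface, assuming a simple in-memory cache keyed by a hash. All names (`cache_key`, `run_cached`, `run_v1`, `run_v2`) are hypothetical and this is not gokart's implementation. With a key built from parameters alone, a changed implementation reuses the old output; including a fingerprint of the code in the key forces a rerun.

```python
import hashlib

cache = {}


def cache_key(params: dict, func=None) -> str:
    """Build a key from the parameters and, optionally, the function's code."""
    digest = hashlib.sha256(repr(sorted(params.items())).encode())
    if func is not None:
        # Assumption: bytecode plus constants stands in for the task definition.
        digest.update(func.__code__.co_code)
        digest.update(repr(func.__code__.co_consts).encode())
    return digest.hexdigest()


def run_cached(func, params: dict, use_definition: bool):
    """Run func(**params) unless an output with the same key already exists."""
    key = cache_key(params, func if use_definition else None)
    if key not in cache:
        cache[key] = func(**params)
    return cache[key]


def run_v1(x):
    return x + 1


def run_v2(x):  # the same task after its implementation changed
    return x + 100


params = {"x": 1}

# Parameters only: v2 hits the entry written by v1 and returns a stale output.
run_cached(run_v1, params, use_definition=False)
stale_result = run_cached(run_v2, params, use_definition=False)

# Parameters plus definition: v2 gets its own key and is actually rerun.
run_cached(run_v1, params, use_definition=True)
fresh_result = run_cached(run_v2, params, use_definition=True)

print(stale_result, fresh_result)
```

Here `stale_result` is the value computed by `run_v1`, while `fresh_result` comes from `run_v2`.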

Discussion

vaaaaanquish commented 3 years ago

@hiro-o918 Thanks, great suggestion. Is it possible to apply the yapf formatter?

vaaaaanquish commented 3 years ago

LGTM. I'll wait for @Hi-king's review.

mski-iksm commented 3 years ago

@hiro-o918 LGTM! Thank you for your contribution!

Hi-king commented 3 years ago

Cool solution! As a workaround, we currently use __version=IntParameter(default=2) to declare a change in a task definition 💦 This approach looks more sophisticated to me!

hiro-o918 commented 3 years ago

Thank you for your fast reviews! :rocket:

hirosassa commented 3 years ago

@hiro-o918 gokart 1.0.1 including your great contribution is released. Thanks again!