m3dev / gokart

Gokart solves reproducibility, task dependencies, constraints of good code, and ease of use for Machine Learning Pipeline.
https://gokart.readthedocs.io/en/latest/
MIT License

Add requires to TaskInfo #250

Closed mski-iksm closed 3 years ago

mski-iksm commented 3 years ago

I've added a requires attribute to TaskInfo so that the information returned by task.requires() is dumped to tree_info_table.

This is useful when a task requires two instances of the same task with different parameters. Using the key as an index, you can get the unique_id of each of the two tasks separately.

class _TaskInfoExampleTaskA(gokart.TaskOnKart):
    def requires(self):
        return dict(task1=_TaskInfoExampleTaskA(param=1), task2=_TaskInfoExampleTaskA(param=2))

In the example above, the requires column of tree_info_table holds a dictionary with the same keys as the one returned by requires() (i.e. task1 and task2). Each value is a RequiredTask(name, unique_id) object carrying the name and unique_id of the required task. This lets us distinguish the two tasks by key and look up each unique_id.
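
To make the shape of that column concrete, here is a rough sketch that does not depend on gokart. Only the RequiredTask(name, unique_id) shape comes from this PR. FakeTask and summarize_requires are hypothetical names, and the md5-based id is an assumption made purely for illustration, not how gokart computes unique_id.

```python
import hashlib
from typing import Dict, NamedTuple


class RequiredTask(NamedTuple):
    # Same (name, unique_id) shape as described in this PR.
    name: str
    unique_id: str


class FakeTask:
    """Hypothetical stand-in for a gokart task; not part of gokart."""

    def __init__(self, name: str, **params):
        self.name = name
        self.params = params

    def make_unique_id(self) -> str:
        # Assumption: the id depends on the task name and its parameters.
        digest = hashlib.md5(repr(sorted(self.params.items())).encode()).hexdigest()
        return f"{self.name}_{digest}"


def summarize_requires(requires: Dict[str, FakeTask]) -> Dict[str, RequiredTask]:
    """Keep the keys of a requires() dict and map each task to a RequiredTask."""
    return {
        key: RequiredTask(name=task.name, unique_id=task.make_unique_id())
        for key, task in requires.items()
    }


requires = dict(
    task1=FakeTask("_TaskInfoExampleTaskA", param=1),
    task2=FakeTask("_TaskInfoExampleTaskA", param=2),
)
summary = summarize_requires(requires)
```

Here summary["task1"] and summary["task2"] share the same name but carry different unique_id values, which is the case the requires column is meant to disambiguate.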

Please review.

Hi-king commented 3 years ago

@mski-iksm Merged. Thx!