m3dev / gokart

Gokart solves reproducibility, task dependencies, constraints of good code, and ease of use for Machine Learning Pipelines.
https://gokart.readthedocs.io/en/latest/
MIT License
318 stars, 57 forks

Jupyter Notebooks cause unique_id caching #210

Closed TaylerUva closed 3 years ago

TaylerUva commented 3 years ago

When using Jupyter Notebooks in combination with gokart.build, I found that self.task_unique_id stays cached unless I restart the kernel or drop the task class from memory.

This is an issue because if the params of a task's dependencies are updated without clearing memory, the task does not generate a new task_unique_id (hash_id), so it won't rerun with the changed params.
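To make the behavior concrete, here is a minimal sketch of the pattern as I understand it. `SketchTask` is a hypothetical stand-in that I wrote for illustration, not gokart's actual class; only the names `task_unique_id`, `make_unique_id`, and `_make_hash_id` are borrowed from gokart, and the hashing scheme is an assumption.

```python
import hashlib
import json


class SketchTask:
    """Hypothetical stand-in for a task; NOT gokart's actual implementation."""

    def __init__(self, param, upstream=None):
        self.param = param
        self.upstream = upstream
        self.task_unique_id = None  # filled in on first use, then reused

    def _make_hash_id(self):
        # Assumed scheme: hash own param together with the upstream task's id.
        upstream_id = self.upstream.make_unique_id() if self.upstream else ""
        payload = json.dumps([type(self).__name__, self.param, upstream_id])
        return hashlib.md5(payload.encode()).hexdigest()

    def make_unique_id(self):
        # Cached: computed once per object and never refreshed afterwards.
        if self.task_unique_id is None:
            self.task_unique_id = self._make_hash_id()
        return self.task_unique_id


upstream = SketchTask(param=1)
task = SketchTask(param="a", upstream=upstream)
first_id = task.make_unique_id()

# A later notebook cell changes the dependency's param on the live object.
upstream.param = 2
stale_id = task.make_unique_id()  # still served from the cache

# Clearing the cached ids (roughly what a kernel restart achieves) fixes it.
for t in (upstream, task):
    t.task_unique_id = None
fresh_id = task.make_unique_id()

print(stale_id == first_id, fresh_id == first_id)
```

In this sketch the id only changes after the cached values are dropped, which matches what I see in the notebook.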

https://github.com/m3dev/gokart/blob/575207f5d348b8c1afbb6d1af3777c67230e15ed/gokart/task.py#L261-L263

I got around this by forcing the hash_id to be generated each time, but I do not believe this to be ideal.

def make_unique_id(self):
    # Workaround: recompute the hash on every call instead of reusing self.task_unique_id
    return self._make_hash_id()