kubeflow / code-intelligence

ML-Powered Developer Tools, using Kubeflow
https://medium.com/kubeflow/reducing-maintainer-toil-on-kubeflow-with-github-actions-and-machine-learning-f8568374daa1?source=friends_link&sk=ac77444f00c230e7d787edbfb0081918
MIT License

[Label Bot Continuous Training] Needs Training Needs to take into account whether there is a model currently being trained #178

Open jlewi opened 4 years ago

jlewi commented 4 years ago

Our synchronous training pipeline is currently spawning multiple instances of training rather than the expected 1 model per hour.

The problem appears to be that the code that decides whether to train a model only checks whether a trained model already exists. So I don't think we take into account whether a model is currently being trained. https://github.com/kubeflow/code-intelligence/blob/faeb65757214ac93259f417b81e9e2fedafaebda/Label_Microservice/go/cmd/automl/pkg/automl/automl.go#L101

My conjecture is that the following happens:

At this point

issue-label-bot[bot] commented 4 years ago

Issue-Label Bot is automatically applying the labels:

Label Probability
kind/bug 0.63

Please mark this comment with :thumbsup: or :thumbsdown: to give our bot feedback! Links: app homepage, dashboard and code for this bot.

jlewi commented 4 years ago

It looks like we need to also look at the datasets and see if there is a model training in progress.

jlewi commented 4 years ago

#182: an auto PR was created for a model trained by manually running the notebook.

Need to verify that a new model is trained automatically and then deployed.

jlewi commented 4 years ago

kubeflow/code-intelligence#184 is a PR that was opened to update to the same model. It doesn't look like a new model got trained.