kubeflow / code-intelligence

ML-Powered Developer Tools, using Kubeflow
https://medium.com/kubeflow/reducing-maintainer-toil-on-kubeflow-with-github-actions-and-machine-learning-f8568374daa1?source=friends_link&sk=ac77444f00c230e7d787edbfb0081918
MIT License
55 stars, 21 forks

[label bot] AutoML needs to handle "/" in labels better #136

Open jlewi opened 4 years ago

jlewi commented 4 years ago

AutoML models don't allow "/" in label names. I initially handled this by just replacing "/" with a "-".

But this is problematic because some of our labels have "-" in them e.g.

"area/front-end"

Inverting the mapping therefore becomes problematic, e.g.

area/front-end -> area-front-end

To map area-front-end back to the original value we can't just replace all "-" with "/".

As a temporary workaround I only replace the first "-", which works because the only labels we are predicting right now start with an area or platform prefix.
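A minimal sketch of the lossy mapping and the first-dash workaround described above. The function names are hypothetical and are not taken from the label bot code; they only illustrate the behavior.

```python
def to_automl_label(label: str) -> str:
    """Lossy encoding: replace every "/" with "-"."""
    return label.replace("/", "-")


def from_automl_label_first_dash(automl_label: str) -> str:
    """Workaround: turn only the first "-" back into "/"."""
    return automl_label.replace("-", "/", 1)


encoded = to_automl_label("area/front-end")

# Naive inversion replaces every "-" and corrupts the label.
naive = encoded.replace("-", "/")

# The workaround recovers the label because the first "-" is the old "/".
workaround = from_automl_label_first_dash(encoded)

# The workaround breaks as soon as a "-" appears before the "/".
broken = from_automl_label_first_dash(to_automl_label("front-end/ui"))

print(encoded, naive, workaround, broken)
```

The last case shows why this only holds while every predicted label has a dash-free prefix such as "area" or "platform".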

I think a better solution would be to replace "/" with a multi-character sequence that never occurs in our label names. So we could map e.g. "/" to "-_" (i.e. a dash followed by an underscore).
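One way the proposed multi-character mapping could look, as a sketch only. The names `SEPARATOR`, `encode_label` and `decode_label` are hypothetical, and the sketch assumes AutoML accepts both "-" and "_" in label names. The guard rejects any label that already contains the separator, since such a label could not be inverted unambiguously.

```python
SEPARATOR = "-_"  # assumed never to occur in a real label name


def encode_label(label: str) -> str:
    """Map a GitHub label to an AutoML-safe name by replacing "/"."""
    if SEPARATOR in label:
        raise ValueError(f"label {label!r} already contains {SEPARATOR!r}")
    return label.replace("/", SEPARATOR)


def decode_label(automl_label: str) -> str:
    """Invert encode_label by mapping the separator back to "/"."""
    return automl_label.replace(SEPARATOR, "/")


for original in ["area/front-end", "platform/gcp", "kind/bug"]:
    encoded = encode_label(original)
    # Round trip must give back the original label exactly.
    assert decode_label(encoded) == original
    print(original, "->", encoded)
```

Training and serving would both need to import the same pair of functions so that the model's label names and the bot's decoding stay in sync.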

issue-label-bot[bot] commented 4 years ago

Issue-Label Bot is automatically applying the labels:

Label Probability
area/front-end 0.51

Please mark this comment with :thumbsup: or :thumbsdown: to give our bot feedback! Links: app homepage, dashboard and code for this bot.

wjayesh commented 4 years ago

Hello! I am new to contributing to this repository. Can I submit a PR for this issue? I think I have understood the problem and identified the code that needs to be modified to solve it. Thanks :)

GauravSarkar commented 4 years ago

I looked at the issue and I think I can solve it, but it would be a great help if I could get a bit more clarification on exactly what I have to do. @jlewi a little guidance would help me get started on this.

GauravSarkar commented 4 years ago

@jlewi I sent a PR but no action has been taken on it. Can you please look into it and merge it?

jlewi commented 4 years ago

@GauravSarkar thanks!

In general it is a good idea to discuss what you are planning and get buy-in before investing time and doing the work.

I think this is more involved than just updating the notebook #188

The training and serving code both need to be updated to handle the mapping the same way; otherwise we will have a problem.

This in turn means we need to do a coordinated rollout; i.e. roll out the new serving code together with a new model.

It would probably be good to start by identifying

I'm not sure this is the best project to tackle right now. Given the level of complexity and my availability, I'm not sure how much progress you will be able to make.

Are you able to deploy and run your own instance of label bot? Do you have access to GCP and AutoML? If not, then I think you might just get frustrated trying to make progress on this.

issue-label-bot[bot] commented 4 years ago

Issue-Label Bot is automatically applying the labels:

Label Probability
area/jupyter 0.71

Please mark this comment with :thumbsup: or :thumbsdown: to give our bot feedback! Links: app homepage, dashboard and code for this bot.

GauravSarkar commented 4 years ago

I was thinking the same: if I make changes in only one place, it won't work. I have no access to GCP, so I think I won't be able to make much progress.

jlewi commented 4 years ago

@GauravSarkar do you have access to notebooks? Perhaps you could help out with issues like kubeflow/community#439 regarding creating reports about Kubeflow community health?