scikit-learn-contrib / metric-learn

Metric learning algorithms in Python
http://contrib.scikit-learn.org/metric-learn/
MIT License

Allow support for multi-label algorithms #174

Open wdevazelhes opened 5 years ago

wdevazelhes commented 5 years ago

Multi-label problems with many labels are a good use case for metric learning, so we could add support for them in the algorithms. For the supervised ones, it would mean modifying the loss function a bit (we have been discussing this with @bellet for NCA's PR in scikit-learn, for instance). For the weakly supervised ones, it would mean making tuples from multi-labeled data (there seem to be several strategies for doing so, e.g. based on how many labels two points share).
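
To make the weakly supervised case concrete, here is a minimal sketch of one possible tuple-construction strategy, based on the number of labels two points share. Everything in it is an assumption for illustration: the function name `make_pairs_from_multilabels`, the `min_shared` threshold, and the rule "similar if at least `min_shared` common labels, dissimilar if none" are hypothetical and not part of metric-learn. It only uses NumPy and returns index pairs with +1/-1 labels.

```python
import numpy as np


def make_pairs_from_multilabels(Y, min_shared=1):
    """Build similar/dissimilar index pairs from a binary label-indicator matrix.

    Hypothetical helper (not a metric-learn API).
    Y has shape (n_samples, n_labels), with Y[i, k] = 1 if sample i has label k.
    A pair is similar (+1) if the two samples share at least `min_shared`
    labels, dissimilar (-1) if they share none; pairs in between are dropped.
    """
    if min_shared < 1:
        raise ValueError("min_shared must be at least 1")
    Y = np.asarray(Y, dtype=int)

    # shared[i, j] = number of labels that samples i and j have in common.
    shared = Y @ Y.T

    # Enumerate each unordered pair (i < j) exactly once.
    i_idx, j_idx = np.triu_indices(Y.shape[0], k=1)
    counts = shared[i_idx, j_idx]

    similar = counts >= min_shared
    dissimilar = counts == 0
    keep = similar | dissimilar

    pairs = np.column_stack([i_idx[keep], j_idx[keep]])
    y_pairs = np.where(similar[keep], 1, -1)
    return pairs, y_pairs


# Toy data: 4 samples, 3 possible labels.
Y = np.array([
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
    [0, 1, 1],
])
pairs, y_pairs = make_pairs_from_multilabels(Y)
```

Raising `min_shared` makes the notion of "similar" stricter and discards the ambiguous pairs that share some labels but fewer than the threshold. Other strategies (for example, weighting pairs by label overlap) would need a different output format.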

terrytangyuan commented 5 years ago

Could you share the link to NCA PR in scikit-learn? Are you reusing what’s available in metric-learn?

wdevazelhes commented 5 years ago

> Could you share the link to NCA PR in scikit-learn?

Sure, here is the link: https://github.com/scikit-learn/scikit-learn/pull/10058

> Are you reusing what’s available in metric-learn?

In fact, I reused much of a PR about LMNN (https://github.com/scikit-learn/scikit-learn/pull/8602) for the architecture of the code, and just replaced the loss function with NCA's. This PR is quite mature in terms of error messages, parameter checks, automatic initialization, etc., so I guess we could bring some of its developments into metric-learn (that's already what we did in some PRs like #113, #105, and #99).

wdevazelhes commented 5 years ago

Btw, the PR was recently merged into scikit-learn! :tada:

terrytangyuan commented 5 years ago

Nice job! Congrats!

wdevazelhes commented 5 years ago

Thanks!

bellet commented 5 years ago

Also worth mentioning: it was @GaelVaroquaux who originally suggested investigating the multi-label setting ;-)

angelotc commented 3 years ago

Looking forward to this development if it is still a thing