scikit-learn / enhancement_proposals

Enhancement proposals for scikit-learn: structured discussions and rationale for large additions and modifications
https://scikit-learn-enhancement-proposals.readthedocs.io/en/latest
BSD 3-Clause "New" or "Revised" License

SLEP006 on Sample Properties #16

Closed jnothman closed 4 years ago

adrinjalali commented 5 years ago

and (somehow) move slep004 to rejected?

adrinjalali commented 5 years ago

Could I somehow be of any help here @jnothman ?

SSaishruthi commented 5 years ago

Hi,

Is there any way I can help move this SLEP forward so that we can accelerate the AIF360 and scikit-learn integration?

The issue has been referenced above.

@animeshsingh

animeshsingh commented 5 years ago

@adrinjalali @jnothman any updates on this would be useful. It is something we need for the AIF360 work we are doing.

jnothman commented 5 years ago

There are lots of competing proposals and I will need to find some time to write them up.

jnothman commented 5 years ago

I consider each of the solutions here a family of solutions, rather than an entirely specific syntax. The way forward involves defining a possible syntax for each, then coding up each of the test cases for each solution.

jnothman commented 5 years ago

Apparently I was pushing to the wrong remote...

adrinjalali commented 5 years ago

Awesome, I really like solution 4.

NicolasHug commented 4 years ago

What's the status of this? Does it need more reviews? LMK if I can help in any way

jnothman commented 4 years ago

> What's the status of this? Does it need more reviews? LMK if I can help in any way

It needs to go from conceptual approaches to example code of each use case... I'm unlikely to find time this month.

hermidalc commented 4 years ago

I also forgot to mention, in relation to my comment above about needing to pass test metadata: this is also related to your bullet point on needing to pass test sample_weight to scorers during CV. Looking at https://github.com/scikit-learn/scikit-learn/blob/5c9f0906102e4677b045744a24228b6c57a6c471/sklearn/model_selection/_validation.py#L490-L493, the information is already there: the code that splits **fit_params for the train indices should do the same for the test indices, and the result should then be passed to _score. Either way, we need to pass test sample metadata through here so that the scorer can pass it to all the predict-like methods.

Though again, what do we call these: **predict_params or **transform_params? Unlike **fit_params, the appropriate name isn't obvious. Some test sample metadata might be used only in transform and some only in predict, but all predict-like methods will need to pass them through.
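To make the idea above concrete, here is a minimal pure-Python sketch of slicing per-sample metadata for both the train and the test indices, then handing the test slice to a scorer. It is an illustration only, not scikit-learn's implementation: `split_sample_params`, `_is_per_sample`, and `weighted_accuracy` are hypothetical names, and the rule "a list with one entry per sample is sample metadata" is an assumption made for the example.

```python
from typing import Any, Dict, List, Sequence, Tuple


def _is_per_sample(value: Any, n_samples: int) -> bool:
    # Assumption for this sketch: a list/tuple with one entry per sample
    # is per-sample metadata; anything else is passed through unchanged.
    return isinstance(value, (list, tuple)) and len(value) == n_samples


def split_sample_params(
    params: Dict[str, Any],
    n_samples: int,
    train: Sequence[int],
    test: Sequence[int],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Hypothetical helper: slice per-sample params for train AND test."""
    train_params: Dict[str, Any] = {}
    test_params: Dict[str, Any] = {}
    for name, value in params.items():
        if _is_per_sample(value, n_samples):
            train_params[name] = [value[i] for i in train]
            test_params[name] = [value[i] for i in test]
        else:
            train_params[name] = value
            test_params[name] = value
    return train_params, test_params


def weighted_accuracy(
    y_true: List[int], y_pred: List[int], sample_weight: List[float]
) -> float:
    """Toy scorer that consumes the test-side sample_weight."""
    hits = sum(w for t, p, w in zip(y_true, y_pred, sample_weight) if t == p)
    return hits / sum(sample_weight)


y = [0, 1, 1, 0, 1, 0]
params = {"sample_weight": [1.0, 2.0, 1.0, 4.0, 1.0, 1.0], "verbose": False}
train, test = [0, 1, 2], [3, 4, 5]

train_params, test_params = split_sample_params(params, len(y), train, test)

# Stand-in for a fitted estimator's predictions on the test fold.
y_test = [y[i] for i in test]
y_pred = [0, 0, 0]
score = weighted_accuracy(y_test, y_pred, test_params["sample_weight"])
```

In this sketch the train slice would go to fit and the test slice to the scorer, which is the symmetry the comment asks for. How the test-side parameters should be named and routed to predict-like methods is exactly the open question raised above, so the sketch does not take a position on it.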

jnothman commented 4 years ago

I'm keen to push this soon towards vote, so that we can consider @adrinjalali's PR for v0.25 (2021Q2). Needs some review. Then I can review Successive Halving, and the world will be a better place. Are you with me, @adrinjalali and @hermidalc?

adrinjalali commented 4 years ago

Yeah I'm down.

adrinjalali commented 4 years ago

We haven't passed it, but we kinda sorta agreed that a SLEP can be merged in "draft" status if the author and another maintainer are happy with it. So I'm happy to merge and work on it in separate issues.

jnothman commented 4 years ago

Then let's merge before the monthly meeting?