pidefrem commented 6 months ago

Description

Continuing the work done in PR Allow the use of GroupKFold cv-split (see https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GroupKFold.html) and also the use of custom cv-splits based on StratifiedKFold for example (see https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.StratifiedKFold.html#sklearn.model_selection.StratifiedKFold)

Fixes #202

Type of change

New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

[x] test_regression.test_results_with_constant_groups
[x] test_regression.test_results_with_groups
[x] test_classification. test_results_with_constant_groups
[x] test_classification. test_results_with_groups

Checklist

[X] I have read the contributing guidelines
[X] I have updated the HISTORY.rst and AUTHORS.rst files
[X] Linting passes successfully : make lint
[X] Typing passes successfully : make type-check
[X] Unit tests pass successfully : make tests
[X] Coverage is 100% : make coverage
[X] Documentation builds successfully : make doc

thibaultcordier commented 6 months ago

Hello @pidefrem ! Thank you for this contribution. I'll take the time to review it. Just to be in line with your proposal, I understand that in Allow MapieRegressor to use group split strategy, the split strategy refers to the way of doing cross-validation but that it is a cross conformal method.

Don't hesitate to contact me if you need any help.

pidefrem commented 6 months ago

Hello @thibaultcordier, yes it refers to the split methods of the cross validator used during the fit of the MAPIE estimators. Please feel free to suggest any other description that you think is more suitable.

codecov-commenter commented 6 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (614293e) 100.00% compared to head (b9f25fa) 100.00%. Report is 210 commits behind head on master.

Additional details and impacted files

```diff @@ Coverage Diff @@ ## master #393 +/- ## ========================================== Coverage 100.00% 100.00% ========================================== Files 39 39 Lines 4616 4887 +271 Branches 487 803 +316 ========================================== + Hits 4616 4887 +271 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

pidefrem commented 5 months ago

Wonderful! Thank you for this very welcome contribution. I don't have many comments to add but just a few to complete your proposal exhaustively:

some style code suggestions

As far as the tests are concerned, you are indeed testing whether MAPIE returns the same results using the same group (constant or None). But what happens if you use different groups like np.concat([np.ones(shape=n_samples/2), 2*np.ones(shape=n_samples/2)]) (untested proposal).

duplicate the tests in the classification test file because changes have been made in this part of the code.

@thibaultcordier I fixed some issues and added some tests, tell me if it is ok now.

thibaultcordier commented 5 months ago

Wonderful! Thank you for this very welcome contribution. I don't have many comments to add but just a few to complete your proposal exhaustively:

some style code suggestions

As far as the tests are concerned, you are indeed testing whether MAPIE returns the same results using the same group (constant or None). But what happens if you use different groups like np.concat([np.ones(shape=n_samples/2), 2*np.ones(shape=n_samples/2)]) (untested proposal).

duplicate the tests in the classification test file because changes have been made in this part of the code.

@thibaultcordier I fixed some issues and added some tests, tell me if it is ok now.

Hello @pidefrem, I'll check your changes this week. Thank you for contacting me about the review. I'll keep you informed.

scikit-learn-contrib / MAPIE

Allow `MapieRegressor` to use K-fold iterator variants with stratification and groups. #393

Description

Type of change

How Has This Been Tested?

Checklist

Codecov Report