scikit-adaptation / skada

Domain adaptation toolbox compatible with scikit-learn and pytorch
https://scikit-adaptation.github.io/
BSD 3-Clause "New" or "Revised" License

Selector to avoid filtering out masked samples when fitting transformer #129

Closed kachayev closed 7 months ago

kachayev commented 7 months ago

This is an addition to the functionality implemented in #123.

The question here is the following:

```python
pipe = make_da_pipeline(
    StandardScaler(), SubspaceAlignmentAdapter(), LogisticRegression()
)
pipe.fit(X=X_train, y=y_train, sample_domain=sample_domain)
```

Assume y_train is properly masked. Which samples should StandardScaler then be fitted on: sources only, or both sources and targets?

This PR makes StandardScaler receive both source and target samples, since its fit does not require labels. It previously worked this way, and this seems like a much stronger default. For non-default behavior, we can still wrap the transformer in a proper selector once those are ready (see #116).
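To make the difference concrete, here is a minimal sketch that does not use skada at all. It assumes masked labels are marked with `-1`, and `fit_scaler` is a hypothetical helper that stands in for the `fit` of an unsupervised scaler on a single feature. It only illustrates how the fitted statistics change when masked (target) samples are filtered out versus kept.

```python
from statistics import fmean, pstdev

MASKED_LABEL = -1  # assumption: marker used in this sketch for a hidden label


def fit_scaler(x, y, keep_masked=True):
    """Return (mean, std) of a 1-D feature, mimicking an unsupervised fit.

    With keep_masked=False, samples whose label is masked are dropped first.
    The labels are used only for filtering, never for the statistics.
    """
    if not keep_masked:
        x = [xi for xi, yi in zip(x, y) if yi != MASKED_LABEL]
    return fmean(x), pstdev(x)


# Toy data: three labelled source samples followed by three masked targets.
x_train = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
y_train = [0, 1, 0, MASKED_LABEL, MASKED_LABEL, MASKED_LABEL]

# Keeping masked samples: statistics cover sources and targets.
mean_all, std_all = fit_scaler(x_train, y_train, keep_masked=True)

# Filtering masked samples: statistics cover sources only.
mean_src, std_src = fit_scaler(x_train, y_train, keep_masked=False)

print(mean_all, mean_src)
```

When the targets are shifted relative to the sources, as in this toy data, a scaler fitted on sources only never sees that shift, which is the argument for keeping masked samples by default for transformers that ignore `y`.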

Let me know WDYT.

codecov[bot] commented 7 months ago

Codecov Report

Attention: Patch coverage is 95.83333%, with 1 line in your changes missing coverage. Please review.

Project coverage is 92.02%. Comparing base (ba065a8) to head (2a9774f).

| Files | Patch % | Lines |
|---|---|---|
| skada/tests/test_selector.py | 94.44% | 1 Missing :warning: |
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #129      +/-   ##
==========================================
+ Coverage   83.07%   92.02%   +8.95%
==========================================
  Files          43       43
  Lines        3485     3500      +15
==========================================
+ Hits         2895     3221     +326
+ Misses        590      279     -311
```
