The original idea was to get the ability to propagate labels, but it turned out there are a few other things we need to take care of. The list of changes:
Each selector converts the input of fit_transform into a MetadataContainer that carries information about features, labels and additional named parameters (metadata). If the selector receives a container, it uses the one from the input.
Each selector is responsible for unpacking the container before passing arguments to the underlying estimator(s) and for merging the result(s) back into the container.
adapt is no longer part of the BaseAdapter interface; use fit_transform (adaptation at fit time) and transform (predict time) instead. By default, BaseAdapter just passes samples through without changing them. You can still change this behavior for a given adapter if you feel the transformation makes sense on its own (e.g. this would be true for subspace methods).
Instead of AdaptationOutput, an adapter is supposed to return one of: a) samples X, b) a pair X, params, where params is a dictionary of additional keys (e.g. sample_weight), c) a triple X, y, params.
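To illustrate the three accepted return shapes, here is a minimal sketch of how a selector could normalize an adapter's output before merging it back into the container. The helper name normalize_adapter_output is hypothetical and not part of this PR; the real merging logic lives in the selectors and may differ.

```python
def normalize_adapter_output(output, y=None):
    """Normalize an adapter's return value to a (X, y, params) triple.

    Hypothetical helper for illustration only. Accepted shapes:
      a) X                -> labels are kept, no extra params
      b) (X, params)      -> labels are kept, params is a dict
      c) (X, y, params)   -> labels are replaced by the adapter's y
    """
    if isinstance(output, tuple):
        if len(output) == 2:
            X, params = output
            return X, y, dict(params)
        if len(output) == 3:
            X, new_y, params = output
            return X, new_y, dict(params)
        raise ValueError(
            f"Expected a tuple of length 2 or 3, got length {len(output)}"
        )
    # Shape a): bare samples, nothing else to merge.
    return output, y, {}


# Example: a reweighting adapter returns samples plus sample weights.
X = [[1.0], [2.0]]
y = [0, 1]
X_out, y_out, params = normalize_adapter_output(
    (X, {"sample_weight": [0.5, 0.5]}), y=y
)
```

In this sketch the original labels are preserved unless the adapter explicitly returns new ones, which is the label propagation this PR is after.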
There are a few things I need to take care of:
[x] Restore functionality of IncompatibleMetadataError
[x] Update all adapters to avoid adapt
[x] Re-write tests for adaptation output propagation
[x] Fix tests for MMD reweight adapter
[x] Make reweight adapters provide a transform that performs adapt but discards the weights
[ ] Properly update documentation regarding the change in the API
[ ] Debug low score for CORAL mapping tests
[ ] In plot_subspace, the quality of SubspaceAlignment is lower than with no adaptation
Could be done later:
[x] Make PerDomain work
[x] Make SelectSource work
[x] Make SelectTarget work
[x] Make SelectSourceTarget work
[x] Remove AdaptationOutput (maybe a separate task)
[x] Add test with TransformerMixin
[ ] Properly deal with metadata routing parameter renaming (maybe a separate task)
[ ] Make sure that the container is serializable for joblib/Memory settings for the pipeline
[ ] Cover the situation where the adapter output is consumed by a non-final estimator
[ ] Think about a better API for selectors (specifically around fit, fit_transform, etc.)