Closed antoinedemathelin closed 7 months ago
Hi @antoinedemathelin, the removal happens in the `Selector`, e.g. here for a single source/single target implementation. Would the result be different if you use `DomainAwareDataset` to construct the `X, y, sample_domain` tuple with `pack_train`?
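As a rough illustration of the kind of filtering described above (this is not skada's actual `Selector` code), the sketch below drops target rows before an estimator would be fitted. The helper name `select_source` and the sign convention (non-negative `sample_domain` values for source, negative for target) are assumptions made for this example only.

```python
import numpy as np


def select_source(X, y, sample_domain):
    """Hypothetical helper: keep only the rows that belong to a source domain."""
    X = np.asarray(X)
    y = np.asarray(y)
    sample_domain = np.asarray(sample_domain)
    # Assumption: non-negative domain ids mark source samples,
    # negative ids mark target samples.
    source_mask = sample_domain >= 0
    return X[source_mask], y[source_mask]


# Two source samples (label 1) and two target samples (label 0)
X = np.array([[0.0], [1.0], [0.0], [1.0]])
y = np.array([1.0, 1.0, 0.0, 0.0])
sample_domain = np.array([1, 1, -2, -2])

X_src, y_src = select_source(X, y, sample_domain)
# Only the source rows remain, so a downstream estimator never sees target labels.
```

If a step like this is skipped, or if the target labels are not masked beforehand, the final estimator receives the target labels as ordinary training data.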
I ran the example with proper masking of the inputs and found an old bug with regression masks. Please check out PR #86.
Hi @antoinedemathelin! I just noticed that the issue was automatically closed when the related PR was merged. Did you have a chance to verify that it works as expected now (or should we reopen the issue)?
Hi everyone,
When looking at the `plot_label_comparison` example, I observe very accurate predictions from the DA methods in cases where DA should fail. After some investigation, it seems that the estimator is fitted on both source and target labels, while it should be fitted only on source labels. I made a little example with KLIEP highlighting this behavior.
Here, the source and target inputs are the same (`X_source = X_target`), while the labels are different: `y_source = 1` and `y_target = 0`. If the target labels are unknown, KLIEP should predict 1, while it predicts 0.5 here. I suspect, then, that it has fitted the linear regressor on the target labels too.
When looking at the code of `make_da_pipeline`, I do not see where the target labels are removed before fitting the estimator.
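The reasoning behind the 0.5 value can be checked without skada. The sketch below is a minimal NumPy illustration, not the original example from this issue: it fits an ordinary least-squares regressor once on source data only and once on pooled source and target data. Uniform sample weights stand in for the KLIEP weights, on the assumption that they are roughly constant when the source and target inputs coincide. The helper `fit_predict_linear` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Same inputs in both domains, different labels
X_source = rng.normal(size=(50, 2))
X_target = X_source.copy()
y_source = np.ones(50)
y_target = np.zeros(50)


def fit_predict_linear(X_train, y_train, X_test):
    """Hypothetical helper: least-squares linear regression with an intercept."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    return A_test @ coef


# Fitted on source labels only: predictions on the target are about 1
pred_source_only = fit_predict_linear(X_source, y_source, X_target)

# Fitted on source AND target labels: every input has one label 1 and one
# label 0, so the least-squares solution is the constant 0.5
X_pooled = np.vstack([X_source, X_target])
y_pooled = np.concatenate([y_source, y_target])
pred_pooled = fit_predict_linear(X_pooled, y_pooled, X_target)

print(pred_source_only.mean(), pred_pooled.mean())
```

A prediction of 0.5 is therefore what one would expect if the target labels reach the final regressor unmasked, which is consistent with the suspicion above.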