Open hombit opened 7 months ago
Do these names (`fit_known`, `tune_known`) exist in sklearn?
Related to #113
> Do these names (`fit_known`, `tune_known`) exist in sklearn?

No, but it would be weird if `.fit` and `fit_known` behaved differently in this way.
Then I would propose the following alternative, since having three functions seems redundant:
- `.fit(data, known_labels=None, known_data=None)` would refit every time it is called, dropping all previous training.
- `.fit_known(known_labels, known_data=None)` doesn't accept `data` and doesn't refit.

Mind the order of `known_labels` and `known_data`. If `known_data` is omitted, then `known_labels` are associated with `data` itself.
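
To make the two-method alternative concrete, here is a minimal sketch in plain Python. The class name `TwoMethodSketch`, the attributes, and the list-based "model" are hypothetical placeholders and not the coniferest implementation. Two behaviours are assumptions the proposal does not spell out: that `.fit` forwards labels to `.fit_known`, and that `.fit_known` raises if called before `.fit`.

```python
class TwoMethodSketch:
    """Hypothetical stand-in used only to illustrate the proposed call semantics."""

    def __init__(self):
        self.data = None    # data the base model was last fitted on
        self.known = None   # (sample, label) pairs used for tuning
        self.n_fits = 0     # counts full refits of the base model

    def fit(self, data, known_labels=None, known_data=None):
        # Every call refits from scratch and drops all previous training.
        self.data = list(data)
        self.known = None
        self.n_fits += 1
        if known_labels is not None:
            # Assumption: fit() delegates the labeled part to fit_known().
            self.fit_known(known_labels, known_data)
        return self

    def fit_known(self, known_labels, known_data=None):
        # Does not accept `data` and does not refit the base model.
        if self.data is None:
            # Assumption: calling this before fit() is an error.
            raise RuntimeError("call fit() before fit_known()")
        if known_data is None:
            # Labels are associated with the fitted data itself.
            known_data = self.data
        if len(known_labels) != len(known_data):
            raise ValueError("known_labels and known_data differ in length")
        self.known = list(zip(known_data, known_labels))
        return self


model = TwoMethodSketch()
model.fit([10, 20, 30], known_labels=[1, -1, 1])  # arbitrary illustrative labels
model.fit_known([-1], [99])                       # tunes only, no refit
```

Note the argument order: labels come first in `.fit_known`, so a positional call such as `fit_known(known_data, known_labels)` would silently swap the two.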
I would try to be duck-consistent with `scikit-learn`, including `.fit(X, y)`. And `.fit(data, known_labels)` is mostly the same as `.fit(X, y)`.
Currently, `AADForest.fit()` would do nothing when called a second time, while with `scikit-learn` it would cause model retraining. The same applies to `fit_known()`, which just ignores the `data` argument, even if it differs from the `data` the model was previously trained with.

Here I propose to modify the `Coniferest` interface to add an additional method, `.tune_known(known_data, known_labels)`. In this case:
- `.fit(data)` would refit every time it is called, dropping all previous training
- `.fit_known(data, known_data, known_labels)` would also refit
- `.tune_known(known_data, known_labels)` would use the same "base" (isolation forest) model and tune it for labeled data. It would fail if called before `.fit` or `.fit_known`
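
The three-method contract above could look roughly like the following sketch. `ThreeMethodSketch`, its attributes, and the choice of `RuntimeError` are hypothetical and chosen for illustration; the real base model would be an isolation forest, which is replaced here by a plain list so that the example stays self-contained.

```python
class ThreeMethodSketch:
    """Hypothetical stand-in illustrating fit / fit_known / tune_known."""

    def __init__(self):
        self.base_model = None  # placeholder for the "base" isolation forest
        self.known = None       # (sample, label) pairs the model is tuned on
        self.n_base_fits = 0    # how many times the base model was rebuilt

    def fit(self, data):
        # Refit on every call and drop all previous training, tuning included.
        self.base_model = list(data)
        self.known = None
        self.n_base_fits += 1
        return self

    def fit_known(self, data, known_data, known_labels):
        # Also refits: rebuild the base model, then tune it on labeled data.
        self.fit(data)
        return self.tune_known(known_data, known_labels)

    def tune_known(self, known_data, known_labels):
        # Reuses the existing base model; fails if there is none yet.
        if self.base_model is None:
            raise RuntimeError("call fit() or fit_known() before tune_known()")
        if len(known_data) != len(known_labels):
            raise ValueError("known_data and known_labels differ in length")
        self.known = list(zip(known_data, known_labels))
        return self


forest = ThreeMethodSketch()
forest.fit([1, 2, 3])
forest.fit([4, 5])              # second call retrains instead of doing nothing
forest.tune_known([4], [-1])    # tunes without rebuilding the base model
```

Splitting tuning from fitting this way keeps `.fit` aligned with the scikit-learn expectation that repeated calls retrain, while repeated labeling rounds only go through `.tune_known`.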