snad-space / coniferest

https://coniferest.snad.space
MIT License

Multiple calls of .fit() have inconsistent behavior with scikit-learn #167

Open hombit opened 7 months ago

hombit commented 7 months ago

Currently, AADForest.fit() does nothing when called a second time, whereas with scikit-learn a second call retrains the model. The same applies to fit_known(), which just ignores its data argument, even if it differs from the data the model was previously trained on.
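To make the difference concrete, here is a minimal sketch with two toy estimators. Both classes are hypothetical and written only for illustration; neither is Coniferest or scikit-learn code. `RefitOnEveryCall` follows the scikit-learn convention, and `FitOnlyOnce` mimics the no-op behavior described above.

```python
class RefitOnEveryCall:
    """Toy estimator (hypothetical): every fit() call retrains from scratch."""

    def fit(self, data):
        # Anything learned by an earlier call is overwritten.
        self.mean_ = sum(data) / len(data)
        return self


class FitOnlyOnce:
    """Toy estimator (hypothetical): a second fit() call is silently ignored."""

    def __init__(self):
        self.mean_ = None

    def fit(self, data):
        # Already trained: the new data is dropped without any warning.
        if self.mean_ is not None:
            return self
        self.mean_ = sum(data) / len(data)
        return self


refit = RefitOnEveryCall().fit([1.0, 2.0, 3.0]).fit([10.0, 20.0, 30.0])
once = FitOnlyOnce().fit([1.0, 2.0, 3.0]).fit([10.0, 20.0, 30.0])

print(refit.mean_)  # 20.0, reflects the second data set
print(once.mean_)   # 2.0, still reflects the first data set
```

The second pattern is the surprising one for scikit-learn users, since the model keeps its old state without reporting that the new data was ignored.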

Here I propose extending the Coniferest interface with an additional method, .tune_known(known_data, known_labels). In this case:

matwey commented 7 months ago

Do these names (fit_known, tune_known) exist in sklearn?

matwey commented 7 months ago

Related to #113

hombit commented 7 months ago

Do these names (fit_known, tune_known) exist in sklearn?

No, but it would be weird if the behavior of .fit and fit_known differed in this way.

matwey commented 7 months ago

Then I would propose the following alternative since it seems that having three functions is redundant:

hombit commented 7 months ago

I would try to be duck-consistent with scikit-learn, including .fit(X, y)

matwey commented 7 months ago

And .fit(data, known_labels) is mostly the same as .fit(X, y).
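
A rough sketch of what such a signature could look like, purely as an illustration of the analogy with .fit(X, y): `ToyForest`, its attributes, and the use of `None` to mark an unlabeled sample are all assumptions made for this example, not Coniferest's actual API.

```python
class ToyForest:
    """Hypothetical estimator; not Coniferest's actual API."""

    def fit(self, data, known_labels=None):
        # Retrain from scratch on every call, as scikit-learn estimators do.
        self.n_samples_ = len(data)
        if known_labels is None:
            # No labels given: purely unsupervised training.
            self.n_labeled_ = 0
        else:
            if len(known_labels) != len(data):
                raise ValueError("known_labels must have one entry per sample")
            # Assumption for this sketch: None marks an unlabeled sample.
            self.n_labeled_ = sum(1 for label in known_labels if label is not None)
        return self


data = [[0.1, 0.2], [0.3, 0.4], [5.0, 6.0]]

unsupervised = ToyForest().fit(data)
with_labels = ToyForest().fit(data, known_labels=[None, None, -1])

print(unsupervised.n_labeled_)  # 0
print(with_labels.n_labeled_)   # 1
```

With an optional second argument, the same method covers both the unsupervised call and the call with known labels, which mirrors how scikit-learn estimators accept `y=None`.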