Closed shania-m closed 2 years ago
Scikit-learn defines decision_function as:
In a fitted classifier or outlier detector, predicts a “soft” score for each sample in relation to each class, rather than the “hard” categorical prediction produced by predict. Its input is usually only some observed data, X.
If the estimator was not already fitted, calling this method should raise a exceptions.NotFittedError.
Scikit-learn's decision_function is used to predict soft scores for samples. Because predictions are out of scope for Rubicon, there is no need to implement logging for decision_function.
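To make the "soft" versus "hard" distinction concrete, here is a minimal sketch using a binary LogisticRegression on synthetic data. The dataset and variable names are illustrative assumptions, not part of the original request.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary problem, used only for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
clf = LogisticRegression().fit(X, y)

soft = clf.decision_function(X)  # one signed score per sample
hard = clf.predict(X)            # one class label per sample

# For a binary linear classifier, the hard label follows the sign of the soft score.
derived = clf.classes_[(soft > 0).astype(int)]
labels_match = bool(np.array_equal(hard, derived))
```

In this binary case the label from predict can be recovered from the sign of decision_function, which is why the latter reads as a prediction step rather than a scoring step.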
Note that in cases like the EllipticEnvelope estimator, decision_function is called by predict, which is used in score. Additionally, decision_function utilizes score_samples.
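The chain described above can be checked with a small sketch. It assumes random Gaussian data and relies on the estimator's fitted offset_ attribute; the variable names are illustrative.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope

# Random 2-D data, used only for illustration.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 2))
env = EllipticEnvelope(random_state=0).fit(X)

raw = env.score_samples(X)          # negative Mahalanobis distances
shifted = env.decision_function(X)  # raw scores shifted by the fitted offset
labels = env.predict(X)             # +1 for inliers, -1 for outliers

# decision_function builds on score_samples ...
offset_applied = bool(np.allclose(shifted, raw - env.offset_))
# ... and predict thresholds decision_function at zero.
sign_gives_label = bool(np.array_equal(labels, np.where(shifted >= 0, 1, -1)))
```

Both flags should come out True, which suggests that logging score for this estimator already captures the effect of decision_function indirectly.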
score_samples, on the other hand, computes a score for each individual sample. The sum or the mean of these per-sample scores is used to calculate score() for many estimators, such as the FactorAnalysis estimator, PCA estimator, BayesianGaussianMixture estimator, and HalvingGridSearchCV estimator. score_samples can also be used for density estimation. Scikit-learn examples show score_samples being used in Density Estimation, Density Estimation for a Gaussian Mixture, Kernel Density Estimation for Species Distributions, and Simple 1D Kernel Density Estimation.
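As a rough illustration of the "sum or mean" relationship, the sketch below compares score() with an aggregate of score_samples() for PCA (mean) and for KernelDensity (sum). KernelDensity is chosen here as a density-estimation example; the data and the bandwidth value are assumptions made for the demo.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KernelDensity

# Random data, used only for illustration.
rng = np.random.RandomState(0)
X = rng.normal(size=(100, 3))

# PCA: score() is the mean of the per-sample log-likelihoods.
pca = PCA(n_components=2).fit(X)
pca_mean_matches = bool(np.isclose(pca.score(X), pca.score_samples(X).mean()))

# KernelDensity: score() is the sum of the per-sample log-densities.
kde = KernelDensity(bandwidth=0.5).fit(X)
kde_sum_matches = bool(np.isclose(kde.score(X), kde.score_samples(X).sum()))
```

Because the aggregate differs by estimator, logging only score() loses the per-sample detail that score_samples() exposes.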
Since score_samples() can be used explicitly for density estimation and is used to calculate score(), Rubicon should support logging for score_samples(). Similar to the solution proposed in #176, when a user calls score_samples(), a new experiment should be opened unless the user specifies which experiment to log to.
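One possible shape for this behavior is sketched below. It deliberately does not use Rubicon's API: InMemoryExperimentLog and ScoreSamplesLogger are hypothetical stand-ins, and logging the mean of the per-sample scores is an assumption about what might be worth recording.

```python
import numpy as np
from sklearn.decomposition import PCA


class InMemoryExperimentLog:
    """Hypothetical stand-in for an experiment store; not Rubicon's API."""

    def __init__(self):
        self.experiments = []

    def new_experiment(self):
        experiment = {"metrics": {}}
        self.experiments.append(experiment)
        return experiment


class ScoreSamplesLogger:
    """Hypothetical wrapper that logs a summary of score_samples output."""

    def __init__(self, estimator, log):
        self.estimator = estimator
        self.log = log

    def score_samples(self, X, experiment=None):
        # Open a new experiment unless the caller supplies one.
        if experiment is None:
            experiment = self.log.new_experiment()
        scores = self.estimator.score_samples(X)
        # Assumption: record the mean per-sample score as a single metric.
        experiment["metrics"]["score_samples_mean"] = float(np.mean(scores))
        return scores


rng = np.random.RandomState(0)
X = rng.normal(size=(50, 3))
log = InMemoryExperimentLog()
wrapped = ScoreSamplesLogger(PCA(n_components=2).fit(X), log)

wrapped.score_samples(X)                       # opens a first experiment
wrapped.score_samples(X)                       # opens a second experiment
existing = log.experiments[0]
wrapped.score_samples(X, experiment=existing)  # logs to the first experiment
```

After these three calls the log holds two experiments, since the third call reuses an existing one instead of opening a new one.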
Is your enhancement request related to a problem? Please describe
Scikit-learn's pipeline API provides two additional methods that are not covered by Rubicon's Scikit-learn integration: decision_function and score_samples.
Further investigate decision_function and score_samples to determine if these should be integrated into Rubicon.
Additional Sources
decision_function in Scikit-learn examples:
score_samples in Scikit-learn examples: