aai-institute / kyle

A library for calibrating classifiers and computing calibration metrics

HistogramBinning cannot be cloned #2

Closed by AnesBenmerzoug 2 years ago

AnesBenmerzoug commented 2 years ago

Explanation

Instances of HistogramBinning cannot be cloned using sklearn's clone function. This matters because clone is called internally by other sklearn utilities, for example cross_val_score, so the estimator cannot be used with them either.

The error occurs because the HistogramBinning estimator class does not follow sklearn's requirements for an estimator, specifically the following one:

In addition, every keyword argument accepted by __init__ should correspond to an attribute on the instance. Scikit-learn relies on this to find the relevant attributes to set on an estimator when doing model selection.
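
To make the requirement concrete, here is a minimal sketch of the lookup that fails. It does not use sklearn's actual code: get_params_sketch and clone_sketch are simplified, hypothetical stand-ins for get_params(deep=False) and clone, and the two small classes are made up for illustration. The idea is only that each __init__ argument name is read back as an attribute of the same name.

```python
import inspect


def get_params_sketch(estimator):
    """Simplified stand-in for get_params(deep=False); not sklearn's code."""
    signature = inspect.signature(type(estimator).__init__)
    skipped_kinds = (
        inspect.Parameter.VAR_POSITIONAL,
        inspect.Parameter.VAR_KEYWORD,
    )
    names = [
        name
        for name, param in signature.parameters.items()
        if name != "self" and param.kind not in skipped_kinds
    ]
    # Each __init__ argument is looked up as an attribute of the same name.
    return {name: getattr(estimator, name) for name in names}


def clone_sketch(estimator):
    """Simplified stand-in for clone: rebuild from the collected parameters."""
    return type(estimator)(**get_params_sketch(estimator))


class WithoutAttribute:
    """Hypothetical class: consumes `bins` but never stores it as self.bins."""

    def __init__(self, bins=20):
        self.model = {"bins": bins}


class WithAttribute:
    """Hypothetical class: stores `bins` under its own name."""

    def __init__(self, bins=20):
        self.bins = bins
        self.model = {"bins": bins}


copy = clone_sketch(WithAttribute(bins=5))
print(copy.bins)  # 5

try:
    clone_sketch(WithoutAttribute(bins=5))
except AttributeError as error:
    # Same kind of failure as in the traceback: no attribute named 'bins'.
    print(error)
```

The class that only forwards bins to an inner object fails for the same reason HistogramBinning does: nothing named bins exists on the instance when the parameters are collected.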

How to reproduce

from kyle.calibration.calibration_methods import HistogramBinning
from sklearn import clone

clone(HistogramBinning())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/anesbenmerzoug/workdir/repositories/transferlab/experiments-for-the-calibration-paper/venv/lib/python3.9/site-packages/sklearn/base.py", line 84, in clone
    new_object_params = estimator.get_params(deep=False)
  File "/home/anesbenmerzoug/workdir/repositories/transferlab/experiments-for-the-calibration-paper/venv/lib/python3.9/site-packages/sklearn/base.py", line 210, in get_params
    value = getattr(self, key)
AttributeError: 'HistogramBinning' object has no attribute 'bins'

Suggested fix

To fix this we could simply change the HistogramBinning class from:

class HistogramBinning(NetcalBasedCalibration[bn.HistogramBinning]):
    def __init__(self, bins=20):
        super().__init__(bn.HistogramBinning(bins=bins))

to:

class HistogramBinning(NetcalBasedCalibration[bn.HistogramBinning]):
    def __init__(self, bins=20):
        # Store the argument under its own name so get_params() and clone() can find it.
        self.bins = bins
        super().__init__(bn.HistogramBinning(bins=bins))
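
The effect of the change can be checked without kyle or netcal installed. The sketch below assumes only sklearn; StandInBinning and StandInCalibration are hypothetical stand-ins for bn.HistogramBinning and NetcalBasedCalibration (whose real definitions may differ), and HistogramBinningFixed mirrors the proposed class.

```python
from sklearn.base import BaseEstimator, clone


class StandInBinning:
    """Hypothetical stand-in for netcal's bn.HistogramBinning."""

    def __init__(self, bins=20):
        self.bins = bins


class StandInCalibration(BaseEstimator):
    """Hypothetical stand-in for NetcalBasedCalibration: wraps a model."""

    def __init__(self, netcal_model):
        self.netcal_model = netcal_model


class HistogramBinningFixed(StandInCalibration):
    """Mirrors the proposed fix: `bins` is stored on the instance."""

    def __init__(self, bins=20):
        self.bins = bins
        super().__init__(StandInBinning(bins=bins))


original = HistogramBinningFixed(bins=5)
copy = clone(original)  # no AttributeError once self.bins exists

print(copy is original)   # False: clone builds a new, unfitted instance
print(copy.get_params())  # {'bins': 5}
```

A check along these lines, run against the real HistogramBinning, could serve as a regression test for this issue.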

MischaPanch commented 2 years ago

Fixed in #3