Instances of HistogramBinning cannot be cloned with sklearn's clone function.
This matters because clone is called internally by sklearn utilities such as cross_val_score.
The error occurs because the HistogramBinning estimator class does not follow one of sklearn's requirements for estimators, specifically the following:
In addition, every keyword argument accepted by __init__ should correspond to an attribute on the instance. Scikit-learn relies on this to find the relevant attributes to set on an estimator when doing model selection.
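To make the requirement concrete, here is a minimal standard-library sketch of the lookup that get_params performs. It is a simplification for illustration, not sklearn's actual code: SketchEstimator, WithoutAttribute and WithAttribute are hypothetical names, and the dictionary stored in _model merely stands in for a wrapped model.

```python
import inspect


class SketchEstimator:
    """Simplified, hypothetical stand-in for sklearn's BaseEstimator."""

    def get_params(self):
        # Parameter names are read from the __init__ signature ...
        signature = inspect.signature(type(self).__init__)
        names = [name for name in signature.parameters if name != "self"]
        # ... and each value is looked up as an attribute of the same name.
        return {name: getattr(self, name) for name in names}


class WithoutAttribute(SketchEstimator):
    def __init__(self, bins=20):
        # `bins` is only passed on; it is never stored on the instance.
        self._model = {"bins": bins}


class WithAttribute(SketchEstimator):
    def __init__(self, bins=20):
        # Stored under the same name as the keyword argument.
        self.bins = bins
        self._model = {"bins": bins}


print(WithAttribute(bins=10).get_params())

try:
    WithoutAttribute().get_params()
except AttributeError as error:
    # Same kind of failure as in the traceback reported in this issue.
    print(error)
```

The second class fails for the same reason HistogramBinning does: the argument is accepted by __init__ but there is no attribute of that name to read back.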
How to reproduce
from kyle.calibration.calibration_methods import HistogramBinning
from sklearn import clone
clone(HistogramBinning())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/anesbenmerzoug/workdir/repositories/transferlab/experiments-for-the-calibration-paper/venv/lib/python3.9/site-packages/sklearn/base.py", line 84, in clone
    new_object_params = estimator.get_params(deep=False)
  File "/home/anesbenmerzoug/workdir/repositories/transferlab/experiments-for-the-calibration-paper/venv/lib/python3.9/site-packages/sklearn/base.py", line 210, in get_params
    value = getattr(self, key)
AttributeError: 'HistogramBinning' object has no attribute 'bins'
Suggested fix
To fix this we could simply change the HistogramBinning class from:
class HistogramBinning(NetcalBasedCalibration[bn.HistogramBinning]):
    def __init__(self, bins=20):
        super().__init__(bn.HistogramBinning(bins=bins))
to:
class HistogramBinning(NetcalBasedCalibration[bn.HistogramBinning]):
    def __init__(self, bins=20):
        self.bins = bins
        super().__init__(bn.HistogramBinning(bins=bins))
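The effect of the proposed change can be checked without kyle or netcal installed. The sketch below uses hypothetical stand-ins: WrappedModel plays the role of bn.HistogramBinning, and BrokenBinning and FixedBinning mimic the class before and after the fix. It assumes, as the traceback suggests, that the real class gets get_params from sklearn's BaseEstimator.

```python
from sklearn.base import BaseEstimator, clone


class WrappedModel:
    """Hypothetical stand-in for netcal's bn.HistogramBinning."""

    def __init__(self, bins):
        self.bins = bins


class BrokenBinning(BaseEstimator):
    """Mimics the current class: `bins` is not stored on the instance."""

    def __init__(self, bins=20):
        self.model = WrappedModel(bins=bins)


class FixedBinning(BaseEstimator):
    """Mimics the suggested fix: `bins` is stored before wrapping."""

    def __init__(self, bins=20):
        self.bins = bins
        self.model = WrappedModel(bins=bins)


# With the attribute in place, clone can read the parameter and rebuild.
cloned = clone(FixedBinning(bins=10))
print(cloned.bins)

try:
    clone(BrokenBinning())
except AttributeError as error:
    # Without it, get_params fails while looking up `bins`.
    print(error)
```

A check along these lines, run against the real HistogramBinning, could serve as a regression test for the fix.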