Open geiseri opened 8 years ago
If you have a unit test case we can add to the test suite, it might help illuminate the issue.
Generally, if you need to special-case the handling of a specific class, it can either implement the __getstate__(self) -> data and __setstate__(self, data) protocol, or you can register a custom handler.
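As a rough illustration of the __getstate__/__setstate__ route, here is a minimal sketch in which __setstate__ fills in defaults for any keys missing from the stored data. The Regressor class and its DEFAULTS table are hypothetical, and jsonpickle itself is not imported; the restore step is simulated by allocating the instance without __init__ and calling the hook by hand.

```python
class Regressor:
    """Hypothetical stand-in for a class with defaulted __init__ parameters."""

    # Defaults shared by __init__ and __setstate__ (illustrative names only).
    DEFAULTS = {"n_estimators": 50, "learning_rate": 1.0, "loss": "linear"}

    def __init__(self, **params):
        unknown = set(params) - set(self.DEFAULTS)
        if unknown:
            raise TypeError(f"unexpected parameters: {sorted(unknown)}")
        self.__dict__.update({**self.DEFAULTS, **params})

    def __getstate__(self):
        # Return plain data describing the instance.
        return dict(self.__dict__)

    def __setstate__(self, data):
        # Start from the defaults so keys missing from `data` still exist.
        self.__dict__.update({**self.DEFAULTS, **data})


# Simulate a restore: allocate without running __init__, then apply the state.
partial = Regressor.__new__(Regressor)
partial.__setstate__({"n_estimators": 10})

print(partial.n_estimators, partial.learning_rate, partial.loss)
```

Note that a hook like this can only help when some state is actually passed to it, so it is worth checking how the decoder treats input that has no state entry at all.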
I have the same problem, and I think I found what happens. If I specify every parameter that the __init__ method takes, everything works correctly, but if I don't, jsonpickle sets them to None.
These are the default scikit-learn parameters:
jsonpickle.decode('''{
"py/object": "sklearn.ensemble.AdaBoostRegressor",
"py/state": {
"base_estimator": null,
"n_estimators": 50,
"learning_rate": 1.0,
"loss": "linear",
"random_state": null
}}''')
AdaBoostRegressor(base_estimator=None, learning_rate=1.0, loss='linear', n_estimators=50, random_state=None)
However if I do this:
jsonpickle.decode('''{"py/object": "sklearn.ensemble.AdaBoostRegressor"}''')
I get all Nones:
AdaBoostRegressor(base_estimator=None, learning_rate=None, loss=None, n_estimators=None, random_state=None)
I understand this behaviour: jsonpickle does not initialize the object; it unpickles it as if it had already been initialized and saved.
However, I think my problem is a fairly common use case, and it would be really cool if we could tell jsonpickle.decode to call the __init__ method instead of trying to load an already-saved object.
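Until something like that exists, the idea can be approximated outside jsonpickle. The sketch below is not a jsonpickle API: construct_from_json is a hypothetical helper that reads the same "py/object" and "py/state" keys with the standard json module and passes the state to the constructor as keyword arguments, so omitted parameters take the constructor's own defaults. It assumes a flat state whose keys match the constructor's parameter names, and it uses fractions.Fraction purely as a standard-library stand-in for the estimator.

```python
import importlib
import json


def construct_from_json(text):
    """Build an object by calling its constructor with the "py/state" items.

    Hypothetical helper: handles flat state only and imports whatever class
    the JSON names, so it should only be used on trusted input.
    """
    data = json.loads(text)
    module_name, _, class_name = data["py/object"].rpartition(".")
    cls = getattr(importlib.import_module(module_name), class_name)
    # Keys missing from the state fall back to the constructor's defaults.
    return cls(**data.get("py/state", {}))


# fractions.Fraction is only a stand-in for a class with defaulted parameters.
half = construct_from_json(
    '{"py/object": "fractions.Fraction", '
    '"py/state": {"numerator": 1, "denominator": 2}}'
)
zero = construct_from_json('{"py/object": "fractions.Fraction"}')
```

With no "py/state" entry at all, the constructor is called with no arguments, which is the behaviour being asked for here.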
EDIT: Actually you cannot use the loaded model that way:
See the model with all its params:
jsonpickle.encode(AdaBoostRegressor())
'{"py/object": "sklearn.ensemble.weight_boosting.AdaBoostRegressor", "py/state": {"_sklearn_version": "0.21.2", "base_estimator": null, "estimator_params": {"py/tuple": []}, "learning_rate": 1.0, "loss": "linear", "n_estimators": 50, "random_state": null}}'
However, this is not the same:
jsonpickle.encode(jsonpickle.decode('''{"py/object": "sklearn.ensemble.AdaBoostRegressor"}'''))
'{"py/object": "sklearn.ensemble.weight_boosting.AdaBoostRegressor", "py/state": {"_sklearn_version": "0.21.2"}}'
Actually I cannot fit with that last one:
X, y = load_boston(return_X_y=True)
jsonpickle.decode('''{"py/object": "sklearn.ensemble.AdaBoostRegressor"}''').fit(X, y)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-20-1a87b9f2bcc2> in <module>
1 from sklearn.datasets import load_iris, load_boston
2 X, y = load_boston(return_X_y=True)
----> 3 jsonpickle.decode('''{"py/object": "sklearn.ensemble.AdaBoostRegressor"}''').fit(X, y)
~/.conda/envs/xxxx/lib/python3.6/site-packages/sklearn/ensemble/weight_boosting.py in fit(self, X, y, sample_weight)
989 """
990 # Check loss
--> 991 if self.loss not in ('linear', 'square', 'exponential'):
992 raise ValueError(
993 "loss must be 'linear', 'square', or 'exponential'")
AttributeError: 'AdaBoostRegressor' object has no attribute 'loss'
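The AttributeError above is what one would expect from an instance that was allocated without its __init__ ever running. A minimal sketch of that mechanism in plain Python, using a hypothetical Model class rather than scikit-learn or jsonpickle:

```python
class Model:
    """Hypothetical class whose attributes are only set inside __init__."""

    def __init__(self, loss="linear"):
        self.loss = loss

    def fit(self):
        # Mirrors the failing check: it reads an attribute that __init__ sets.
        if self.loss not in ("linear", "square", "exponential"):
            raise ValueError("loss must be 'linear', 'square', or 'exponential'")
        return self


# Allocate an instance without running __init__, as an unpickler would.
bare = Model.__new__(Model)

try:
    bare.fit()
    error = None
except AttributeError as exc:
    # The attribute was never created, so the lookup inside fit() fails.
    error = exc
```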
Even if I write out all the parameters myself, I cannot use the model, because it has other private attributes:
jsonpickle.decode('''{
"py/object": "sklearn.ensemble.AdaBoostRegressor",
"py/state": {
"base_estimator": null,
"n_estimators": 50,
"learning_rate": 1.0,
"loss": "linear",
"random_state": null
}}''').fit(X,y)
AttributeError Traceback (most recent call last)
<ipython-input-22-12cce3852ba8> in <module>
7 "loss": "linear",
8 "random_state": null
----> 9 }}''').fit(X,y)
~/.conda/envs/xxxx/lib/python3.6/site-packages/sklearn/ensemble/weight_boosting.py in fit(self, X, y, sample_weight)
994
995 # Fit
--> 996 return super().fit(X, y, sample_weight)
997
998 def _validate_estimator(self):
~/.conda/envs/xxxx/lib/python3.6/site-packages/sklearn/ensemble/weight_boosting.py in fit(self, X, y, sample_weight)
148 X, y,
149 sample_weight,
--> 150 random_state)
151
152 # Early termination
~/.conda/envs/xxxx/lib/python3.6/site-packages/sklearn/ensemble/weight_boosting.py in _boost(self, iboost, X, y, sample_weight, random_state)
1039 If None then boosting has terminated early.
1040 """
-> 1041 estimator = self._make_estimator(random_state=random_state)
1042
1043 # Weighted sampling of the training set with replacement
~/.conda/envs/xxxx/lib/python3.6/site-packages/sklearn/ensemble/base.py in _make_estimator(self, append, random_state)
126 estimator = clone(self.base_estimator_)
127 estimator.set_params(**{p: getattr(self, p)
--> 128 for p in self.estimator_params})
129
130 if random_state is not None:
AttributeError: 'AdaBoostRegressor' object has no attribute 'estimator_params'
On restoring from JSON, the objects are created, but properties that are defined in __init__ are not created if they are missing from the JSON. I am not sure whether this is intended. Is there a way to define the behavior for missing properties in the object?
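For classes you control, one possible workaround relies only on ordinary Python attribute lookup, not on any jsonpickle feature: declare the defaults at class level, so an instance that never ran __init__ still resolves them. A minimal sketch with a hypothetical Settings class, simulating a restore that supplied only one key:

```python
class Settings:
    """Hypothetical class with class-level defaults for its parameters."""

    # Class attributes act as fallbacks when an instance attribute is absent.
    n_estimators = 50
    loss = "linear"

    def __init__(self, n_estimators=50, loss="linear"):
        self.n_estimators = n_estimators
        self.loss = loss


# Simulate a restore that skipped __init__ and only supplied one key.
restored = Settings.__new__(Settings)
restored.__dict__.update({"n_estimators": 10})

print(restored.n_estimators, restored.loss)
```

This does not help with third-party classes such as the scikit-learn estimators discussed above, since their definitions cannot be edited this way.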