david-cortes / contextualbandits

Python implementations of contextual bandits algorithms
http://contextual-bandits.readthedocs.io
BSD 2-Clause "Simplified" License

Unable to pickle batch training models #5

Closed: kaustubrao closed this issue 5 years ago

kaustubrao commented 5 years ago

I think this is because _robust_predict, which is used for batch training, is a bound method and does not get pickled properly. I get the following error when I try to load the pickle file:

AttributeError: 'SGDClassifier' object has no attribute '_robust_predict'

david-cortes commented 5 years ago

I'm unable to reproduce the error. Does the following work for you?

import pickle
import numpy as np
from sklearn.linear_model import SGDClassifier
from contextualbandits.online import BootstrappedUCB  # the policy class used below

# Synthetic data: 1000 contexts with 20 features, 5 arms, binary rewards
X = np.random.normal(size=(1000, 20))
a = np.random.randint(5, size=1000)
r = (np.random.random(size=1000) >= 0.4).astype('int64')

# loss="log" is named "log_loss" in recent scikit-learn releases
m = BootstrappedUCB(SGDClassifier(loss="log"), nchoices=5)
m.partial_fit(X, a, r)
with open("cb_pickle.p", "wb") as f:
    pickle.dump(m, f)

Then loading with

import pickle
with open("cb_pickle.p", "rb") as f:
    m = pickle.load(f)

What Python version are you using? Are you trying to load it on a different Python version (or virtual environment) than the one you saved the model in?

kaustubrao commented 5 years ago

In the code you shared, batch_train has not been set to True. With that code, pickling and loading work fine, but when I call partial_fit on new data points after loading, I get the following error:

AttributeError: '_ArrBSClassif' object has no attribute 'beta_counters'

If I set batch_train to True and run the same code, the pickled object does not load and I get this error:

AttributeError: 'SGDClassifier' object has no attribute '_robust_predict'

This error occurs even if I pickle and load in the same notebook, so I don't think it is due to a difference in Python versions (I'm using Python 3.6.4).
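The second error can be reproduced without the library. This is a minimal sketch, assuming the helper is attached to the estimator instance as a bound method (inferred from the error message, not checked against the library source). FakeEstimator is a hypothetical stand-in for the base classifier, and only the name _robust_predict comes from the traceback:

```python
import pickle
import types


class FakeEstimator:
    """Hypothetical stand-in for a base classifier such as SGDClassifier."""

    def __init__(self):
        self.coef = 2.0


def _robust_predict(self, x):
    # Hypothetical helper body; only the name matches the error message.
    return self.coef * x


est = FakeEstimator()
# Assumption: the helper is stored on the instance as a bound method.
est._robust_predict = types.MethodType(_robust_predict, est)

# Dumping succeeds: pickle records the bound method as
# getattr(est, "_robust_predict") inside the instance state.
blob = pickle.dumps(est)

# Loading replays that getattr on the half-built instance, before its
# attributes have been restored, so the lookup fails.
try:
    pickle.loads(blob)
    load_error = None
except AttributeError as exc:
    load_error = str(exc)

print(load_error)
```

Under this assumption the dump succeeds and only the load fails, which would explain why the problem shows up even in the same notebook and Python version.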

david-cortes commented 5 years ago

Ok, got the error now, will try to fix it later today. Thanks for reporting the problem!

david-cortes commented 5 years ago

After taking a deeper look, I think making this work with pickle would mean modifying a large fraction of the code in order not to break other things, which I probably won’t do in the near future. In the meantime, I suggest you use dill instead of pickle, since it doesn’t have problems serializing bound methods.
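A minimal sketch of the swap, for reference. dill.dump and dill.load take the same arguments as their pickle counterparts. save_model and load_model are hypothetical helper names, not part of contextualbandits, and dill is a third-party package that has to be installed separately:

```python
def save_model(model, path):
    """Serialize `model` to `path` with dill instead of pickle."""
    import dill  # third-party: pip install dill

    with open(path, "wb") as f:
        dill.dump(model, f)


def load_model(path):
    """Load an object previously written by save_model."""
    import dill  # third-party: pip install dill

    with open(path, "rb") as f:
        return dill.load(f)
```

With the fitted policy m from the earlier snippet, this would be used as save_model(m, "cb_dill.p") followed by m = load_model("cb_dill.p").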

kaustubrao commented 5 years ago

I tried using dill. Seems to be working fine. Thanks for the tip!