Xtra-Computing / thundersvm

ThunderSVM: A Fast SVM Library on GPUs and CPUs
Apache License 2.0
1.57k stars 217 forks

Unable to pickle ensemble model that includes thundersvm (SVC) #157

Closed: tigertimwu closed this issue 4 years ago

tigertimwu commented 5 years ago

Hi,

I was trying to pickle an EnsembleVoteClassifier from mlxtend in which one of the classifiers is thundersvm. However, I am unable to pickle the EnsembleVoteClassifier with pickle.dump() as long as the thundersvm classifier is included. The following error is raised:

PicklingError: Can't pickle <class 'thundersvm.thundersvm.c_float_Array_629'>: attribute lookup c_float_Array_629 on thundersvm.thundersvm failed.

Is there any way that I can save down the ensemble classifier that includes thundersvm?

Thank you

zeyiwen commented 5 years ago

Would you provide a minimal example to help us reproduce the problem?

tigertimwu commented 5 years ago

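# Imports assumed from context; the original snippet omitted them.
import pickle
from mlxtend.classifier import EnsembleVoteClassifier
from mlxtend.feature_selection import ColumnSelector
from sklearn.pipeline import make_pipeline
from thundersvm import SVC

# Load the previously trained thundersvm and xgboost models.
clf_1 = SVC()
clf_1.load_from_file(path=svm_path + clf1_file_name)
clf_2 = pickle.load(open(xg_boost_path + clf2_file_name, 'rb'))

# Each pipeline selects its own feature columns before the classifier.
pipe1 = make_pipeline(ColumnSelector(cols=df1.columns.tolist()), clf_1)
pipe2 = make_pipeline(ColumnSelector(cols=df2.columns.tolist()), clf_2)

eclf = EnsembleVoteClassifier(clfs=[pipe1, pipe2], weights=[1, 1], refit=False, voting='hard', verbose=2)
labels = ['svm', 'xgboost', 'ensemble']
eclf.fit(X_train_final, y_train)

# Raw string so the backslash in the Windows path is not read as an escape.
model_name = open(r'D:\ensemble_test.model', 'wb')
pickle.dump(eclf, model_name)
model_name.close()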

The error occurs at the line pickle.dump(eclf, model_name). clf_1 is the thundersvm model and clf_2 is the xgboost model. I have no problem pickling eclf (i.e. the ensemble model) if all classifiers in the ensemble are xgboost.

Thanks

zeyiwen commented 5 years ago

Thanks for that! We will look into it.

CMobley7 commented 5 years ago

I'm experiencing this issue as well when I run

from thundersvm import SVC
import pickle
svm = SVC(probability=True, gamma="auto")
svm.fit(x_train, y_train)
pickled_svm = pickle.dumps(svm)

I get

_pickle.PicklingError: Can't pickle <class 'thundersvm.thundersvm.c_int_Array_1'>: attribute lookup c_int_Array_1 on thundersvm.thundersvm failed

QinbinLi commented 5 years ago

Hi,

ThunderSVM doesn't currently support pickle because of the ctypes variables used in the code. We'll try to improve this in the future. Thanks.
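
For readers wondering why ctypes gets in the way, here is a minimal sketch that uses only the standard library, not ThunderSVM itself. Array types such as `c_float * 3` are created at runtime, so pickle cannot find them by name in their module, which matches the "attribute lookup ... failed" errors reported above. One common way around this is to convert such arrays to plain Python lists before pickling and rebuild them afterwards. The helper names below are hypothetical, and this is not necessarily how ThunderSVM handles it internally.

```python
import ctypes
import pickle


def ctypes_array_to_list(arr):
    """Copy a ctypes array into a plain, picklable Python list."""
    return list(arr)


def list_to_ctypes_array(values, ctype=ctypes.c_float):
    """Rebuild a ctypes array of the given element type from a list."""
    return (ctype * len(values))(*values)


arr = list_to_ctypes_array([1.0, 2.0, 3.0])

# Pickling the ctypes array directly is expected to fail, because its
# dynamically created type cannot be looked up by name.
try:
    pickle.dumps(arr)
except Exception as exc:
    print("direct pickling failed:", type(exc).__name__)

# Pickling the plain-list copy works, and the array can be rebuilt later.
payload = pickle.dumps(ctypes_array_to_list(arr))
restored = list_to_ctypes_array(pickle.loads(payload))
```

A class holding such arrays could apply the same conversion inside `__getstate__` and `__setstate__`, so that callers can keep using pickle.dump() unchanged.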

beevabeeva commented 5 years ago

I also have this problem when trying to use thundersvm in an automated machine learning program. However, the maintainers of that program seem to be pushing to remove pickle from the framework in favour of JSON.

ghasemikasra39 commented 4 years ago

same issue

guillaumedsde commented 4 years ago

Kindly bumping this issue :)

@beevabeeva who's pickling to JSON?

civilinformer commented 4 years ago

The simplest way to implement this is to save the thundersvm parts of the ensemble separately using their own save method, then replace the thundersvm parts of the ensemble object with dummy objects and save it normally. To load, invert the process: load the individual models and the ensemble, then place the SVMs back into their respective places in the ensemble. I got that to work.
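
The workaround described above can be sketched roughly as follows. To keep the example runnable without thundersvm installed, it uses a stand-in class, `UnpicklableModel`, with `save_to_file` and `load_from_file` methods modelled on the `load_from_file(path=...)` call shown earlier in this thread. The stand-in, the placeholder dict, and the helpers `save_ensemble` and `load_ensemble` are all assumptions for illustration, and the ensemble is simplified to a plain list of models.

```python
import os
import pickle
import tempfile

PLACEHOLDER_KEY = "__external_model_path__"  # hypothetical marker key


class UnpicklableModel:
    """Stand-in for a model that pickle rejects but that can save itself."""

    def __init__(self, weights=None):
        self.weights = weights

    def save_to_file(self, path):
        with open(path, "w") as f:
            f.write(",".join(str(w) for w in self.weights))

    def load_from_file(self, path):
        with open(path) as f:
            self.weights = [float(w) for w in f.read().split(",")]

    def __reduce__(self):
        # Simulate a model that cannot be pickled.
        raise pickle.PicklingError("stand-in model cannot be pickled")


def save_ensemble(models, directory):
    """Save unpicklable models separately and pickle the rest with dummies."""
    stripped = []
    for i, model in enumerate(models):
        if isinstance(model, UnpicklableModel):
            path = os.path.join(directory, f"model_{i}.txt")
            model.save_to_file(path)
            stripped.append({PLACEHOLDER_KEY: path})  # dummy object
        else:
            stripped.append(model)
    with open(os.path.join(directory, "ensemble.pkl"), "wb") as f:
        pickle.dump(stripped, f)


def load_ensemble(directory):
    """Invert save_ensemble: reload each model and put it back in place."""
    with open(os.path.join(directory, "ensemble.pkl"), "rb") as f:
        models = pickle.load(f)
    restored = []
    for model in models:
        if isinstance(model, dict) and PLACEHOLDER_KEY in model:
            svm = UnpicklableModel()
            svm.load_from_file(model[PLACEHOLDER_KEY])
            restored.append(svm)
        else:
            restored.append(model)
    return restored


workdir = tempfile.mkdtemp()
ensemble = [UnpicklableModel([0.5, 1.5]), {"name": "other model"}]
save_ensemble(ensemble, workdir)
reloaded = load_ensemble(workdir)
```

With a real EnsembleVoteClassifier, the same idea would apply to whichever attribute holds the fitted classifiers, and the original ensemble object should be copied or restored after saving so it stays usable.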

zeyiwen commented 4 years ago

This issue has been fixed by a contribution (#207) from @RichardWarfield. I would like to close this issue now. Please try the latest version of ThunderSVM and let us know if there are any new issues, @civilinformer.

civilinformer commented 4 years ago

Is this available through pip?

zeyiwen commented 4 years ago

Yes. We have uploaded the wheel to PyPI. Please try it and let us know if there are any problems.