Closed moezali1 closed 1 year ago
Hi @moezali1, thanks for reporting the issue. The problem is that patching must be applied before sklearn is imported. Patching replaces some sklearn methods with their optimized versions from the Intel Extension for Scikit-learn. In your case, you import pycaret first, which already imports sklearn. However, I found that even if you call the patch before all other code, the code you attached still does not work. This is not expected behavior. It is caused by a bug on our side related to train_test_split. I have already created an issue and have a fix, and I will open a PR with it in the near future. After the fix, everything seems to work as expected:
We also think that patching may not be the easiest option for integration; perhaps the best option would be to import the required methods directly from sklearnex. The AutoGluon team integrated sklearnex in a similar way.
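To illustrate why patching is order-sensitive while a direct import is not, here is a minimal sketch that uses a stand-in module instead of sklearn. The names `fake_lib`, `SlowKNN`, `FastKNN`, and `patch` are hypothetical and exist only for this illustration; they are not part of sklearn or sklearnex.

```python
import types

# Stand-in for a library module such as sklearn.neighbors (hypothetical).
lib = types.ModuleType("fake_lib")


class SlowKNN:
    backend = "stock"


class FastKNN:
    backend = "accelerated"


lib.KNN = SlowKNN


def patch(module):
    """Mimic a monkeypatch: rebind the module attribute to the fast class."""
    module.KNN = FastKNN


# A name bound BEFORE patching keeps pointing at the original class.
# This is what happens when a framework runs `from lib import KNN` at import time.
early_ref = lib.KNN

patch(lib)

# A lookup AFTER patching sees the replacement.
late_ref = lib.KNN

print(early_ref.backend)  # stock
print(late_ref.backend)   # accelerated
```

Importing the accelerated class directly sidesteps the ordering question, because the caller names the class it wants rather than relying on a module attribute having been rebound in time.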
@moezali1 As another option, you can apply patching to only some algorithms. For example, this will work for your code:
from sklearnex import patch_sklearn
patch_sklearn("knn_classifier")
from pycaret.datasets import get_data
data = get_data('poker')
from pycaret.classification import *
s = setup(data, target = 'CLASS', session_id = 126)
To get the map of algorithms:
import daal4py as d4p
d4p.sklearn.monkeypatch.dispatcher._get_map_of_algorithms().keys()
>>
dict_keys(['pca', 'kmeans', 'dbscan', 'distances', 'linear', 'ridge', 'elasticnet', 'lasso', 'svm', 'logistic', 'log_reg', 'knn_classifier', 'nearest_neighbors', 'knn_regressor', 'random_forest_classifier', 'random_forest_regressor', 'train_test_split', 'fin_check', 'roc_auc_score', 'tsne', 'svc', 'logisticregression', 'kneighborsclassifier', 'nearestneighbors', 'kneighborsregressor', 'randomrorestclassifier', 'randomforestregressor'])
I can also help you create a PR to integrate sklearnex into pycaret.
@PivovarA Thanks for your detailed response. I look forward to receiving your PR on pycaret. Excited for this integration.
@PivovarA We are releasing 3.0-rc in 2 weeks. Do you think it's possible to send a PR our way on the develop branch before May 15th?
Thanks.
Hi,
We are planning to integrate the scikit-learn-intelex project with PyCaret.
The issue is as follows:
This took 10 minutes despite the patch_sklearn command. However, when I explicitly import the model, acceleration gives great results:
Expected Action:
What can we do to make the first attempt benefit from acceleration, so that users won't have to import the estimator explicitly? I am thinking we can add a parameter to the setup function called use_intel_acceleration. When the user sets it to True, we should run the patch_sklearn command in our code base, so that users won't have to do anything explicitly outside of the PyCaret code base.
The files that create model containers in the PyCaret repo are located here.
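A rough sketch of what the proposed flag could look like follows. The helper name `enable_intel_acceleration` is hypothetical and not existing PyCaret code; only `patch_sklearn` comes from sklearnex. The sketch assumes the helper would run at the very start of setup, before any sklearn estimator is imported, since patching has to happen first.

```python
import importlib
import warnings


def enable_intel_acceleration(use_intel_acceleration: bool = False) -> bool:
    """Hypothetical helper that setup() could call before importing estimators.

    Returns True only if patch_sklearn() was actually applied.
    """
    if not use_intel_acceleration:
        return False
    try:
        # Import lazily so sklearnex stays an optional dependency.
        sklearnex = importlib.import_module("sklearnex")
    except ImportError:
        warnings.warn(
            "scikit-learn-intelex is not installed; continuing without acceleration."
        )
        return False
    # Replace supported sklearn estimators with their accelerated versions.
    sklearnex.patch_sklearn()
    return True


# The default leaves sklearn untouched.
patched = enable_intel_acceleration(use_intel_acceleration=False)
print(patched)  # False
```

Estimator classes that were already imported by name before this call would still point at the stock implementations, so a flag like this would probably need to take effect before the model containers are built.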