autodeployai / pypmml

Python PMML scoring library
Apache License 2.0

When I load a created PMML file back into Python, it takes a lot of time to predict #53

Closed AhmetMericOzcan closed 1 year ago

AhmetMericOzcan commented 1 year ago

As you can see below, I wrote some simple code to create a PMML file and load it back into Python. The prediction results are the same after I load the file and predict with PyPMML. However, prediction takes a lot of time. Could you please tell me the possible reason for this?

from sklearn2pmml import make_pmml_pipeline, sklearn2pmml
from pypmml import Model

# Wrap the fitted scikit-learn estimator (myModel) and export it to PMML.
myModelPmml = make_pmml_pipeline(myModel)
sklearn2pmml(myModelPmml, "myModel.pmml")

# Load the PMML file back with PyPMML and score the same data.
convertedModel = Model.fromFile('myModel.pmml')
predictionResults = convertedModel.predict(someData)

scorebot commented 1 year ago

@AhmetMericOzcan This is a known issue. It is mainly caused by the Py4J backend, which talks to the JVM over TCP/IP sockets. For the next major release, we're considering support for other backends such as PyJNIus and JPype, which access the JVM via JNI.
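
To illustrate why a socket-based bridge can be slow, here is a minimal sketch, assuming each call to the backend pays a fixed round-trip cost. `CountingBackend` is a hypothetical stand-in written for this example, not part of PyPMML or Py4J; it only counts how many times the boundary is crossed.

```python
class CountingBackend:
    """Hypothetical stand-in for a scorer reached over a socket."""

    def __init__(self):
        self.round_trips = 0

    def predict(self, batch):
        # Every call counts as one crossing of the Python/JVM boundary.
        self.round_trips += 1
        # Dummy "model": the score of a row is the sum of its features.
        return [sum(row) for row in batch]


rows = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# One call per row: the fixed per-call cost is paid len(rows) times.
per_row = CountingBackend()
per_row_results = [per_row.predict([row])[0] for row in rows]

# One call for the whole batch: the fixed cost is paid once.
batched = CountingBackend()
batched_results = batched.predict(rows)

print(per_row.round_trips, batched.round_trips)
```

Both variants return the same scores, but the per-row variant crosses the boundary once per record, so any fixed per-call cost grows with the number of rows.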

scorebot commented 1 year ago

@AhmetMericOzcan Besides the overhead of the JVM backend, we also found other bottlenecks when predicting data in a DataFrame or NumPy array.

Could you please reinstall the latest version from GitHub with the following command?

pip install --upgrade git+https://github.com/autodeployai/pypmml.git

Then please test again to see whether the performance is acceptable.
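
If it helps with the retest, below is a rough sketch of a timing helper, assuming only the Python standard library. `median_seconds` is a hypothetical name introduced for this example; it accepts any callable, so the same helper can time both the original model and the reloaded one.

```python
import statistics
import time


def median_seconds(predict_fn, data, repeats=5):
    """Return the median wall-clock seconds of predict_fn(data) over `repeats` runs."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict_fn(data)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


# Stand-in predict function so the sketch runs without a model or a JVM.
def dummy_predict(batch):
    return [value * 2 for value in batch]


elapsed = median_seconds(dummy_predict, [1, 2, 3], repeats=3)
print(f"median: {elapsed:.6f} s")

# With the objects from the original post, the comparison would look like:
#   median_seconds(myModel.predict, someData)
#   median_seconds(convertedModel.predict, someData)
```

The median is used rather than the mean so that a single slow run, for example the first call while the JVM warms up, does not dominate the result.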

AhmetMericOzcan commented 1 year ago

@scorebot Sorry for the late reply. Confirming the output was enough for me, so I couldn't test it after the update. One thing I can note is that, even though the predictions are the same, the output formats were different.

As I said, the current results are enough for me, since I am not using that conversion in a performance-dependent system. However, thank you very much for your update. You can close the issue if you like.

scorebot commented 1 year ago

That's fine, I'll close this issue now. Please feel free to open new ones for any other issues.