iamDecode / sklearn-pmml-model

A library to parse and convert PMML models into Scikit-learn estimators.
BSD 2-Clause "Simplified" License

Unable to load PMML created in Matlab #21

Closed gabrielaozegovic closed 3 years ago

gabrielaozegovic commented 3 years ago

Hi,

I need to load a RandomForestClassifier model, created in Matlab and exported to PMML, into my Python code. This library is the best option I have found so far, but loading the model fails with this error:

ValueError                                Traceback (most recent call last)
<ipython-input-39-380815af9ac8> in <module>
----> 1 clf = PMMLForestClassifier(pmml="COVID_d02_sDevelop_m01_x01.pmml")

~\Anaconda3\envs\piton\lib\site-packages\sklearn_pmml_model\ensemble\forest.py in __init__(self, pmml, n_jobs)
     70     self.template_estimator = clf
     71 
---> 72     self.estimators_ = [self.get_tree(s) for s in valid_segments]
     73 
     74     # Required after constructing trees, because categories may be inferred in

~\Anaconda3\envs\piton\lib\site-packages\sklearn_pmml_model\ensemble\forest.py in <listcomp>(.0)
     70     self.template_estimator = clf
     71 
---> 72     self.estimators_ = [self.get_tree(s) for s in valid_segments]
     73 
     74     # Required after constructing trees, because categories may be inferred in

~\Anaconda3\envs\piton\lib\site-packages\sklearn_pmml_model\ensemble\forest.py in get_tree(self, segment)
    127       'values': value_ndarray
    128     }
--> 129     tree.tree_.__setstate__(state)
    130 
    131     return tree

~\Anaconda3\envs\piton\lib\site-packages\sklearn_pmml_model\tree\_tree.pyx in sklearn_pmml_model.tree._tree.Tree.__setstate__()

ValueError: Did not recognise loaded array layout

The PMML exported from Matlab is version 4.3. Could that be the issue? Are there any requirements regarding supported PMML versions?

Thank you in advance!

iamDecode commented 3 years ago

Thanks for reporting @gabrielaozegovic. I have not explicitly tested with PMML models generated from Matlab before, so I cannot guarantee it will work. It would help if you could share the PMML model so that I can make sure it works.

However, the error you shared does not seem to be related to the PMML itself, but rather to an incompatible version of sklearn, or perhaps to the processor architecture. I was not able to reproduce the problem on any of my machines. If you happen to have an outdated version of numpy and/or sklearn, try updating those and see if that solves your issue. If not, could you share the output of pip freeze and your OS (including whether it is 32 or 64 bit)?
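For anyone hitting a similar error, the details requested above can be gathered in one go. The following is only a rough sketch using the Python standard library; the function name `environment_report` and the default package list are made up for illustration, and it does not replace the full `pip freeze` output.

```python
import platform
import struct
import sys
from importlib import metadata


def environment_report(packages=("numpy", "scikit-learn", "sklearn-pmml-model")):
    """Collect interpreter, OS, and package versions for a bug report.

    Hypothetical helper, not part of sklearn-pmml-model.
    """
    report = {
        "python": sys.version.split()[0],
        "os": platform.platform(),
        "machine": platform.machine(),
        # Pointer size of the running interpreter in bits (32 or 64).
        "bits": struct.calcsize("P") * 8,
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Note that `bits` describes the Python interpreter, which can be 32-bit even on a 64-bit OS.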

gabrielaozegovic commented 3 years ago

Hi,

first, big thanks for the quick response.

I am attaching the PMML model here, along with the output of "pip freeze" in a txt file. I am using Windows 10, 64-bit.

pip-freeze.txt

Thank you!

iamDecode commented 3 years ago

Thanks for the details! I managed to track down the problem and have released version 0.0.15 that should fix your problem.

In a nutshell, the model you shared predicts one of 3 target classes (0, 1 and 2). However, one of the trees in the ensemble seems to predict only 2 of them (0 and 1). This is rather unexpected, but the new version handles this case.
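To illustrate the kind of mismatch described above: if every tree's per-node class counts must have one column per class of the whole ensemble, a tree that only knows classes 0 and 1 ends up one column short. The sketch below pads such a tree's counts with zeros for the missing class. It is not the code from version 0.0.15; the function `align_class_counts` and the sample numbers are invented for illustration, and plain lists are used instead of numpy arrays to keep it self-contained.

```python
def align_class_counts(tree_classes, tree_counts, all_classes):
    """Expand one tree's per-node class counts to the ensemble's class list.

    tree_classes: classes this tree knows about, e.g. [0, 1]
    tree_counts:  one row per node, one column per entry in tree_classes
    all_classes:  classes of the whole ensemble, e.g. [0, 1, 2]

    Hypothetical helper, not part of sklearn-pmml-model.
    """
    column_of = {cls: i for i, cls in enumerate(all_classes)}
    aligned = []
    for row in tree_counts:
        # Classes the tree never saw keep a count of zero.
        full_row = [0.0] * len(all_classes)
        for cls, count in zip(tree_classes, row):
            full_row[column_of[cls]] = float(count)
        aligned.append(full_row)
    return aligned


# A three-node tree that only predicts classes 0 and 1 (made-up counts).
counts = align_class_counts([0, 1], [[5, 3], [5, 0], [0, 3]], [0, 1, 2])
print(counts)  # [[5.0, 3.0, 0.0], [5.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
```

After padding, every tree exposes the same number of class columns, so predictions from different trees can be combined column by column.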

Let me know if the problem is fixed and you manage to get it working! If it is fixed but you run into different errors, please close this GitHub issue and start a new one :).

iamDecode commented 3 years ago

Closing due to inactivity. Assuming this issue is fixed.