Closed yyyooohao closed 2 years ago
Thanks for your interest in sklearn-pmml-model! In order for me to help you find the problem, it would be great if you can stick to the issue template. Without an extract of the PMML model you are trying to convert, it is difficult for me to help you.
That being said, based on the error I think the library you used to export the PMML model has converted the original RegressionModel to an equivalent GeneralRegressionModel. If this is the case, you should be able to generate predictions using PMMLRidgeClassifier for classification, or PMMLRidge for regression.
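To check which of the two model elements an exported file actually contains, a small standard-library sketch along these lines may help. The mapping from element name to loader class only restates the advice in this thread, and the embedded PMML snippet is a hypothetical, heavily truncated document used purely for illustration.

```python
import xml.etree.ElementTree as ET

# Mapping taken from the advice in this thread; other model types are not covered.
SUGGESTED_LOADERS = {
    "RegressionModel": "PMMLLogisticRegression (classification)",
    "GeneralRegressionModel": "PMMLRidgeClassifier (classification) or PMMLRidge (regression)",
}


def local_name(tag):
    """Strip the XML namespace, e.g. '{http://www.dmg.org/PMML-4_4}PMML' -> 'PMML'."""
    return tag.rsplit("}", 1)[-1]


def find_model_elements(pmml_text):
    """Return the known model element names found anywhere in the PMML document."""
    root = ET.fromstring(pmml_text)
    found = []
    for element in root.iter():
        name = local_name(element.tag)
        if name in SUGGESTED_LOADERS and name not in found:
            found.append(name)
    return found


# Hypothetical, truncated PMML document (illustration only, not a valid model).
EXAMPLE_PMML = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <GeneralRegressionModel functionName="classification"/>
</PMML>"""

for name in find_model_elements(EXAMPLE_PMML):
    print(f"{name}: try {SUGGESTED_LOADERS[name]}")
```

For a real export, the file contents can be read with open(path).read() and passed to find_model_elements in the same way.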
I used an SVM for prediction before, and now I want to classify with logistic regression (from the linear models) and test the accuracy of the results. I mainly want to try logistic regression for classification.
Well, I can export the SVM to PMML and make predictions with it, but the logical classification prediction reports an error.
I suppose you are using PMMLLogisticRegression to make 'logical classification' predictions? In my previous comment, I recommended using PMMLRidgeClassifier instead. To do that, just replace "PMMLLogisticRegression" with "PMMLRidgeClassifier". I think that should work for you.
Should I change my training to RidgeClassifier, or is there a problem with my data processing? The SVM works well in testing. The error is: Exception: PMML model does not contain GeneralRegressionModel.
Why is it easier for me to predict with the SVM, but harder with logistic regression? Is there any other model that can do better classification?
If only the classifier parameters can be used for prediction in PMMLPipeline, the accuracy of the result is not high; the logistic regression parameters need to be adjusted to reach a certain precision value.
I am not entirely sure what your problem is. It would be helpful if you can provide a copy of the PMML file that you are having problems with.
In your screenshot you show the method PMMLPipeline. Do note this method is not part of this library, but from sklearn2pmml instead. That library converts sklearn models into PMML, as opposed to sklearn-pmml-model creating a sklearn model from a PMML.
For me, PMMLLogisticRegression works just fine. Check out this simple example on how to use it along with sklearn2pmml:
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import pandas as pd
import numpy as np
from sklearn_pmml_model.linear_model import PMMLLogisticRegression
from sklearn2pmml.pipeline import PMMLPipeline
from sklearn2pmml import sklearn2pmml
# Prepare data
iris = load_iris()
X = pd.DataFrame(iris.data)
X.columns = np.array(iris.feature_names)
y = pd.Series(np.array(iris.target_names)[iris.target])
y.name = "Class"
# train logistic regression
clf = LogisticRegression()
pipeline = PMMLPipeline([
    ("classifier", clf)
])
pipeline.fit(X, y)
# convert to PMML
sklearn2pmml(pipeline, "test.pmml", with_repr = True)
# Load from PMML and predict
clf = PMMLLogisticRegression(pmml="test.pmml")
clf.predict(X)
clf.score(X, y)
![image](https://user-images.githubusercontent.com/65326115/131059977-241fc793-70d7-4b5a-bb90-0a254779fd46.png)
Logistic regression can be used, but it is not very accurate: only 40% accuracy. Are there other models that can do classification?
The parameters you show don't make a lot of sense to me. max_iter = 2 is way too low to yield any decent classification. I suggest you start with LogisticRegression(), so without any arguments. See if that works (it should), and then gradually add arguments to see if it improves performance. Often enough, the default parameters prove to be sufficient.
If you like to try another model, I suggest trying RandomForestClassifier.
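A minimal sketch of that comparison, assuming scikit-learn is installed and using the iris data from the earlier example rather than the data in the screenshot. It fits the model once with default arguments and once with max_iter=2, then prints the accuracy on the training data for both. The helper name training_accuracy is made up for this illustration, and the numbers will differ on other datasets.

```python
import warnings

from sklearn.datasets import load_iris
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)


def training_accuracy(**params):
    """Fit LogisticRegression with the given parameters and score it on the training data."""
    with warnings.catch_warnings():
        # A very small max_iter stops the solver early and triggers a ConvergenceWarning.
        warnings.simplefilter("ignore", category=ConvergenceWarning)
        clf = LogisticRegression(**params).fit(X, y)
    return clf.score(X, y)


baseline = training_accuracy()             # all defaults
truncated = training_accuracy(max_iter=2)  # the setting questioned above

print(f"defaults:   {baseline:.3f}")
print(f"max_iter=2: {truncated:.3f}")
```

Adding one argument at a time to the training_accuracy call makes it easy to see which change helps and which hurts.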
The test accuracy with the default parameters is not high; it only reaches half of what the SVM achieves, so the parameters need to be adjusted. I do not need an overly complex network model.
I tried the random forest and got: ModuleNotFoundError: No module named 'sklearn_pmml_model.tree._tree'. I use three categories.
Please make sure you installed the library using pip install sklearn-pmml-model. This error seems to indicate the Cython code is not compiled, which is only the case if you downloaded this library and are working in that directory directly.
If you, for some reason, cannot use pip, running the following command will compile the Cython code in place, and should fix the issue you have:
python setup.py build_ext --inplace
I don't recommend this, and it will require a C compiler, which is a bit of a pain to set up on Windows. More information about this process can be found at https://sklearn-pmml-model.readthedocs.io/en/latest/install.html#from-source.
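To see whether Python can find the compiled module named in the error, and where the package would be imported from, a standard-library check along these lines may be useful. This is a sketch only; the two helper names are made up for this illustration, and the module name is the one from the error message above.

```python
import importlib.util


def compiled_module_available(module_name):
    """Return True if the import system can locate `module_name`."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ImportError:
        # Raised when a parent package is missing or fails to import.
        return False


def package_location(package_name):
    """Return the file the package would be imported from, or None if it is not found."""
    try:
        spec = importlib.util.find_spec(package_name)
    except ImportError:
        return None
    return spec.origin if spec is not None else None


print(compiled_module_available("sklearn_pmml_model.tree._tree"))
print(package_location("sklearn_pmml_model"))
```

If the printed location points into the current working directory rather than the site-packages folder of the Python installation, a local copy of the repository is probably shadowing the version installed with pip.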
I installed the packages according to requirements.txt.
If I use logistic regression for a three-class model, can it not make predictions?
which is only the case if you downloaded this library and are working in that directory directly.
I can use pip. How can I simply use the random forest? I don't want to install a C compiler.
Why do I get errors like the following when I use logistic regression for binary classification? In the past two days, three-class classification has reported errors as well: ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2 is different from 1024)
If you use pip to install the library, no C compiler is necessary. More information on how to install using pip can be found in the documentation: https://sklearn-pmml-model.readthedocs.io/en/latest/install.html#pip.
pip is the standard package manager for Python, and is included with every Python install. The documentation includes a link to more general information about pip here: https://packaging.python.org/tutorials/installing-packages/#use-pip-for-installing.
I installed the packages from the requirements with pip. Why do I get errors with those models?
Why do I get errors with those models
You have to let me know which errors you are seeing, otherwise I cannot help you.
I suspect you installed the packages with pip but are still working within a clone of this package. If you are working in a copy of this repository, please remove it, start fresh, do a pip install, and try out the example I provided here: https://github.com/iamDecode/sklearn-pmml-model/issues/35#issuecomment-906271001. If this works, you can proceed to try different models and datasets.
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2 is different from 1024)
This error occurred when I used logistic regression or ridge regression. Binary classification with logistic regression worked before; can three-class classification be used as well? I mainly use it to test binary and three-class classification. If it is three-class classification, do I need to make any modifications?
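For reading this message: NumPy raises it when the two arrays passed to matmul disagree on their shared dimension k, here 1024 on one side and 2 on the other. The sketch below reproduces the same kind of mismatch with made-up shapes. Which arrays the library multiplies internally is not shown in this thread, so the names X and coef are assumptions for illustration only.

```python
import numpy as np

X = np.zeros((5, 1024))  # operand 0: 5 samples with 1024 features each
coef = np.zeros((2, 3))  # operand 1: hypothetical coefficients expecting only 2 features

try:
    np.matmul(X, coef)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# When the feature counts agree, the product has one row per sample.
scores = np.matmul(np.zeros((5, 2)), coef)
print(scores.shape)
```

In practice this usually points to a difference between the number of input columns the model was trained or exported with and the number of columns passed to predict.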
OK, so I should use the installed package version rather than using the source directly in my project.
I used logistic regression to classify into three categories and got Exception: PMML model does not contain RegressionModel. I reinstalled the package; binary classification can be predicted, but ridge regression has the same problem.
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject. Do I need to do some configuration when I use GBDT classification?
I used logistic regression to classify into three categories and got Exception: PMML model does not contain RegressionModel. I reinstalled the package; binary classification can be predicted, but ridge regression has the same problem.
Ok I think I understand now. You seem to be using the multi_class='ovr' parameter on your LogisticRegression class (from https://github.com/iamDecode/sklearn-pmml-model/issues/35#issuecomment-906867774). This means one-versus-rest regression. This type is not explicitly supported by the library yet, but I am working on adding it right now.
To get it working in the meantime, you can use the default parameter multi_class='auto' or specifically select multi_class='multinomial' instead. This type of regression should work!
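As background on the two settings, the sketch below shows in plain Python how class probabilities are usually formed from per-class linear scores. One-versus-rest applies an independent sigmoid per class and then normalises, while multinomial applies a single softmax over all classes. The scores are hypothetical, the coefficients learned under the two settings would also differ in practice, and this is not how the library evaluates the PMML.

```python
import math


def ovr_probabilities(scores):
    """One-versus-rest: an independent sigmoid per class, then normalise to sum to 1."""
    sigmoids = [1.0 / (1.0 + math.exp(-s)) for s in scores]
    total = sum(sigmoids)
    return [p / total for p in sigmoids]


def multinomial_probabilities(scores):
    """Multinomial: a single softmax over all class scores."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical linear scores (w_k . x + b_k) for one sample and three classes.
scores = [2.0, 0.5, -1.0]
print(ovr_probabilities(scores))
print(multinomial_probabilities(scores))
```

Both functions pick the same most likely class for these scores, but the probabilities differ, which is one reason an exporter may encode the two variants differently.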
Well, I had an error with three-class logistic regression: Exception: PMML model does not contain RegressionModel.
I see, three categories are running now, ha ha.
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
This error typically means you have to re-install numpy (pip install numpy --upgrade).
I see, three categories are running now, ha ha.
Glad you got it working! I have just released a new version that should also work with multi_class='ovr'. If your initial problem is resolved, can I close this issue?
I use logistic regression from the linear models, and the documentation says that logistic regression is supported. Why do I get "PMML model does not contain RegressionModel" when I predict?