jpmml / jpmml-transpiler

Java Transpiler (Translator + Compiler) API for PMML
GNU Affero General Public License v3.0

Number-only field names lead to invalid Java variable names #3

Closed. ahmed-shariff closed this issue 2 years ago

ahmed-shariff commented 4 years ago

I have a PMML model generated from scikit-learn, and I am trying to convert it to a JAR using jpmml-transpiler. I ran this command:

java -jar target/jpmml-transpiler-executable-1.0-SNAPSHOT.jar --xml-input exports/classifier_dt_accel_1.pmml --jar-output exports/classifier_dt_accel_1.jar

and I get this error:

/PMML$708533063.java:136: error: <identifier> expected
            double2fp = arguments. 2fp();
                                  ^
/PMML$708533063.java:137: error: ')' expected
            if (2fp!= 2fp) {
                  ^
/PMML$708533063.java:137: error: not a statement
            if (2fp!= 2fp) {
                   ^
/PMML$708533063.java:137: error: ';' expected
            if (2fp!= 2fp) {
                        ^
/PMML$708533063.java:137: error: ';' expected
            if (2fp!= 2fp) {
                         ^
/PMML$708533063.java:140: error: ')' expected
            if (2fp<= -4.077794313430786D) {
                  ^
/PMML$708533063.java:140: error: not a statement
            if (2fp<= -4.077794313430786D) {
                   ^
/PMML$708533063.java:140: error: ';' expected
            if (2fp<= -4.077794313430786D) {
                                         ^
/PMML$708533063.java:141: error: ')' expected
                if (2fp<= -6.824077844619751D) {
                      ^
/PMML$708533063.java:141: error: not a statement
                if (2fp<= -6.824077844619751D) {
                       ^
/PMML$708533063.java:141: error: ';' expected
                if (2fp<= -6.824077844619751D) {
                                             ^
/PMML$708533063.java:142: error: ')' expected
                    if (2fp<= 7.007941961288452D) {
                          ^
/PMML$708533063.java:142: error: not a statement
                    if (2fp<= 7.007941961288452D) {
                           ^
/PMML$708533063.java:142: error: ';' expected
                    if (2fp<= 7.007941961288452D) {
                                                ^
/PMML$708533063.java:143: error: ')' expected
                        if (2fp<= -6.735283613204956D) {
                              ^
/PMML$708533063.java:143: error: not a statement
                        if (2fp<= -6.735283613204956D) {
                               ^
/PMML$708533063.java:143: error: ';' expected
                        if (2fp<= -6.735283613204956D) {
                                                     ^
/PMML$708533063.java:144: error: ')' expected
                            if (2fp<= -9.8182954788208D) {
                                  ^
/PMML$708533063.java:144: error: not a statement
                            if (2fp<= -9.8182954788208D) {
                                   ^
/PMML$708533063.java:144: error: ';' expected
                            if (2fp<= -9.8182954788208D) {
                                                       ^
/PMML$708533063.java:145: error: ')' expected
                                if (2fp<= -12.388869285583496D) {
                                      ^
/PMML$708533063.java:145: error: not a statement
                                if (2fp<= -12.388869285583496D) {
                                       ^
/PMML$708533063.java:145: error: ';' expected
                                if (2fp<= -12.388869285583496D) {
                                                              ^
/PMML$708533063.java:146: error: ')' expected
                                    if (2fp<= -12.410374641418457D) {
                                          ^
/PMML$708533063.java:146: error: not a statement
                                    if (2fp<= -12.410374641418457D) {
                                           ^
/PMML$708533063.java:146: error: ';' expected
                                    if (2fp<= -12.410374641418457D) {
                                                                  ^
/PMML$708533063.java:147: error: ')' expected
                                        if (2fp<= -8.991085529327393D) {
                                              ^
/PMML$708533063.java:147: error: not a statement
                                        if (2fp<= -8.991085529327393D) {
                                               ^
/PMML$708533063.java:147: error: ';' expected
                                        if (2fp<= -8.991085529327393D) {
                                                                     ^
/PMML$708533063.java:150: error: ')' expected
                                        if (2fp<= -8.268068313598633D) {
                                              ^
/PMML$708533063.java:150: error: not a statement
                                        if (2fp<= -8.268068313598633D) {
                                               ^
/PMML$708533063.java:150: error: ';' expected
                                        if (2fp<= -8.268068313598633D) {
                                                                     ^
/PMML$708533063.java:151: error: ')' expected
                                            if (2fp<= -8.82589340209961D) {
                                                  ^
/PMML$708533063.java:151: error: not a statement
                                            if (2fp<= -8.82589340209961D) {
                                                   ^
/PMML$708533063.java:151: error: ';' expected
                                            if (2fp<= -8.82589340209961D) {
                                                                        ^
/PMML$708533063.java:154: error: ')' expected
                                            if (2fp<= -8.601503849029541D) {
                                                  ^
/PMML$708533063.java:154: error: not a statement
                                            if (2fp<= -8.601503849029541D) {
                                                   ^
/PMML$708533063.java:154: error: ';' expected
                                            if (2fp<= -8.601503849029541D) {
                                                                         ^
/PMML$708533063.java:155: error: ')' expected
                                                if (2fp<= -7.086699962615967D) {
                                                      ^
/PMML$708533063.java:155: error: not a statement
                                                if (2fp<= -7.086699962615967D) {
                                                       ^
/PMML$708533063.java:155: error: ';' expected
                                                if (2fp<= -7.086699962615967D) {
                                                                             ^
/PMML$708533063.java:162: error: ')' expected
                                        if (2fp<= -6.187596082687378D) {
                                              ^
/PMML$708533063.java:162: error: not a statement
                                        if (2fp<= -6.187596082687378D) {
                                               ^
/PMML$708533063.java:162: error: ';' expected
                                        if (2fp<= -6.187596082687378D) {
                                                                     ^
/PMML$708533063.java:163: error: ')' expected
                                            if (2fp<= -7.056653738021851D) {
                                                  ^
/PMML$708533063.java:163: error: not a statement
                                            if (2fp<= -7.056653738021851D) {
                                                   ^
/PMML$708533063.java:163: error: ';' expected
                                            if (2fp<= -7.056653738021851D) {
                                                                         ^
/PMML$708533063.java:166: error: ')' expected
                                            if (2fp<= -13.465054988861084D) {
                                                  ^
/PMML$708533063.java:166: error: not a statement
                                            if (2fp<= -13.465054988861084D) {
                                                   ^
/PMML$708533063.java:166: error: ';' expected
                                            if (2fp<= -13.465054988861084D) {
                                                                          ^
/PMML$708533063.java:167: error: ')' expected
                                                if (2fp<= -14.216802597045898D) {
                                                      ^
/PMML$708533063.java:167: error: not a statement
                                                if (2fp<= -14.216802597045898D) {
                                                       ^
/PMML$708533063.java:167: error: ';' expected
                                                if (2fp<= -14.216802597045898D) {
                                                                              ^
/PMML$708533063.java:170: error: ')' expected
                                                if (2fp<= -13.95345163345337D) {
                                                      ^
/PMML$708533063.java:170: error: not a statement
                                                if (2fp<= -13.95345163345337D) {
                                                       ^
/PMML$708533063.java:170: error: ';' expected
                                                if (2fp<= -13.95345163345337D) {
                                                                             ^
/PMML$708533063.java:177: error: ')' expected
                                        if (2fp<= -12.848179817199707D) {
                                              ^
/PMML$708533063.java:177: error: not a statement
                                        if (2fp<= -12.848179817199707D) {
                                               ^
/PMML$708533063.java:177: error: ';' expected
                                        if (2fp<= -12.848179817199707D) {
                                                                      ^
/PMML$708533063.java:178: error: ')' expected
                                            if (2fp<= -13.673964977264404D) {
                                                  ^
/PMML$708533063.java:178: error: not a statement
                                            if (2fp<= -13.673964977264404D) {
                                                   ^
/PMML$708533063.java:178: error: ';' expected
                                            if (2fp<= -13.673964977264404D) {
                                                                          ^
/PMML$708533063.java:179: error: ')' expected
                                                if (2fp<= -13.304650783538818D) {
                                                      ^
/PMML$708533063.java:179: error: not a statement
                                                if (2fp<= -13.304650783538818D) {
                                                       ^
/PMML$708533063.java:179: error: ';' expected
                                                if (2fp<= -13.304650783538818D) {
                                                                              ^
/PMML$708533063.java:182: error: ')' expected
                                                if (2fp<= -13.144246101379395D) {
                                                      ^
/PMML$708533063.java:182: error: not a statement
                                                if (2fp<= -13.144246101379395D) {
                                                       ^
/PMML$708533063.java:182: error: ';' expected
                                                if (2fp<= -13.144246101379395D) {
                                                                              ^
/PMML$708533063.java:187: error: ')' expected
                                            if (2fp<= -13.16771125793457D) {
                                                  ^
/PMML$708533063.java:187: error: not a statement
                                            if (2fp<= -13.16771125793457D) {
                                                   ^
/PMML$708533063.java:187: error: ';' expected
                                            if (2fp<= -13.16771125793457D) {
                                                                         ^
/PMML$708533063.java:190: error: ')' expected
                                            if (2fp<= -12.955914497375488D) {
                                                  ^
/PMML$708533063.java:190: error: not a statement
                                            if (2fp<= -12.955914497375488D) {
                                                   ^
/PMML$708533063.java:190: error: ';' expected
                                            if (2fp<= -12.955914497375488D) {
                                                                          ^
/PMML$708533063.java:195: error: ')' expected
                                        if (2fp<= 4.0479161739349365D) {
                                              ^
/PMML$708533063.java:195: error: not a statement
                                        if (2fp<= 4.0479161739349365D) {
                                               ^
/PMML$708533063.java:195: error: ';' expected
                                        if (2fp<= 4.0479161739349365D) {
                                                                     ^
/PMML$708533063.java:196: error: ')' expected
                                            if (2fp<= -10.181861877441406D) {
                                                  ^
/PMML$708533063.java:196: error: not a statement
                                            if (2fp<= -10.181861877441406D) {
                                                   ^
/PMML$708533063.java:196: error: ';' expected
                                            if (2fp<= -10.181861877441406D) {
                                                                          ^
/PMML$708533063.java:197: error: ')' expected
                                                if (2fp<= -10.198620796203613D) {
                                                      ^
/PMML$708533063.java:197: error: not a statement
                                                if (2fp<= -10.198620796203613D) {
                                                       ^
/PMML$708533063.java:197: error: ';' expected
                                                if (2fp<= -10.198620796203613D) {
                                                                              ^
/PMML$708533063.java:202: error: ')' expected
                                            if (2fp<= -13.103546142578125D) {
                                                  ^
/PMML$708533063.java:202: error: not a statement
                                            if (2fp<= -13.103546142578125D) {
                                                   ^
/PMML$708533063.java:202: error: ';' expected
                                            if (2fp<= -13.103546142578125D) {
                                                                          ^
/PMML$708533063.java:205: error: ')' expected
                                            if (2fp<= -13.041299819946289D) {
                                                  ^
/PMML$708533063.java:205: error: not a statement
                                            if (2fp<= -13.041299819946289D) {
                                                   ^
/PMML$708533063.java:205: error: ';' expected
                                            if (2fp<= -13.041299819946289D) {
                                                                          ^
/PMML$708533063.java:210: error: ')' expected
                                        if (2fp<= 4.840362787246704D) {
                                              ^
/PMML$708533063.java:210: error: not a statement
                                        if (2fp<= 4.840362787246704D) {
                                               ^
/PMML$708533063.java:210: error: ';' expected
                                        if (2fp<= 4.840362787246704D) {
                                                                    ^
/PMML$708533063.java:217: error: ')' expected
                                if (2fp<= -6.775467395782471D) {
                                      ^
/PMML$708533063.java:217: error: not a statement
                                if (2fp<= -6.775467395782471D) {
                                       ^
/PMML$708533063.java:217: error: ';' expected
                                if (2fp<= -6.775467395782471D) {
                                                             ^
/PMML$708533063.java:218: error: ')' expected
                                    if (2fp<= -18.420361518859863D) {
                                          ^
/PMML$708533063.java:218: error: not a statement
                                    if (2fp<= -18.420361518859863D) {
                                           ^
/PMML$708533063.java:218: error: ';' expected
                                    if (2fp<= -18.420361518859863D) {
                                                                  ^
/PMML$708533063.java:219: error: ')' expected
                                        if (2fp<= -19.79218101501465D) {
                                              ^
/PMML$708533063.java:219: error: not a statement
                                        if (2fp<= -19.79218101501465D) {
                                               ^
100 errors
java.io.IOException
    at org.jpmml.codemodel.CompilerUtil.compile(CompilerUtil.java:81)
    at org.jpmml.codemodel.CompilerUtil.compile(CompilerUtil.java:56)
    at org.jpmml.codemodel.CompilerUtil.compile(CompilerUtil.java:49)
    at org.jpmml.transpiler.TranspilerUtil.compile(TranspilerUtil.java:80)
    at org.jpmml.transpiler.Main.run(Main.java:115)
    at org.jpmml.transpiler.Main.main(Main.java:98)

vruusmann commented 4 years ago

2fp

Your PMML contains a field whose name is a blank space " ".

Right now, JPMML-Transpiler does not contain any application logic that would check/normalize such oddities. It is assumed that the Java compiler step would find and report them (as is the case here).

Keeping this issue open as a reminder to introduce field name checking (and perhaps to correct invalid names, or replace them with synthetic field names).

You should re-generate your PMML model with correct field names. Don't use a blank space as one.
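A field name check of the kind mentioned above could look roughly like the following. This is a minimal sketch, not code from JPMML-Transpiler: the class name `FieldNameCheck`, both method names, and the `field_<index>` naming scheme are hypothetical, chosen only for illustration. It relies on the standard `Character.isJavaIdentifierStart` and `Character.isJavaIdentifierPart` methods and does not reject Java reserved words.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of JPMML-Transpiler.
final class FieldNameCheck {

    // Returns true if the name can be used verbatim as a Java identifier.
    // Assumption: reserved words such as "class" are not checked here.
    static boolean isJavaIdentifier(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        if (!Character.isJavaIdentifierStart(name.charAt(0))) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            if (!Character.isJavaIdentifierPart(name.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Maps each field name to itself when it is a valid identifier,
    // otherwise to a synthetic name based on its position in the list.
    static Map<String, String> assignIdentifiers(List<String> fieldNames) {
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < fieldNames.size(); i++) {
            String name = fieldNames.get(i);
            result.put(name, isJavaIdentifier(name) ? name : "field_" + i);
        }
        return result;
    }

    public static void main(String[] args) {
        // "999" and " " are not valid identifiers, so they get synthetic names.
        System.out.println(assignIdentifiers(List.of("x1", "999", " ")));
    }
}
```

A real implementation would also need to guard against a synthetic name colliding with an existing valid field name, which this sketch does not attempt.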

vruusmann commented 4 years ago

> JPMML-Transpiler does not contain any application logic that would check/normalize such oddities.

I remembered it wrong. Field names are translated to sanitized Java identifiers using the IdentifierUtil#sanitize(String) utility function: https://github.com/jpmml/jpmml-transpiler/blob/master/src/main/java/org/jpmml/translator/IdentifierUtil.java#L58-L82

The trouble is that your PMML document contains field names that transform to a blank string " ".

ahmed-shariff commented 4 years ago

I was generating the PMML files using jpmml-sklearn. Want me to mention this there as well?

vruusmann commented 4 years ago

> Want me to mention this there as well?

That's not necessary. The problem lies in your column names (e.g. pandas.DataFrame column names), not in the JPMML software stack.

For example, if your column name is all numeric (e.g. "999"), then you'd get an empty string from the IdentifierUtil#sanitize(String) method.
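To see how an all-numeric name can collapse to nothing, consider the simplified stand-in below. It is not the actual IdentifierUtil#sanitize(String) code, whose exact rules are not reproduced here; the class `SanitizeSketch` and its dropping strategy are assumptions made purely for illustration. The sketch discards every character that is not legal at its position in a Java identifier, so a name made only of digits never finds a legal first character.

```java
// Simplified stand-in for illustration only.
// NOT the actual IdentifierUtil#sanitize(String) implementation.
final class SanitizeSketch {

    // Keeps a character only if it is legal at the current position:
    // an identifier-start character first, identifier-part characters after.
    static String sanitize(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean legal = (sb.length() == 0)
                ? Character.isJavaIdentifierStart(c)
                : Character.isJavaIdentifierPart(c);
            if (legal) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Digits cannot start a Java identifier, so "999" loses every character.
        System.out.println("[" + sanitize("accel_x") + "]");
        System.out.println("[" + sanitize("999") + "]");
        System.out.println("[" + sanitize("2nd axis") + "]");
    }
}
```

Under these assumed rules, "accel_x" passes through unchanged while "999" becomes the empty string. Giving columns a letter prefix before export (a hypothetical "x999", say) avoids the problem at the source.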

ahmed-shariff commented 4 years ago

I see, thank you for pointing that out.