jpmml / jpmml-sklearn

Java library and command-line application for converting Scikit-Learn pipelines to PMML
GNU Affero General Public License v3.0

Converting to PMML programmatically #24

KidCrippler closed 7 years ago

KidCrippler commented 7 years ago

The converter-executable jar converts .pkl files to .pmml files and is very convenient to use. However, I want to run that conversion programmatically as part of my application lifecycle. Is there a way to do that without manually packaging the converter-executable jar with my code base and calling its Main class's main() method? For example, is there a Java class that does the same thing in one of the artifacts of the evaluator Maven dependency?

10x

vruusmann commented 7 years ago

What's wrong with using the class org.jpmml.sklearn.Main directly?

Main converter = new Main();
converter.setMapperInput(new File("iris_mapper.pkl"));
converter.setEstimatorInput(new File("iris_classifier.pkl"));
converter.setOutput(new File("iris.pmml"));
converter.run(); // After this method returns, there will be a new file available in the designated location

If the visibility of the method org.jpmml.sklearn.Main#run() were relaxed from private to public, then this should work?

KidCrippler commented 7 years ago

@vruusmann 10x for the quick response

The problem with this approach is that, in a Maven codebase, bundling a standalone jar artifact with your application is very cumbersome. Using jpmml-evaluator is easy for me because its artifacts are available as a Maven dependency, but in order to use converter-executable.jar I need to build the project manually.

Another thing worth mentioning is that when calling the Main class directly, I have to read and write actual files and rely on file system storage, as opposed to working with streams, which is less error-prone because everything stays in memory.
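Until a stream-based entry point exists, one possible workaround is to hide the temporary files behind a small wrapper. The sketch below is only an illustration: StreamBridge and FileConverter are hypothetical names, not part of jpmml-sklearn, and the converter argument stands in for whatever file-to-file step is actually available.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public final class StreamBridge {

    /** Hypothetical stand-in for any file-to-file conversion step. */
    @FunctionalInterface
    public interface FileConverter {
        void convert(Path input, Path output) throws Exception;
    }

    private StreamBridge() {
    }

    /**
     * Spools the input stream to a temporary file, runs the file-based
     * converter, and returns the contents of the output file as bytes.
     * Both temporary files are deleted before the method returns.
     */
    public static byte[] convert(InputStream pickle, FileConverter converter) throws Exception {
        Path input = Files.createTempFile("model", ".pkl");
        Path output = Files.createTempFile("model", ".pmml");
        try {
            // createTempFile already created the file, so overwrite it
            Files.copy(pickle, input, StandardCopyOption.REPLACE_EXISTING);
            converter.convert(input, output);
            return Files.readAllBytes(output);
        } finally {
            Files.deleteIfExists(input);
            Files.deleteIfExists(output);
        }
    }
}
```

This still touches the file system, so it does not remove the dependency on disk storage; it only keeps the file handling out of the calling code.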

As for the code snippet, it would work provided that the run() method is exposed as public, but even if not, I could always call the main() method with a static list of arguments just as I would've done from the command line.
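For illustration, calling main() with a fixed argument list could look roughly like the sketch below. ConverterLauncher is a hypothetical helper, and reflection is used only so that the sketch compiles without the converter jar on the classpath; with the jar present, calling org.jpmml.sklearn.Main.main(args) directly would be simpler.

```java
import java.lang.reflect.Method;

public final class ConverterLauncher {

    private ConverterLauncher() {
    }

    /**
     * Looks up the public static main(String[]) method of the named class
     * and invokes it with the given arguments, much as the JVM launcher would.
     * Exceptions thrown by main() arrive wrapped in InvocationTargetException.
     */
    public static void invokeMain(String className, String... args) throws Exception {
        Class<?> clazz = Class.forName(className);
        Method main = clazz.getMethod("main", String[].class);
        // Cast to Object so the array is passed as one argument, not spread
        main.invoke(null, (Object) args);
    }
}
```

A call would then be ConverterLauncher.invokeMain("org.jpmml.sklearn.Main", ...) with the same arguments that are passed on the command line; the exact option names depend on the converter version and are not shown here.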

But then again, the converter-executable.jar artifact should probably be deployed along with the rest of the jars contained in the jpmml-evaluator project.