oracle / tribuo

Tribuo - A Java machine learning library
https://tribuo.org
Apache License 2.0

PMML Export/Import #46

Open exabrial opened 4 years ago

exabrial commented 4 years ago

It would be awesome to be able to import/export models in PMML format for portability to other languages and platforms.

Craigacp commented 4 years ago

We're currently looking at Tribuo's serialization mechanisms, and after that we're going to look at exporting to ONNX format (as it seems odd to import ONNX models when we can't export them). That ONNX export code will probably involve decomposing Tribuo models into basic building blocks before converting those into ONNX, and those blocks could potentially overlap with PMML's model primitives, which would let us share some export code. I've not looked at the PMML spec in much detail, but it seems that Tribuo has a few models PMML doesn't and PMML has models that Tribuo doesn't, so we'd have to figure out how to do the mapping.

With respect to the JPMML project, it's mostly AGPL licensed, which would be incompatible with Tribuo's Apache 2.0 license, so we can't use it as a dependency.

We'd be happy to accept PRs that add PMML export or import, provided they don't bring in license-incompatible dependencies, but we're not planning to add such support ourselves in the near future.
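
For anyone considering such a PR, here is a minimal, dependency-free sketch of the "basic building blocks" idea for the simplest case, a linear model. Everything in it is hypothetical: `PmmlLinearSketch`, `LinearPrimitive`, and `toRegressionTable` are illustrative names, not Tribuo classes, and the output is only a PMML `RegressionTable` fragment, not a complete, schema-valid PMML document (which would also need a header, data dictionary, and mining schema).

```java
// Hypothetical sketch: none of these types exist in Tribuo.
final class PmmlLinearSketch {

    /** Format-neutral linear model: y = bias + sum(weights[i] * x[i]). */
    static final class LinearPrimitive {
        final String[] featureNames;
        final double[] weights;
        final double bias;

        LinearPrimitive(String[] featureNames, double[] weights, double bias) {
            if (featureNames.length != weights.length) {
                throw new IllegalArgumentException("one weight per feature is required");
            }
            this.featureNames = featureNames.clone();
            this.weights = weights.clone();
            this.bias = bias;
        }
    }

    /** Escapes the characters that are unsafe inside an XML attribute value. */
    static String escapeAttr(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Renders the primitive as a PMML RegressionTable fragment. */
    static String toRegressionTable(LinearPrimitive p) {
        StringBuilder sb = new StringBuilder();
        sb.append("<RegressionTable intercept=\"").append(p.bias).append("\">\n");
        for (int i = 0; i < p.weights.length; i++) {
            sb.append("  <NumericPredictor name=\"")
              .append(escapeAttr(p.featureNames[i]))
              .append("\" coefficient=\"")
              .append(p.weights[i])
              .append("\"/>\n");
        }
        sb.append("</RegressionTable>\n");
        return sb.toString();
    }

    /** Evaluates the primitive directly, e.g. to cross-check an exported model. */
    static double predict(LinearPrimitive p, double[] x) {
        if (x.length != p.weights.length) {
            throw new IllegalArgumentException("input size does not match the model");
        }
        double y = p.bias;
        for (int i = 0; i < x.length; i++) {
            y += p.weights[i] * x[i];
        }
        return y;
    }

    public static void main(String[] args) {
        LinearPrimitive model = new LinearPrimitive(
                new String[] {"x1", "x2"}, new double[] {0.5, -2.0}, 1.0);
        System.out.print(toRegressionTable(model));
        System.out.println(predict(model, new double[] {2.0, 1.0}));
    }
}
```

The point of the intermediate `LinearPrimitive` is that an ONNX writer and a PMML writer could both consume it, so only the final rendering step would be format-specific. Models without a PMML counterpart would still need the mapping work mentioned above.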

Craigacp commented 3 years ago

To follow up on this, we've started landing ONNX export support in main (initially for Tribuo's linear models - https://github.com/oracle/tribuo/pull/154) and will expand that support across multiple model types for the upcoming v4.2 release. The initial release is likely to support a subset of Tribuo's models, and it's not likely to ever support TensorFlow or XGBoost (as those projects both have communities supporting their own ONNX converters, and Tribuo can export models in formats that those converters can parse). We'll expand coverage to more of Tribuo's models in future releases.

We still have no plans for PMML, but we're open to contributions.