jpmml / jpmml-sparkml

Java library and command-line application for converting Apache Spark ML pipelines to PMML
GNU Affero General Public License v3.0

Support for custom Java-backed models (eg. factorization machine) #122

Open lcx517 opened 2 years ago

lcx517 commented 2 years ago

Hi, Spark ML supports FM models in 3.x. Is there any plan to support factorization machines for Spark 3.x?

vruusmann commented 2 years ago

The PMML specification does not define a specialized model element for representing factorization machines.

AFAIK, the business logic of factorization machines cannot be effectively represented using the available PMML building blocks (eg. a chain of regression tables).

Possible workarounds:

  1. Independently design a PMML-like representation for factorization machines, and implement it in the JPMML software project (both converter and evaluator sides).

  2. Implement it as a custom Java-backed model. Create a subclass of org.jpmml.evaluator.java.JavaModel, and implement the factorization machine business logic in its evaluate<MiningFunction>(ValueFactory, EvaluationContext) method. For example, the method could simply invoke Apache Spark API methods.
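To give a rough idea of the business logic that the second workaround would have to carry, here is a minimal sketch of second-order factorization machine scoring in plain Java. It assumes the standard formulation (intercept, plus linear terms, plus pairwise interactions weighted by dot products of per-feature factor vectors) and uses the usual linear-time rewrite of the interaction sum. The class name FactorizationMachineSketch and its predict method are hypothetical and are not part of JPMML or Apache Spark; wiring the method into a JavaModel subclass, or extracting the parameters from a fitted Spark model, is left out.

```java
/**
 * Hypothetical helper for illustration only; not a JPMML or Apache Spark class.
 */
final class FactorizationMachineSketch {

    private FactorizationMachineSketch() {
    }

    /**
     * Scores one dense feature vector with a second-order factorization machine.
     *
     * @param intercept global bias term
     * @param linear    one linear weight per feature
     * @param factors   factors[i] is the latent factor vector of feature i
     * @param x         feature values
     * @return the raw (regression) prediction
     */
    static double predict(double intercept, double[] linear, double[][] factors, double[] x) {
        if (linear.length != x.length || factors.length != x.length) {
            throw new IllegalArgumentException("Dimension mismatch");
        }

        // Intercept plus linear terms
        double result = intercept;
        for (int i = 0; i < x.length; i++) {
            result += linear[i] * x[i];
        }

        // Pairwise interactions, computed per latent dimension as
        // 0.5 * ((sum_i v_if * x_i)^2 - sum_i (v_if * x_i)^2)
        int numFactors = (factors.length > 0) ? factors[0].length : 0;
        double interactions = 0d;
        for (int f = 0; f < numFactors; f++) {
            double sum = 0d;
            double sumOfSquares = 0d;
            for (int i = 0; i < x.length; i++) {
                double term = factors[i][f] * x[i];
                sum += term;
                sumOfSquares += term * term;
            }
            interactions += (sum * sum) - sumOfSquares;
        }

        return result + 0.5d * interactions;
    }
}
```

For example, with intercept 0.5, linear weights {1.0, -2.0, 0.5}, factor vectors {{1.0, 0.0}, {0.5, 1.0}, {2.0, -1.0}} and input {1.0, 2.0, 0.0}, the linear part is -2.5 and the only non-zero interaction contributes 1.0, so predict returns -1.5. A classification variant would presumably pass this value through a logistic function.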