jpmml / jpmml-sparkml

Java library and command-line application for converting Apache Spark ML pipelines to PMML
GNU Affero General Public License v3.0

Support for map function within SQLTransformer #82

Closed tyers closed 4 years ago

tyers commented 5 years ago

Provide the ability to look up values by key within SQLTransformer, potentially by supporting the map function provided by Spark SQL.

vruusmann commented 4 years ago

Could you give an example SQL expression that demonstrates a map-based lookup?

The catalog of Apache Spark SQL functions doesn't seem to provide any functions that take a map as an argument.

tyers commented 4 years ago

> The catalog of Apache Spark SQL functions doesn't seem to provide any functions that take a map as an argument.

It seems I misinterpreted the docs on this one; the only way to achieve this would be via a UDF of some kind.

vruusmann commented 4 years ago

> the only way to achieve this would be via a UDF of some kind

I got the same impression. Right now, you can perform category re-mapping using a CASE WHEN expression.
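
As a rough illustration of the CASE WHEN workaround, the sketch below generates such an expression from a plain Python dict, so the lookup table can still be maintained as key-value pairs. The helper `case_when_lookup`, the column names `color` and `color_group`, and the example mapping are all hypothetical and not part of JPMML-SparkML or Spark; only the `__THIS__` placeholder comes from SQLTransformer itself. To sidestep string-escaping questions, the helper simply rejects values containing quotes or backslashes.

```python
def case_when_lookup(column, mapping, default, alias):
    """Build a CASE WHEN expression that maps string values of `column`.

    Hypothetical helper for illustration; not part of any library.
    """
    if not mapping:
        raise ValueError("mapping must contain at least one entry")

    def quote(value):
        # Refuse characters that would need escaping inside a SQL string literal
        text = str(value)
        if "'" in text or "\\" in text:
            raise ValueError(f"unsupported character in literal: {text!r}")
        return f"'{text}'"

    # One WHEN branch per key-value pair, in the dict's insertion order
    branches = " ".join(
        f"WHEN {column} = {quote(key)} THEN {quote(value)}"
        for key, value in mapping.items()
    )
    return f"CASE {branches} ELSE {quote(default)} END AS {alias}"


# Example lookup table (assumed data, for illustration only)
color_groups = {"red": "warm", "orange": "warm", "blue": "cool"}

expression = case_when_lookup("color", color_groups, "other", "color_group")

# __THIS__ is the placeholder SQLTransformer uses for the input dataset
statement = f"SELECT *, {expression} FROM __THIS__"
print(statement)
```

The resulting string could then be passed as the `statement` parameter of a `SQLTransformer`. Whether a given statement converts to PMML depends on the SQL constructs supported by the JPMML-SparkML version in use, so this should be checked against an actual conversion.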

JPMML-SparkML does not currently provide any custom transformer classes. Once there is a precedent, it would be possible to expand the framework into other areas as well, such as providing custom UDFs for map-based functionality.