jpmml / jpmml-sparkml

Java library and command-line application for converting Apache Spark ML pipelines to PMML
GNU Affero General Public License v3.0
Databricks Install #134

Closed: zwag20 closed this issue 1 year ago

zwag20 commented 1 year ago

Is there a way to install this in Databricks? I tried to install it from the Maven repository and got the following error message:

java.lang.RuntimeException: ManagedLibraryInstallFailed: Library resolution failed because org.jpmml:jpmml-sparkml download failed. for library:org/jpmml/jpmml-sparkml#2.4.0.jar,isSharedLibrary=false

The older versions used to have jar files I could download, but the most recent versions don't seem to have jar files available.

vruusmann commented 1 year ago

> ManagedLibraryInstallFailed: Library resolution failed because org.jpmml:jpmml-sparkml download failed. for library:org/jpmml/jpmml-sparkml#2.4.0.jar,isSharedLibrary=false

The JPMML-SparkML project was modularized when upgrading from 1.X to 2.X.

In 2.X, the Apache Maven coordinates of the main library module are org.jpmml:pmml-sparkml:${version}.

Please note that the second component (aka the artifactId) does not have a "j" prefix to it anymore. So, it's pmml-sparkml, not jpmml-sparkml.
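To make the renaming concrete, here is a minimal Python sketch that assembles the 2.X coordinate string and rejects the `jpmml-sparkml` artifactId. The helper names `library_coordinate` and `check_coordinate` are hypothetical, and the version `2.4.0` is simply the one from the error message above; pick whichever release matches your Spark version.

```python
GROUP_ID = "org.jpmml"
ARTIFACT_ID = "pmml-sparkml"  # 2.X main library module: no "j" prefix


def library_coordinate(version: str) -> str:
    """Build the groupId:artifactId:version string for the main library module."""
    return f"{GROUP_ID}:{ARTIFACT_ID}:{version}"


def check_coordinate(coordinate: str) -> str:
    """Return the coordinate unchanged, or raise if it points at the POM-only project."""
    group_id, artifact_id, _version = coordinate.split(":")
    if group_id == GROUP_ID and artifact_id == "jpmml-sparkml":
        # In 2.X this artifactId is the project descriptor (POM), which has no JAR.
        raise ValueError(
            f"{coordinate!r} has no JAR in 2.X; use artifactId {ARTIFACT_ID!r} instead"
        )
    return coordinate


# Version taken from the question; an assumption, not a recommendation.
print(library_coordinate("2.4.0"))
```

The printed string is in the form that a Maven-coordinates field expects.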

> The older versions used to have jar files I could download, but the most recent versions don't seem to have jar files available.

You're trying to download the project descriptor (aka the POM) right now. A POM is an XML file; it does not contain any JAR files.

You should be loading the main library module instead.

All library modules have JAR files available in the Maven Central (MC) repository. During release, MC runs extensive checks to make sure that all the required files are available and correct. It's technically impossible to make an incomplete or invalid MC submission.
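If you would rather download the JAR by hand, the following Python sketch derives the file location from the coordinates. It assumes the standard Maven repository layout and the `https://repo1.maven.org/maven2` base URL; the `artifact_url` helper is hypothetical, and version `2.4.0` is again just the one from the question. The sketch only builds the URL and does not check that it resolves.

```python
MAVEN_CENTRAL = "https://repo1.maven.org/maven2"  # assumed base URL


def artifact_url(coordinate: str, extension: str = "jar") -> str:
    """Map groupId:artifactId:version to a file URL in the standard Maven layout."""
    group_id, artifact_id, version = coordinate.split(":")
    group_path = group_id.replace(".", "/")  # dots in the groupId become directories
    return (
        f"{MAVEN_CENTRAL}/{group_path}/{artifact_id}/{version}/"
        f"{artifact_id}-{version}.{extension}"
    )


# The main library module, using the version from the question.
print(artifact_url("org.jpmml:pmml-sparkml:2.4.0"))
```

Passing `extension="pom"` gives the location of the module's own descriptor instead of its JAR.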

> The older versions used to have jar files I could download

TLDR: This is a new version now (the 2.X development branch), so module coordinates have changed.

Please use updated coordinates, and all will be fine (with or without Databricks).

When in doubt, copy and paste from the README file: https://github.com/jpmml/jpmml-sparkml#library