FlinkML / flink-jpmml

flink-jpmml is a fresh-made library for dynamic real-time machine learning predictions, built on top of PMML standard models and the Apache Flink streaming engine.
GNU Affero General Public License v3.0

Handle additional dependencies #62

Open Harish-Sridhar opened 5 years ago

Harish-Sridhar commented 5 years ago

Hi,

The flink-jpmml documentation says that the dependency to add to a project is the following:

"io.radicalbit" %% "flink-jpmml-scala" % "0.6.3"

But when I tried to build a simple example using the IRIS data source, I found that the project needs additional dependencies, such as the following:

"org.jpmml" % "pmml-evaluator" % "1.3.9",
"org.glassfish.jaxb" % "jaxb-runtime" % "2.3.2"

I would suggest either adding these mandatory dependencies to the library "io.radicalbit" %% "flink-jpmml-scala" % "0.6.3" itself, or at least mentioning them in the README as a note under the dependency section.

Regards, Harish.

francescofrontera commented 5 years ago

Hi @Harish-Sridhar, could you show me your build.sbt or the dependencies file you're using?

Anyway, I tried to reproduce the error on my local machine, with this build.sbt:

ThisBuild / resolvers ++= Seq(
    "Apache Development Snapshot Repository" at "https://repository.apache.org/content/repositories/snapshots/",
    Resolver.mavenLocal
)

name := "flink-with-jpmml"

version := "0.1-SNAPSHOT"

organization := "io.ffrontera"

ThisBuild / scalaVersion := "2.11.12"

val flinkVersion = "1.6.2"

val flinkDependencies = Seq(
  "org.apache.flink" %% "flink-scala" % flinkVersion % "provided",
  "org.apache.flink" %% "flink-streaming-scala" % flinkVersion % "provided",
  "io.radicalbit" %% "flink-jpmml-scala" % "0.6.3"
)

lazy val root = (project in file(".")).
  settings(
    libraryDependencies ++= flinkDependencies
  )

assembly / mainClass := Some("io.ffrontera.Job")

// make run command include the provided dependencies
Compile / run  := Defaults.runTask(Compile / fullClasspath,
                                   Compile / run / mainClass,
                                   Compile / run / runner
                                  ).evaluated

// stays inside the sbt console when we press "ctrl-c" while a Flink programme executes with "run" or "runMain"
Compile / run / fork := true
Global / cancelable := true

// exclude Scala library from assembly
assembly / assemblyOption  := (assembly / assemblyOption).value.copy(includeScala = false)

The error does not appear when I build the project.

Regards, Francesco.

spi-x-i commented 5 years ago

@Harish-Sridhar any update on this? Were you able to solve this?

Harish-Sridhar commented 5 years ago

Hi,

When I used the sbt configuration below, the build was successful, but when I ran the program I got an error about JPMML modules not being on the classpath.

name := "flink-jpmml-test"

version := "0.1"

scalaVersion := "2.11.11"

resolvers ++= Seq(
  "Radicalbit Releases" at "https://tools.radicalbit.io/artifactory/public-release/"
)

libraryDependencies ++= Seq("io.radicalbit" %% "flink-jpmml-scala" % "0.6.3",
  "org.apache.flink" %% "flink-streaming-scala" % "1.7.2",
  "org.apache.flink" %% "flink-scala" % "1.7.2"
)

But when I added the necessary dependencies, it all went fine. The following build configuration worked.

name := "flink-jpmml-test"

version := "0.1"

scalaVersion := "2.11.11"

resolvers ++= Seq(
  "Radicalbit Releases" at "https://tools.radicalbit.io/artifactory/public-release/"
)

libraryDependencies ++= Seq("io.radicalbit" %% "flink-jpmml-scala" % "0.6.3",
  "org.apache.flink" %% "flink-streaming-scala" % "1.7.2",
  "org.apache.flink" %% "flink-scala" % "1.7.2",
  "org.jpmml" % "pmml-evaluator" % "1.3.9",
  // I needed to provide the dependency below because I was using Java 10.
  "org.glassfish.jaxb" % "jaxb-runtime" % "2.3.2"
)
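
A quick way to confirm which classes are actually missing at runtime is to probe the classpath from inside the project. The following is only a minimal sketch: the object name `ClasspathCheck` is hypothetical, and `org.jpmml.evaluator.ModelEvaluator` is assumed to be a representative class shipped in the pmml-evaluator artifact. `javax.xml.bind.JAXBContext` is the JAXB entry point, which newer Java versions no longer expose by default.

```scala
// Hypothetical helper, not part of flink-jpmml: reports whether selected
// classes can be loaded from the current classpath.
object ClasspathCheck {

  // Returns true if the class can be located. The class is not initialised,
  // so a failing static initialiser cannot distort the result.
  def isOnClasspath(className: String): Boolean =
    try {
      Class.forName(className, false, getClass.getClassLoader)
      true
    } catch {
      case _: ClassNotFoundException | _: LinkageError => false
    }

  def main(args: Array[String]): Unit = {
    // Class names are assumptions chosen for illustration.
    val probes = Seq(
      "org.jpmml.evaluator.ModelEvaluator", // assumed to live in pmml-evaluator
      "javax.xml.bind.JAXBContext"          // JAXB API
    )
    probes.foreach { name =>
      val status = if (isOnClasspath(name)) "found" else "MISSING"
      println(s"$name: $status")
    }
  }
}
```

Running it with `sbt "runMain ClasspathCheck"` once with and once without the two extra dependencies should show whether they are really needed in a given environment.
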

francescofrontera commented 5 years ago

Hi @Harish-Sridhar, I tried to run your GitHub project without the listed dependencies:

name := "flink-jpmml-test"

version := "0.1"

scalaVersion := "2.11.11"

resolvers ++= Seq(
  "Radicalbit Releases" at "https://tools.radicalbit.io/artifactory/public-release/",
  //"Sonatype OSS Snapshots" at "https://oss.sonatype.org/content/repositories/snapshots"
)

libraryDependencies ++= Seq("io.radicalbit" %% "flink-jpmml-scala" % "0.6.3",
  "org.apache.flink" %% "flink-streaming-scala" % "1.7.2",
  "org.apache.flink" %% "flink-scala" % "1.7.2",
  /*"org.jpmml" % "pmml-evaluator" % "1.3.9",*/
  /*"org.glassfish.jaxb" % "jaxb-runtime" % "2.3.2",*/
  "org.apache.kafka" % "kafka-clients" % "2.1.0",
  "com.storm-enroute" %% "scalameter" % "0.17"
  //, "org.slf4j" % "slf4j-simple" % "1.7.9"
)

mainClass in assembly := Some("org.hs.flink.jpmml.test.AppsDTree")

testFrameworks += new TestFramework(
  "org.scalameter.ScalaMeterFramework")

logBuffered := false

parallelExecution in Test := false

The run completes correctly on my local machine. I recommend running sbt clean and reload before running the project.

Regards, Francesco.
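
Since the two machines in this thread behave differently, it may help to compare what sbt actually puts on the runtime classpath. The fragment below is a sketch for build.sbt, assuming sbt 1.1 or later (for the slash syntax); the task name `printJpmmlJars` is made up for illustration.

```scala
// Hypothetical helper task: lists runtime classpath entries whose file name
// mentions "pmml" or "jaxb", so two environments can be compared directly.
lazy val printJpmmlJars =
  taskKey[Unit]("Prints runtime classpath jars related to PMML or JAXB")

printJpmmlJars := {
  // fullClasspath yields attributed files; .data is the underlying java.io.File
  val jarNames = (Runtime / fullClasspath).value.map(_.data.getName)
  jarNames
    .filter(n => n.contains("pmml") || n.contains("jaxb"))
    .sorted
    .foreach(println)
}
```

Running `sbt printJpmmlJars` on both machines and comparing the output should show whether pmml-evaluator is resolved transitively in one environment but not in the other.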