combust / mleap

MLeap: Deploy ML Pipelines to Production
https://combust.github.io/mleap-docs/
Apache License 2.0

Cannot load Bundle in java assembly jar: object scala.Predef not found #638

Open teichmaj opened 4 years ago

teichmaj commented 4 years ago

I am trying to create a KSQL UDF that serves an MLeap model. The code for a toy example can be found here: https://gitlab.com/jan-teichmann/ksql-iris-classifier-udf

I am loading a pipeline from a BundleFile following the documentation with

val pipeline = (for (bundle <- managed(BundleFile(path))) yield {
  bundle.loadMleapBundle().get
}).tried.get.root

I have a unit test which confirms that the pipeline works as expected, and sbt test runs successfully. I then create an uber jar with the sbt assembly plugin and put it in the KSQL extension folder. KSQL loads the jar and tries to instantiate the UDF, which fails with scala.ScalaReflectionException: object scala.Predef not found.

ksql-server_1      | Caused by: java.lang.reflect.InvocationTargetException
ksql-server_1      |    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
ksql-server_1      |    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
ksql-server_1      |    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
ksql-server_1      |    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
ksql-server_1      |    at ml.combust.mleap.bundle.ops.MleapOp.load(MleapOp.scala:24)
ksql-server_1      |    at ml.combust.mleap.bundle.ops.MleapOp.load(MleapOp.scala:16)
ksql-server_1      |    at ml.combust.bundle.serializer.NodeSerializer$$anonfun$read$2$$anonfun$apply$3$$anonfun$apply$4.apply(NodeSerializer.scala:106)
ksql-server_1      |    at scala.util.Try$.apply(Try.scala:192)
ksql-server_1      |    at ml.combust.bundle.serializer.NodeSerializer$$anonfun$read$2$$anonfun$apply$3.apply(NodeSerializer.scala:104)
ksql-server_1      |    at ml.combust.bundle.serializer.NodeSerializer$$anonfun$read$2$$anonfun$apply$3.apply(NodeSerializer.scala:102)
ksql-server_1      |    at scala.util.Success.flatMap(Try.scala:231)
ksql-server_1      |    at ml.combust.bundle.serializer.NodeSerializer$$anonfun$read$2.apply(NodeSerializer.scala:102)
ksql-server_1      |    at ml.combust.bundle.serializer.NodeSerializer$$anonfun$read$2.apply(NodeSerializer.scala:101)
ksql-server_1      |    at scala.util.Success.flatMap(Try.scala:231)
ksql-server_1      |    at ml.combust.bundle.serializer.NodeSerializer.read(NodeSerializer.scala:100)
ksql-server_1      |    at ml.combust.bundle.serializer.GraphSerializer$$anonfun$readNode$2.apply(GraphSerializer.scala:57)
ksql-server_1      |    at ml.combust.bundle.serializer.GraphSerializer$$anonfun$readNode$2.apply(GraphSerializer.scala:57)
ksql-server_1      |    at scala.util.Success.flatMap(Try.scala:231)
ksql-server_1      |    at ml.combust.bundle.serializer.GraphSerializer.readNode(GraphSerializer.scala:56)
ksql-server_1      |    at ml.combust.bundle.serializer.GraphSerializer$$anonfun$read$1.apply(GraphSerializer.scala:44)
ksql-server_1      |    at ml.combust.bundle.serializer.GraphSerializer$$anonfun$read$1.apply(GraphSerializer.scala:44)
ksql-server_1      |    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
ksql-server_1      |    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
ksql-server_1      |    at scala.collection.immutable.List.foreach(List.scala:381)
ksql-server_1      |    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
ksql-server_1      |    at scala.collection.immutable.List.map(List.scala:285)
ksql-server_1      |    at ml.combust.bundle.serializer.GraphSerializer.read(GraphSerializer.scala:44)
ksql-server_1      |    at ml.combust.mleap.bundle.ops.PipelineOp$$anon$1.load(PipelineOp.scala:28)
ksql-server_1      |    at ml.combust.mleap.bundle.ops.PipelineOp$$anon$1.load(PipelineOp.scala:15)
ksql-server_1      |    at ml.combust.bundle.serializer.ModelSerializer$$anonfun$readWithModel$2.apply(ModelSerializer.scala:106)
ksql-server_1      |    at ml.combust.bundle.serializer.ModelSerializer$$anonfun$readWithModel$2.apply(ModelSerializer.scala:104)
ksql-server_1      |    at scala.util.Success$$anonfun$map$1.apply(Try.scala:237)
ksql-server_1      |    at scala.util.Try$.apply(Try.scala:192)
ksql-server_1      |    at scala.util.Success.map(Try.scala:237)
ksql-server_1      |    at ml.combust.bundle.serializer.ModelSerializer.readWithModel(ModelSerializer.scala:103)
ksql-server_1      |    at ml.combust.bundle.serializer.NodeSerializer$$anonfun$read$2.apply(NodeSerializer.scala:102)
ksql-server_1      |    at ml.combust.bundle.serializer.NodeSerializer$$anonfun$read$2.apply(NodeSerializer.scala:101)
ksql-server_1      |    at scala.util.Success.flatMap(Try.scala:231)
ksql-server_1      |    at ml.combust.bundle.serializer.NodeSerializer.read(NodeSerializer.scala:100)
ksql-server_1      |    at ml.combust.bundle.serializer.BundleSerializer$$anonfun$read$2.apply(BundleSerializer.scala:55)
ksql-server_1      |    at ml.combust.bundle.serializer.BundleSerializer$$anonfun$read$2.apply(BundleSerializer.scala:49)
ksql-server_1      |    at scala.util.Success.flatMap(Try.scala:231)
ksql-server_1      |    at ml.combust.bundle.serializer.BundleSerializer.read(BundleSerializer.scala:49)
ksql-server_1      |    at ml.combust.bundle.BundleFile.load(BundleFile.scala:123)
ksql-server_1      |    at ml.combust.mleap.runtime.MleapSupport$MleapBundleFileOps.loadMleapBundle(MleapSupport.scala:25)
ksql-server_1      |    at ksql.irisudf.Iris$$anonfun$2.apply(IrisClassifier.scala:27)
ksql-server_1      |    at ksql.irisudf.Iris$$anonfun$2.apply(IrisClassifier.scala:26)
ksql-server_1      |    at resource.AbstractManagedResource$$anonfun$5.apply(AbstractManagedResource.scala:88)
ksql-server_1      |    at scala.util.control.Exception$Catch$$anonfun$either$1.apply(Exception.scala:125)
ksql-server_1      |    at scala.util.control.Exception$Catch$$anonfun$either$1.apply(Exception.scala:125)
ksql-server_1      |    at scala.util.control.Exception$Catch.apply(Exception.scala:103)
ksql-server_1      |    at scala.util.control.Exception$Catch.either(Exception.scala:125)
ksql-server_1      |    at resource.AbstractManagedResource.acquireFor(AbstractManagedResource.scala:88)
ksql-server_1      |    at resource.ManagedResourceOperations$class.apply(ManagedResourceOperations.scala:26)
ksql-server_1      |    at resource.AbstractManagedResource.apply(AbstractManagedResource.scala:50)
ksql-server_1      |    at resource.DeferredExtractableManagedResource$$anonfun$tried$1.apply(AbstractManagedResource.scala:33)
ksql-server_1      |    at scala.util.Try$.apply(Try.scala:192)
ksql-server_1      |    at resource.DeferredExtractableManagedResource.tried(AbstractManagedResource.scala:33)
ksql-server_1      |    at ksql.irisudf.Iris.<init>(IrisClassifier.scala:29)
ksql-server_1      |    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
ksql-server_1      |    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
ksql-server_1      |    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
ksql-server_1      |    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
ksql-server_1      |    at java.lang.Class.newInstance(Class.java:442)
ksql-server_1      |    at io.confluent.ksql.function.FunctionLoaderUtils.instantiateFunctionInstance(FunctionLoaderUtils.java:106)
ksql-server_1      |    ... 27 more
ksql-server_1      | Caused by: scala.ScalaReflectionException: object scala.Predef not found.
ksql-server_1      |    at scala.reflect.internal.Mirrors$RootsBase.staticModule(Mirrors.scala:162)
ksql-server_1      |    at scala.reflect.internal.Mirrors$RootsBase.staticModule(Mirrors.scala:22)
ksql-server_1      |    at ml.combust.mleap.runtime.transformer.feature.ReverseStringIndexer$$typecreator2$1.apply(ReverseStringIndexer.scala:16)
ksql-server_1      |    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe$lzycompute(TypeTags.scala:232)
ksql-server_1      |    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe(TypeTags.scala:232)
ksql-server_1      |    at ml.combust.mleap.core.reflection.MleapReflection$class.mirrorType(MleapReflection.scala:75)
ksql-server_1      |    at ml.combust.mleap.core.reflection.MleapReflection$class.typeSpec(MleapReflection.scala:18)
ksql-server_1      |    at ml.combust.mleap.core.reflection.MleapReflection$.typeSpec(MleapReflection.scala:126)
ksql-server_1      |    at ml.combust.mleap.runtime.function.UserDefinedFunction$.function1(UserDefinedFunction.scala:42)
ksql-server_1      |    at ml.combust.mleap.runtime.transformer.feature.ReverseStringIndexer.<init>(ReverseStringIndexer.scala:16)
ksql-server_1      |    ... 92 more

Does anyone have an idea what might be going wrong here?
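
One thing that may be worth ruling out is a classloader mismatch: Scala runtime reflection resolves scala.Predef through whichever classloader its mirror was built from, and a host such as KSQL may load extension jars through an isolated loader. The sketch below is only a diagnostic under that assumption (nothing in this thread confirms the cause). ClassLoaderCheck, canLoad and report are hypothetical names, and the code uses only the JDK's Class.forName.

```scala
object ClassLoaderCheck {
  // JVM name of the class backing the `scala.Predef` object.
  val PredefClass = "scala.Predef$"

  // True if `loader` can resolve `className` (without initializing it).
  // A null loader means the bootstrap classloader.
  def canLoad(loader: ClassLoader, className: String): Boolean =
    try {
      Class.forName(className, false, loader)
      true
    } catch {
      case _: ClassNotFoundException => false
    }

  // Describes the two loaders that matter when a host instantiates a plugin:
  // the thread context loader and the loader that loaded this class.
  def report(): String = {
    val context = Thread.currentThread().getContextClassLoader
    val own = getClass.getClassLoader
    val contextSees = canLoad(context, PredefClass)
    val ownSees = canLoad(own, PredefClass)
    s"context=$context seesPredef=$contextSees; own=$own seesPredef=$ownSees"
  }
}
```

Logging ClassLoaderCheck.report() from the UDF constructor, before the bundle is loaded, would show whether the two loaders differ and whether each of them can see the Scala library.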

sllynn commented 4 years ago

Hey Jan, I took a look at your example and compared it to our reference here.

Where you read the bundle, what happens if you replace the jar: prefix with file:? Given that your model resources are a folder rather than a .zip, I'd expect you to need the file: prefix.
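
To see which prefix applies in a given environment, it can help to print the scheme of the URL that the classloader actually returns for the model resource. Resources read from a build directory typically resolve to file: URLs, while the same resources packaged inside a jar resolve to jar: URLs. This is a minimal sketch using only the JDK; ResourceScheme and its methods are hypothetical names, and the resource name to pass in depends on the project layout.

```scala
import java.net.URL

object ResourceScheme {
  // Looks up a classpath resource by name (no leading slash for
  // ClassLoader.getResource). Returns None if the loader cannot find it,
  // because getResource returns null in that case.
  def locate(name: String, loader: ClassLoader): Option[URL] =
    Option(loader.getResource(name))

  // The URL scheme of the resource, e.g. "file" for a resource on disk
  // or "jar" for one packaged inside a jar.
  def scheme(name: String, loader: ClassLoader): Option[String] =
    locate(name, loader).map(_.getProtocol)
}
```

Wrapping the lookup in Option also makes a missing resource visible as None instead of a null that fails later.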

teichmaj commented 4 years ago

Hi Stuart, thanks for having a look at this. I build the project with exportJars := true in build.sbt, so the resource folder ends up in the jar. The unit test works fine and the bundle file is read successfully with the current code. Using file: throws a NullPointerException.

ancasarb commented 4 years ago

@teichmaj taking a look, will keep you posted.

teichmaj commented 4 years ago

I mistakenly thought I had fixed it by switching to Scala 2.12.10 and MLeap 0.14. The same error still occurs when KSQL tries to initialize the UDF.
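
If the cause does turn out to be the thread context classloader (an assumption, not something established in this thread), one commonly used pattern is to temporarily replace it with the loader that loaded the UDF class while the bundle is read. The helper below is a hypothetical sketch built only on the JDK's Thread.setContextClassLoader; withContextClassLoader is not an MLeap or KSQL API.

```scala
object ClassLoaderSupport {
  // Runs `body` with `loader` installed as the thread context classloader,
  // restoring the previous loader afterwards even if `body` throws.
  def withContextClassLoader[A](loader: ClassLoader)(body: => A): A = {
    val thread = Thread.currentThread()
    val previous = thread.getContextClassLoader
    thread.setContextClassLoader(loader)
    try body
    finally thread.setContextClassLoader(previous)
  }
}
```

In the UDF, the bundle-loading expression would be wrapped as ClassLoaderSupport.withContextClassLoader(getClass.getClassLoader) { ... }, so that any reflection performed during loading sees the classes packaged in the uber jar.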