combust / mleap

MLeap: Deploy ML Pipelines to Production
https://combust.github.io/mleap-docs/
Apache License 2.0

Object is missing required member 'timestamp' #314

Closed polya20 closed 6 years ago

polya20 commented 6 years ago

I am trying to deploy the model created by the airbnb regression notebook, but it fails with: Object is missing required member 'timestamp'

docker run -p 65327:65327 -v /tmp/models:/models combustml/mleap-serving:0.8.1-SNAPSHOT

[ERROR] [12/18/2017 11:19:23.550] [MleapServing-akka.actor.default-dispatcher-7] [MleapResource] error with request
spray.json.DeserializationException: Object is missing required member 'timestamp'
    at spray.json.package$.deserializationError(package.scala:23)
    at spray.json.ProductFormats$class.fromField(ProductFormats.scala:60)
    at spray.json.DefaultJsonProtocol$.fromField(DefaultJsonProtocol.scala:30)
    at spray.json.ProductFormatsInstances$$anon$5.read(ProductFormatsInstances.scala:134)
    at spray.json.ProductFormatsInstances$$anon$5.read(ProductFormatsInstances.scala:118)
    at spray.json.JsValue.convertTo(JsValue.scala:31)
    at ml.combust.bundle.BundleFile$$anonfun$readInfo$1.apply(BundleFile.scala:59)
    at ml.combust.bundle.BundleFile$$anonfun$readInfo$1.apply(BundleFile.scala:59)
    at scala.util.Try$.apply(Try.scala:192)
    at ml.combust.bundle.BundleFile.readInfo(BundleFile.scala:59)
    at ml.combust.bundle.serializer.BundleSerializer.read(BundleSerializer.scala:49)
    at ml.combust.bundle.BundleFile.load(BundleFile.scala:84)
    at ml.combust.mleap.runtime.MleapSupport$MleapBundleFileOps.loadMleapBundle(MleapSupport.scala:22)
    at ml.combust.mleap.serving.MleapService$$anonfun$loadModel$2$$anonfun$apply$2.apply(MleapService.scala:29)
    at ml.combust.mleap.serving.MleapService$$anonfun$loadModel$2$$anonfun$apply$2.apply(MleapService.scala:28)
    at resource.AbstractManagedResource$$anonfun$5.apply(AbstractManagedResource.scala:88)
    at scala.util.control.Exception$Catch$$anonfun$either$1.apply(Exception.scala:125)
    at scala.util.control.Exception$Catch$$anonfun$either$1.apply(Exception.scala:125)
    at scala.util.control.Exception$Catch.apply(Exception.scala:103)
    at scala.util.control.Exception$Catch.either(Exception.scala:125)
    at resource.AbstractManagedResource.acquireFor(AbstractManagedResource.scala:88)
    at resource.ManagedResourceOperations$class.apply(ManagedResourceOperations.scala:26)
    at resource.AbstractManagedResource.apply(AbstractManagedResource.scala:50)
    at resource.DeferredExtractableManagedResource$$anonfun$tried$1.apply(AbstractManagedResource.scala:33)
    at scala.util.Try$.apply(Try.scala:192)
    at resource.DeferredExtractableManagedResource.tried(AbstractManagedResource.scala:33)
    at ml.combust.mleap.serving.MleapService$$anonfun$loadModel$2.apply(MleapService.scala:30)
    at ml.combust.mleap.serving.MleapService$$anonfun$loadModel$2.apply(MleapService.scala:30)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.util.NoSuchElementException: key not found: timestamp

Apache Toree config:

{
  "language": "scala",
  "display_name": "Apache Toree - Scala",
  "env": {
    "TOREE_SPARK_OPTS": "--packages com.databricks:spark-avro_2.11:3.0.1,ml.combust.mleap:mleap-spark_2.11:0.5.0,com.typesafe.akka:akka-actor_2.11:2.4.16,com.typesafe.akka:akka-stream_2.11:2.4.16",
    "SPARK_HOME": "/usr/local/spark",
    "__TOREE_OPTS__": "",
    "DEFAULT_INTERPRETER": "Scala",
    "PYTHONPATH": "/usr/local/spark/python:/usr/local/spark/python/lib/py4j-0.10.4-src.zip",
    "PYTHON_EXEC": "python"
  },
  "argv": [
    "/usr/local/share/jupyter/kernels/apache_toree_scala/bin/run.sh",
    "--profile",
    "{connection_file}"
  ]
}

The notebook also cannot resolve the import "import ml.combust.model.client.spark.SparkSupport._".

polya20 commented 6 years ago

Spark version: 2.0.0
Scala version: 2.11.2

ancasarb commented 6 years ago

What version of MLeap did you use to serialize the bundle? There were some breaking changes from 0.7.0 to 0.8.0 (or 0.8.1), and looking at the error, it sounds like you've used 0.7.0 for serializing.
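One way to answer that question from the bundle itself is sketched below. It assumes the serialized bundle is a zip archive with a bundle.json file at its root carrying a "version" field, which is what the BundleFile.readInfo frame in the stack trace appears to be parsing. The function names are hypothetical helpers, not part of MLeap.

```python
import json
import zipfile


def read_bundle_info(bundle_path):
    """Return the parsed bundle.json from a bundle zip.

    Assumption: the bundle is a zip archive with bundle.json at its root.
    """
    with zipfile.ZipFile(bundle_path) as bundle:
        with bundle.open("bundle.json") as info_file:
            return json.loads(info_file.read().decode("utf-8"))


def describe_bundle(info):
    """Pick out the fields relevant to the missing 'timestamp' error."""
    return {
        # Version of MLeap recorded at serialization time, if present.
        "version": info.get("version"),
        # The serving image in this thread rejects bundles without this key.
        "has_timestamp": "timestamp" in info,
    }
```

Calling describe_bundle(read_bundle_info(path)) on a bundle under /tmp/models (the exact file name depends on your notebook) should show which version wrote it and whether the "timestamp" member is there.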

polya20 commented 6 years ago

I used ml.combust.mleap:mleap-spark_2.11:0.5.0, going by the Maven repository: https://mvnrepository.com/artifact/ml.combust.mleap/mleap-spark_2.11/0.5.0

(Update: serving the model worked fine with mleap-serving:0.7.0-SNAPSHOT.) What is the right Toree config, and which versions of Spark and Scala should be used? It would be great if the documentation were updated.

ancasarb commented 6 years ago

@polya20 You should generally match the version of MLeap used for serialization with the one used for serving, because breaking changes are sometimes introduced between versions (backwards compatibility is planned for version 1.0.0). At the moment, the latest MLeap version is 0.9.0. It works with both Scala 2.10 and 2.11, and with Spark versions 2.0, 2.1 and 2.2.
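As a rough illustration of that rule, the sketch below compares the version a bundle was serialized with against the serving image tag. Treating "same major.minor" as a match is an assumption made here for illustration, not a compatibility guarantee from MLeap, and the function names are hypothetical.

```python
def release_of(version):
    """Return (major, minor) from a version such as '0.8.1-SNAPSHOT'."""
    # Drop any build qualifier after the first hyphen, e.g. '-SNAPSHOT'.
    core = version.split("-", 1)[0]
    major, minor = core.split(".")[:2]
    return int(major), int(minor)


def same_release(serialized_with, served_with):
    """True when both versions share major.minor (illustrative rule only)."""
    return release_of(serialized_with) == release_of(served_with)
```

With the versions from this thread, same_release("0.5.0", "0.8.1-SNAPSHOT") is False, which flags the combination that failed. The same check would also flag 0.5.0 against 0.7.0-SNAPSHOT even though that pairing was reported to work, so a mismatch is a warning sign rather than proof of incompatibility.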

Hope this helps! Let me know if you need more help getting set up!

hollinwilkins commented 6 years ago

Closing this issue for now. Backwards compatibility is fairly stable for serialization at this point.