Open bmm-2020 opened 3 years ago
I found the guide on how to create a custom transformer and followed those steps, but I still get the same error. I am a bit unclear on the part about registering the custom transformer. As instructed, I created a reference.conf in my local project, but I am not sure whether it is taking effect.
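For what it's worth, registration works through Typesafe Config: every reference.conf on the classpath gets merged at runtime, and your custom bundle Op class has to be appended to MLeap's Spark registry op list. The key and the Op class name below are illustrative only (check the reference.conf shipped inside the mleap-spark jar for your exact version), but the shape is roughly:

```hocon
# src/main/resources/reference.conf
# Merged with MLeap's own reference.conf at runtime via Typesafe Config.
# "com.example.ml.bundle.ops.MyCustomTransformerOp" is a placeholder for
# your own Op class; the registry key may differ between MLeap versions.
ml.combust.mleap.spark.registry.default.ops += "com.example.ml.bundle.ops.MyCustomTransformerOp"
```

If the file is not picked up, a common cause is that it is not on the runtime classpath (e.g. it was not packaged into the jar under src/main/resources).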
Also, I found that when we override inputSchema or outputSchema, only scalar types can be passed into StructType; I am not sure how to handle vector columns. (I have a few custom transformers, and some of them operate on vectors.)
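On the vector question: MLeap's core type system is not limited to scalars — it has a TensorType for vector-shaped columns. As far as I understand, a model schema can declare a vector input along these lines (MyVectorModel and the column names are made-up placeholders for illustration):

```scala
import ml.combust.mleap.core.Model
import ml.combust.mleap.core.types._

// Hypothetical model core that consumes a Spark ML vector column.
// TensorType.Double() declares a double-valued tensor (vector) column,
// while ScalarType.Double declares a plain scalar column.
case class MyVectorModel() extends Model {
  override def inputSchema: StructType =
    StructType(StructField("features", TensorType.Double())).get

  override def outputSchema: StructType =
    StructType(StructField("prediction", ScalarType.Double)).get

  def apply(features: Seq[Double]): Double = features.sum
}
```

StructType(...) returns a Try, hence the .get; the same TensorType-based fields would go into the schema of an MLeap-runtime transformer that operates on vectors.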
Any help/advice greatly appreciated! Thanks.
I downloaded the mleap library and modified it with my newly added custom transformer related code, compiled and packaged it. I now refer to this newly built mleap libraries from my code and I am able to generate the mleap bundle. Now my next road blocks are - 1) support for vector col (as mentioned above) 2) once I deploy this bundled model on aws sagemaker as an endpoint, how do I supply the modified mleap packages along with?
Any advice greatly appreciated, thanks!
Hi Team, I am trying to use MLeap to bundle my Spark ML pipeline, which uses both built-in Spark ML transformers and custom transformers. It throws an exception on bundle serialization. I am not sure whether MLeap doesn't support custom transformers or whether it is some other issue. I am using Spark 3.0.1 and mleap-spark 0.16.0; can you please advise? Really appreciate your help!
```
Exception in thread "main" java.util.NoSuchElementException: key not found: my_custom_transfomer
	at scala.collection.MapLike.default(MapLike.scala:235)
	at scala.collection.MapLike.default$(MapLike.scala:234)
	at scala.collection.AbstractMap.default(Map.scala:65)
	at scala.collection.MapLike.apply(MapLike.scala:144)
	at scala.collection.MapLike.apply$(MapLike.scala:143)
	at scala.collection.AbstractMap.apply(Map.scala:65)
	at ml.combust.bundle.BundleRegistry.opForObj(BundleRegistry.scala:102)
	at ml.combust.bundle.serializer.GraphSerializer.$anonfun$writeNode$1(GraphSerializer.scala:31)
	at scala.util.Try$.apply(Try.scala:213)
	at ml.combust.bundle.serializer.GraphSerializer.writeNode(GraphSerializer.scala:30)
	at ml.combust.bundle.serializer.GraphSerializer.$anonfun$write$2(GraphSerializer.scala:21)
	at scala.collection.IndexedSeqOptimized.foldLeft(IndexedSeqOptimized.scala:60)
	at scala.collection.IndexedSeqOptimized.foldLeft$(IndexedSeqOptimized.scala:68)
	at scala.collection.mutable.WrappedArray.foldLeft(WrappedArray.scala:38)
	at ml.combust.bundle.serializer.GraphSerializer.write(GraphSerializer.scala:21)
	at org.apache.spark.ml.bundle.ops.PipelineOp$$anon$1.store(PipelineOp.scala:21)
	at org.apache.spark.ml.bundle.ops.PipelineOp$$anon$1.store(PipelineOp.scala:14)
	at ml.combust.bundle.serializer.ModelSerializer.$anonfun$write$1(ModelSerializer.scala:87)
	at scala.util.Try$.apply(Try.scala:213)
	at ml.combust.bundle.serializer.ModelSerializer.write(ModelSerializer.scala:83)
	at ml.combust.bundle.serializer.NodeSerializer.$anonfun$write$1(NodeSerializer.scala:85)
	at scala.util.Try$.apply(Try.scala:213)
	at ml.combust.bundle.serializer.NodeSerializer.write(NodeSerializer.scala:81)
	at ml.combust.bundle.serializer.BundleSerializer.$anonfun$write$1(BundleSerializer.scala:34)
	at scala.util.Try$.apply(Try.scala:213)
	at ml.combust.bundle.serializer.BundleSerializer.write(BundleSerializer.scala:29)
	at ml.combust.bundle.BundleWriter.save(BundleWriter.scala:34)
```
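Reading the trace: the failure is in BundleRegistry.opForObj, which looks the transformer up in the registry's op map; "key not found: my_custom_transfomer" means no serialization Op is registered for that transformer, which points back at the reference.conf registration not being picked up rather than at MLeap lacking custom-transformer support. For context, a typical save call that drives this code path (following the MLeap docs; df and pipelineModel are placeholders for your own DataFrame and fitted pipeline) looks roughly like:

```scala
import ml.combust.bundle.BundleFile
import ml.combust.mleap.spark.SparkSupport._
import org.apache.spark.ml.bundle.SparkBundleContext
import resource._

// The bundle context should be built from a DataFrame the fitted
// pipeline has transformed, so output schemas are available to
// the serializer.
val sbc = SparkBundleContext().withDataset(pipelineModel.transform(df))

// Writes the pipeline as an MLeap bundle zip; .get surfaces any
// serialization failure (such as the registry lookup above).
for (bf <- managed(BundleFile("jar:file:/tmp/model.zip"))) {
  pipelineModel.writeBundle.save(bf)(sbc).get
}
```

If this call fails with the NoSuchElementException above, it is worth verifying that the jar containing your custom Op and its reference.conf is actually on the driver classpath when save is invoked.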