Closed ruimaximo closed 5 years ago
Hey @ruimaximo, we've removed MLeap's own one-hot encoder, so you can use `import org.apache.spark.ml.feature.OneHotEncoderEstimator` instead.
To use the PySpark integration, you'll also need to attach the mleap PyPI dependency to your cluster. Then you can do something like this:
import mleap.pyspark
from mleap.pyspark.spark_support import SimpleSparkSerializer
from pyspark.ml.feature import OneHotEncoderEstimator
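For anyone following along, here is a minimal sketch of how those imports might be used end to end. It assumes Spark 2.3 or 2.4 (where `OneHotEncoderEstimator` exists) and a cluster with both the MLeap Spark library and the mleap PyPI package attached. The helper names `bundle_uri` and `fit_and_bundle`, the column suffixes, and the absolute-path check are illustrative assumptions, not part of MLeap.

```python
def bundle_uri(local_zip_path):
    """Build the jar:file: URI form used in the MLeap examples for zip bundles."""
    # Assumption for this sketch: only absolute local paths are accepted.
    if not local_zip_path.startswith("/"):
        raise ValueError("expected an absolute path, got %r" % local_zip_path)
    return "jar:file:" + local_zip_path


def fit_and_bundle(df, category_col, local_zip_path):
    """Fit StringIndexer -> OneHotEncoderEstimator on df and write an MLeap bundle."""
    # Imported lazily so bundle_uri stays usable on machines without Spark.
    import mleap.pyspark  # noqa: F401
    # Importing spark_support is what makes serializeToBundle available
    # on fitted Spark transformers.
    from mleap.pyspark.spark_support import SimpleSparkSerializer  # noqa: F401
    from pyspark.ml import Pipeline
    from pyspark.ml.feature import OneHotEncoderEstimator, StringIndexer

    # Hypothetical output column names derived from the input column.
    indexed_col = category_col + "_idx"
    encoded_col = category_col + "_vec"

    indexer = StringIndexer(inputCol=category_col, outputCol=indexed_col)
    encoder = OneHotEncoderEstimator(inputCols=[indexed_col],
                                     outputCols=[encoded_col])

    model = Pipeline(stages=[indexer, encoder]).fit(df)
    # The transformed DataFrame is passed so MLeap can read the output schema.
    model.serializeToBundle(bundle_uri(local_zip_path), model.transform(df))
    return model
```

On a cluster you would call something like `fit_and_bundle(df, "color", "/tmp/ohe_pipeline.zip")`, where `df` is a DataFrame with a string column named `color` (both names are placeholders).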
Please let me know if you have any further questions, or, if not, whether it's okay to close this issue.
Py4JJavaError Traceback (most recent call last)
Hey, we don't have support for DeepImageFeaturizer. I could help guide you through what's required to add support for it, if you'd like.
Yes please, I would like your help. Thanks!
Closing this, will open a new issue for support of DeepImageFeaturizer.
Hi,
I want to bundle a PySpark ML pipeline with MLeap. I was able to do it fine until I added pyspark.ml.feature.OneHotEncoderEstimator to my pipeline.
When I use a cluster based on Python 3 and Databricks runtime 4.3 (Scala 2.11, Spark 2.3.1), I run into issue #220.
As suggested in #220, I tried to import and use the MLeap OneHotEncoder. However, I cannot import anything from `org.apache.spark.ml.mleap.feature`:
`%scala import org.apache.spark.ml.mleap.feature`
`notebook:1: error: object feature is not a member of package org.apache.spark.ml.mleap import org.apache.spark.ml.mleap.feature`
I got the same error with Databricks 3.5 LTS (Scala 2.11, Spark 2.2.1).
Extra Info:
I am using Databricks Community edition. I have installed MLeap-Spark:
I am new to Spark and to GitHub. I might be missing something really obvious. Please go easy on me :)