**Open** · demetsude opened this issue 6 years ago
Thanks for raising this issue! Based on the context and error message, we think that https://github.com/databricks/spark-deep-learning/pull/125 should fix it. If you need an immediate workaround, you may have to ensure that the column referenced by the `labelCol` Param of your estimator (`"categoryVec"`?) is a `DenseVector`, by applying a custom UDF to it.
Hi, I am using the sparkdl module from Databricks and trying to run an application with `KerasImageFileEstimator`. Following the example in `keras_image_file_estimator.py`, I build the dataset like this:

```python
stringIndexer = StringIndexer(inputCol="imageLabel", outputCol="categoryIndex")
indexed_dataset = stringIndexer.fit(original_dataset).transform(original_dataset)
encoder = OneHotEncoder(inputCol="categoryIndex", outputCol="categoryVec")
image_dataset = encoder.transform(indexed_dataset)
```

When I run `transformers = estimator.fit(image_dataset)`, I get this error:

```
_keras_label = row[label_col].array
AttributeError: 'SparseVector' object has no attribute 'array'
```

As far as I understand, the problem is that `OneHotEncoder` returns a `SparseVector` (`categoryVec`), and a `SparseVector` (which is what `row[label_col]` is here) does not have an attribute called `array`. The error is raised from the `_getNumpyFeaturesAndLabels` function in `keras_image_file_estimator.py`.
I could not find a solution to this, so any help would be much appreciated.