databricks / spark-deep-learning

Deep Learning Pipelines for Apache Spark
https://databricks.github.io/spark-deep-learning
Apache License 2.0
1.99k stars · 494 forks

KerasImageFileEstimator API cannot work with dataset as explained in keras_image_file_estimator.py #107

Open · demetsude opened this issue 6 years ago

demetsude commented 6 years ago

Hi, I am using the sparkdl module from Databricks and am trying to run an application using `KerasImageFileEstimator`. Following the example in keras_image_file_estimator.py, I create the dataset like this:

```python
stringIndexer = StringIndexer(inputCol="imageLabel", outputCol="categoryIndex")
indexed_dataset = stringIndexer.fit(original_dataset).transform(original_dataset)
encoder = OneHotEncoder(inputCol="categoryIndex", outputCol="categoryVec")
image_dataset = encoder.transform(indexed_dataset)
```

When I run `transformers = estimator.fit(image_dataset)`, it fails with:

```
_keras_label = row[label_col].array
AttributeError: 'SparseVector' object has no attribute 'array'
```

As far as I understand, the problem is that `OneHotEncoder` returns a `SparseVector` (`categoryVec`), and `SparseVector`, which is what `row[label_col]` is here, has no attribute called `array`. The error is raised from the `_getNumpyFeaturesAndLabels` function in keras_image_file_estimator.py.

I could not find a solution to this, so any help would be appreciated.

yogeshg commented 6 years ago

Thanks for raising this issue! Based on the context and error message, we think that https://github.com/databricks/spark-deep-learning/pull/125 should fix it. If you need an immediate workaround, you might have to ensure that the column referenced by the `labelCol` param of your estimator (`categoryVec`?) is a `DenseVector`, by applying a custom UDF to it.