databricks / spark-deep-learning

Deep Learning Pipelines for Apache Spark
https://databricks.github.io/spark-deep-learning
Apache License 2.0

Issue in saving Keras model into DBFS folder #220

Open nareshr8 opened 4 years ago

nareshr8 commented 4 years ago

Hi team, I am using Azure Databricks, building pipelines with Spark and models with Keras and TensorFlow. Recently I had to update my cluster from Databricks Runtime 5.4 to 6.2, and saving the model has failed ever since with the error message "Operation not supported".

I reported the same to the h5py team here. @danzafar was kind enough to respond, suggesting I try saving to /tmp directly instead of a DBFS location. That worked.

He also suggested that it should work if I use a tf.keras model, but that is what I am already using. In fact, if I use h5py directly and try to save some data to a DBFS location, it still fails.
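For reference, a minimal repro sketch of the direct h5py failure (the paths are illustrative; /dbfs is the FUSE mount of DBFS on the driver, which on DBR 6.x did not support the random-access writes HDF5 needs):

import h5py
import numpy as np

try:
    # Writing straight to the DBFS FUSE mount fails on DBR 6.x:
    with h5py.File("/dbfs/tmp/test.h5", "w") as f:
        f.create_dataset("x", data=np.arange(10))
except OSError as e:
    print(f"DBFS write failed: {e}")  # "Operation not supported"

# The same write succeeds on the driver's local disk:
with h5py.File("/tmp/test.h5", "w") as f:
    f.create_dataset("x", data=np.arange(10))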

Can someone help us out?

danzafar commented 4 years ago

@nareshr8 - I apologize, I thought this worked with tf.keras, but it looks like I was mistaken. If you do this using MLflow, which was developed at Databricks, it should work just fine. Thanks!
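For example, a minimal sketch of the MLflow route (using the mlflow.keras flavor as it was named in MLflow 1.x; in MLflow 2.x the same calls live under mlflow.tensorflow):

import mlflow
import mlflow.keras  # Keras model "flavor"

# Assuming `model` is a trained tf.keras model:
with mlflow.start_run() as run:
    # MLflow serializes the model to local disk first and then uploads
    # the finished files as run artifacts, so it avoids writing HDF5
    # directly to the DBFS FUSE mount.
    mlflow.keras.log_model(model, artifact_path="model")

# Reload it later from the run's artifact store:
model2 = mlflow.keras.load_model(f"runs:/{run.info.run_id}/model")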

nareshr8 commented 4 years ago

@danzafar Thanks for letting me know.

Juggernaut1997 commented 2 years ago

Hey, do we have anything on this? It looks like it's still not resolved; I am also facing the same issue.

Athena75 commented 2 years ago

A little hack that I found here: https://stackoverflow.com/questions/67017306/unable-to-save-keras-model-in-databricks

Save locally in /tmp:

model.save('/tmp/model.h5')

Then copy the model to DBFS:

dbutils.fs.cp("file:/tmp/model.h5", "dbfs:/tmp/model.h5")
display(dbutils.fs.ls("/tmp/model.h5"))

Copy the file back from DBFS and load it:

dbutils.fs.cp("dbfs:/tmp/model.h5", "file:/tmp/model.h5")
from tensorflow import keras
model2 = keras.models.load_model("/tmp/model.h5")
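The round trip can be wrapped in a small helper (a sketch of the same approach; dbutils is only available inside a Databricks notebook, and the function names here are illustrative, not a library API):

from tensorflow import keras

def save_model_to_dbfs(model, dbfs_path, local_path="/tmp/_staged_model.h5"):
    # Stage the file on local disk, which supports the random writes
    # h5py needs, then copy the finished file to DBFS in one pass.
    model.save(local_path)
    dbutils.fs.cp(f"file:{local_path}", dbfs_path)

def load_model_from_dbfs(dbfs_path, local_path="/tmp/_staged_model.h5"):
    # Reverse direction: pull the file down to local disk, then load it.
    dbutils.fs.cp(dbfs_path, f"file:{local_path}")
    return keras.models.load_model(local_path)

# Usage:
# save_model_to_dbfs(model, "dbfs:/tmp/model.h5")
# model2 = load_model_from_dbfs("dbfs:/tmp/model.h5")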