KonduitAI / konduit-serving

Enterprise runtime for machine learning models
https://serving.konduit.ai/
Apache License 2.0

Keras model serving: switch to official Keras/TF #219

Open AlexDBlack opened 4 years ago

AlexDBlack commented 4 years ago

Currently we use DL4J for serving Keras models. For the "old" multi-backend Keras, this is pretty robust. However, TensorFlow 2.0 Keras models may include general TF ops (not just Keras layers) that DL4J can't load.

Not being able to reliably support all Keras models is a major problem, and needs to be rectified ASAP.

There are a few different options here.
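
For context, here's a minimal sketch (not the exact KS code) of the DL4J import path in question, using DL4J's `KerasModelImport` API; the `model.h5` filename is a placeholder:

```java
import org.deeplearning4j.nn.graph.ComputationGraph;
import org.deeplearning4j.nn.modelimport.keras.KerasModelImport;
import org.deeplearning4j.nn.modelimport.keras.exceptions.UnsupportedKerasConfigurationException;

public class KerasImportSketch {
    public static void main(String[] args) throws Exception {
        try {
            // DL4J parses the Keras HDF5 file directly; no Python runtime is involved.
            ComputationGraph model =
                    KerasModelImport.importKerasModelAndWeights("model.h5");
            System.out.println(model.summary());
        } catch (UnsupportedKerasConfigurationException e) {
            // TF 2.x models that embed general TF ops (not plain Keras layers)
            // fail at import time -- the failure mode described above.
            System.err.println("Model uses ops DL4J cannot map: " + e.getMessage());
        }
    }
}
```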

saudet commented 4 years ago

Since we're talking about TF 2.x and breaking backward compatibility anyway, another option to consider would be to support only the SavedModel format. This is what they have standardized on for serving purposes, and it works without having to deal with Python or HDF5.
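
To illustrate, a sketch of what SavedModel-only serving could look like from the JVM, assuming TensorFlow's 1.x-style Java bindings (`org.tensorflow`); the model path and the feed/fetch op names are hypothetical and depend on the exported signature:

```java
import org.tensorflow.SavedModelBundle;
import org.tensorflow.Session;
import org.tensorflow.Tensor;

public class SavedModelServingSketch {
    public static void main(String[] args) {
        // "serve" is the standard MetaGraph tag attached to models exported for serving.
        try (SavedModelBundle bundle = SavedModelBundle.load("/models/my_model", "serve")) {
            Session session = bundle.session();
            float[][] batch = {{1f, 2f, 3f, 4f}};
            try (Tensor<?> input = Tensor.create(batch);
                 Tensor<?> output = session.runner()
                         .feed("serving_default_input", input) // hypothetical input op name
                         .fetch("StatefulPartitionedCall")     // hypothetical output op name
                         .run()
                         .get(0)) {
                float[][] result = new float[1][(int) output.shape()[1]];
                output.copyTo(result);
                System.out.println(java.util.Arrays.toString(result[0]));
            }
        }
    }
}
```

Note there is no HDF5 parsing and no embedded Python interpreter at inference time.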

agibsonccc commented 4 years ago

I would be fine with working on a subset of things and just having Keras HDF5 be supported as-is. @farizrahman4u this is your general area.

AlexDBlack commented 4 years ago

Update here: @farizrahman4u is going to look at the option of using our TF graph executor to support ops that DL4J doesn't support. This might be a quick win (for DL4J and KS) until we have SameDiff-based Keras import.

If it's not going to be easy or reliable enough, we'll switch to a Python/TF-based Keras inference engine for Konduit Serving.

SavedModel-only would simplify things for us from an implementation perspective, but I don't think we should drop Keras support in the near term, given how popular the HDF5 format is.

saudet commented 4 years ago

We don't need to "drop" support for Keras. We can still use Keras in the background to export models to SavedModel, and use those at inference time.
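
A sketch of that background-conversion idea, assuming we shell out to a Python environment with TF 2.x installed (`tf.keras.models.load_model` followed by `model.save(dir)` is the standard TF 2.x export path; the `python` executable name and all paths are placeholders):

```java
import java.nio.file.Path;
import java.util.List;

public class KerasToSavedModel {
    /** Converts a Keras HDF5 file to a SavedModel directory by shelling out to Python. */
    public static void convert(Path h5File, Path savedModelDir) throws Exception {
        String script = String.join("\n",
                "import sys",
                "import tensorflow as tf",
                "model = tf.keras.models.load_model(sys.argv[1])",
                "model.save(sys.argv[2])  # non-.h5 target path => SavedModel format");
        Process p = new ProcessBuilder(List.of(
                        "python", "-c", script,
                        h5File.toString(), savedModelDir.toString()))
                .inheritIO()
                .start();
        if (p.waitFor() != 0) {
            throw new IllegalStateException("Keras -> SavedModel conversion failed");
        }
    }
}
```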

farizrahman4u commented 4 years ago

@saudet +1, I am also against dropping any existing functionality.

AlexDBlack commented 4 years ago

Seamless (from the user's perspective) Keras to SavedModel conversion is an option, yes. We're already doing exactly that sort of thing in KS with DL4J Keras import.

But by the time we load the model into TF to convert it, we're 95% of the way to doing inference directly in Python. The first step (Python env + load into TF) is the same whether we convert to SavedModel or run Python-based inference.

saudet commented 4 years ago

Not necessarily. We can do the conversion in another process, outside the JVM, so there are advantages.
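
Tying the two sketches above together: conversion runs once in an external Python process, and the serving JVM only ever touches the resulting SavedModel directory (class and path names are hypothetical):

```java
import java.nio.file.Paths;
import org.tensorflow.SavedModelBundle;

public class ConvertThenServe {
    public static void main(String[] args) throws Exception {
        // Step 1: one-off conversion in a separate Python process (see the sketch above).
        KerasToSavedModel.convert(
                Paths.get("/models/model.h5"),
                Paths.get("/models/model_savedmodel"));

        // Step 2: serve from the JVM with no Python or HDF5 dependency.
        try (SavedModelBundle bundle =
                     SavedModelBundle.load("/models/model_savedmodel", "serve")) {
            System.out.println("SavedModel loaded; MetaGraphDef is "
                    + bundle.metaGraphDef().length + " bytes");
        }
    }
}
```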