tensorflow/java

Java bindings for TensorFlow
Apache License 2.0

Possibility to deploy tensorflow-java to AppEngine #400

Open kirillgroshkov opened 2 years ago

kirillgroshkov commented 2 years ago

I was trying to run ML Inference with tensorflow-java deployed to AppEngine, but quickly failed with this error:

```
ERROR: (gcloud.app.deploy) Cannot upload file
[/root/repo/target/appengine-staging/WEB-INF/lib/tensorflow-core-api-0.3.3-linux-x86_64.jar],
which has size [62305961] (greater than maximum allowed size of [33554432]).
```

This means that AppEngine forbids deploying any single file bigger than 32 MB, and the jar for linux-x86_64 is currently ~62 MB. The limit is confirmed here: https://cloud.google.com/appengine/quotas#Code

Just checking if someone has any advice/workaround in mind?

For example, is it possible to do a custom build of tensorflow-java for Linux that fits under 32 MB (I only need to run inference of a SavedModelBundle, nothing else)? Or to split the jar into multiple sub-32 MB files?

I'm not looking for AppEngine advice (as this repo is not the right place for it), but for something that can be done with this library itself. 🙏
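For context, my inference code is just the standard SavedModelBundle flow, roughly like this (a minimal sketch; the model path and tensor names are placeholders that depend on the model's signature):

```java
import org.tensorflow.SavedModelBundle;
import org.tensorflow.Tensor;
import org.tensorflow.ndarray.StdArrays;
import org.tensorflow.types.TFloat32;

public class Predict {
    public static void main(String[] args) {
        // Load the SavedModel exported for serving ("serve" is the usual tag).
        try (SavedModelBundle bundle = SavedModelBundle.load("/path/to/model", "serve")) {
            // Build an input tensor and run the graph; the fed/fetched names
            // below are placeholders taken from a typical Keras export.
            try (TFloat32 input = TFloat32.tensorOf(
                         StdArrays.ndCopyOf(new float[][] {{1f, 2f, 3f}}));
                 Tensor output = bundle.session().runner()
                         .feed("serving_default_input", input)
                         .fetch("StatefulPartitionedCall")
                         .run().get(0)) {
                System.out.println(output.shape());
            }
        }
    }
}
```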

rnett commented 2 years ago

Try making a fat jar and running it through ProGuard (for the shrinker). I don't know what rules you would need; it will probably be trial and error. If/once you get a working set, we should document it here, and there might be a way to distribute the rules in the jars.
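A hypothetical starting point for the configuration could look like this (untested; the keep patterns are guesses, since the native bindings are reached via JNI and reflection that ProGuard cannot trace):

```
-injars  app-fat.jar
-outjars app-shrunk.jar
-dontoptimize
-dontobfuscate
# Keep the TensorFlow and JavaCPP classes wholesale; they are loaded
# reflectively and via JNI, so only the rest of the classpath gets shrunk.
-keep class org.tensorflow.** { *; }
-keep class org.bytedeco.** { *; }
-keepattributes *Annotation*
```

Note that ProGuard only shrinks bytecode, though; the native library bundled inside the jar would be untouched.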

saudet commented 2 years ago

If your models can be exported to TF Lite, you might want to consider using that: https://github.com/bytedeco/javacpp-presets/tree/master/tensorflow-lite. That's less than 3 MB per platform.
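The preset mirrors the C++ TF Lite API, so inference would look roughly like this (an untested sketch adapted from the minimal example in that repo; the model path is a placeholder):

```java
import org.bytedeco.javacpp.*;
import org.bytedeco.tensorflowlite.*;
import static org.bytedeco.tensorflowlite.global.tensorflowlite.*;

public class TfLitePredict {
    public static void main(String[] args) {
        // Load the converted .tflite model (path is a placeholder).
        FlatBufferModel model = FlatBufferModel.BuildFromFile("model.tflite");

        // Build the interpreter, mirroring the C++ InterpreterBuilder flow.
        BuiltinOpResolver resolver = new BuiltinOpResolver();
        InterpreterBuilder builder = new InterpreterBuilder(model, resolver);
        Interpreter interpreter = new Interpreter((Pointer) null);
        builder.apply(interpreter);

        // Allocate tensor buffers, fill inputs, and run inference.
        if (interpreter.AllocateTensors() != kTfLiteOk) throw new RuntimeException("AllocateTensors() failed");
        // ... fill the input tensors here ...
        if (interpreter.Invoke() != kTfLiteOk) throw new RuntimeException("Invoke() failed");
    }
}
```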

kirillgroshkov commented 2 years ago

> If your models can be exported to TF Lite, you might want to consider using that: https://github.com/bytedeco/javacpp-presets/tree/master/tensorflow-lite. That's less than 3 MB per platform.

Oh that's a cool idea! I'll explore that! 👍

karllessard commented 2 years ago

Our uncompressed TensorFlow library alone is 375 MB, which is slightly bigger than the one used for Python (312 MB); I think the difference is because we also export the C++ client API in ours. Still, I have no clue how we can distribute a library that big in a smaller jar...

karllessard commented 2 years ago

> I only need to run inference of a SavedModelBundle, nothing else

Even if you only do inference, you need all the TensorFlow ops and kernels to be present in your library, in case your graph uses any of them. I haven't checked, but I suspect that most of the bytes of this library are taken up by the ops (there are over 1K of them).

rnett commented 2 years ago

Another option to consider is XLA AOT. I'm not entirely sure how you would use that from Java, but you should be able to export the compiled function and work from there; see the sketch below.
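For the export step, something like saved_model_cli's AOT compilation should work (a sketch; the paths, tag, and class name are placeholders):

```
saved_model_cli aot_compile_cpu \
  --dir /path/to/saved_model \
  --tag_set serve \
  --signature_def_key serving_default \
  --output_prefix my_model \
  --cpp_class MyModel
```

That emits a header and object file that you could then wrap with JNI (or JavaCPP) and call from Java.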