isaacmg opened this issue 5 years ago
Hello, I was wondering if you had an example of how to serve an ONNX model trained in Python from Scala. Basically, I'm looking to deploy a number of models trained in PyTorch and exported to ONNX as quickly as possible for use in a Flink streaming application. I wanted to see if you had a simple example of taking a pretrained ONNX model and performing predictions, without all the additional training code. Also, is it possible to simply load the model with Lantern for inference without having to redefine the entire architecture in Scala? Thank you.
Absolutely. I am glad that this repo has caught more attention from the community! I just added something to facilitate your request, but there are still some complications. Let me lay those out first.
First of all, Lantern is not directly usable from Scala. That may sound strange, but we use Scala to generate low-level code (C++ or CUDA). You can use the generated low-level code directly (compiling and linking it), but we do not yet provide a way to call the generated code from Scala. For now, you have to use it as C++/CUDA code.
Secondly, our support for ONNX ops is limited. So far we have only supported SqueezeNet and ResNet50 for our paper; more engineering effort is needed for other ops. However, we are interested in adding more support, so if you feel like it, you can share the model with us and we will work on getting that model running in Lantern.
For a running example, please refer to https://github.com/feiwang3311/Lantern/blob/master/src/main/scala/lantern/GenerateLibraryAPP/GenerateONNX.scala#L7, which shows how to generate library functions from an ONNX model.
The way to use it is:

```
git clone https://github.com/feiwang3311/Lantern.git
cd Lantern
sbt
run $dirToONNXModel $dirToGeneratedLib $filenameOfGeneratedLib $nameOfGeneratedFunction
```

then pick the correct Main function to run (which should be 1).
It should create a directory (dirToGeneratedLib) containing two files: filenameOfGeneratedLib.cpp and filenameOfGeneratedLib.h. You can then write other C++ code to compile and link against the generated function. If you need to generate CUDA code that runs on a GPU, I can add that support too (it is not yet in the repo).
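For illustration, a minimal sketch of the consuming C++ code might look like the following. Everything here is a placeholder: the file name `model`, the function name `inference`, and the `float*`-based signature are assumptions for this example; the actual signature depends on the inputs and outputs of the exported model.

```cpp
// main.cpp -- hypothetical driver for the generated library.
// Assumes the generator was run with filenameOfGeneratedLib = "model"
// and nameOfGeneratedFunction = "inference"; the real function
// signature depends on the exported ONNX model.
#include <cstdio>
#include <vector>
#include "model.h"  // generated header declaring `inference`

int main() {
  // Placeholder shapes: one 3x224x224 image in, 1000 class scores out.
  std::vector<float> input(3 * 224 * 224, 0.0f);
  std::vector<float> output(1000, 0.0f);

  inference(input.data(), output.data());  // call the generated function

  std::printf("first score: %f\n", output[0]);
  return 0;
}
```

You would then compile and link this together with the generated source, e.g. `g++ -O2 model.cpp main.cpp -o run_model`.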
Let me know what you think about this response, and if you have problems running the code. We are also interested in setting priorities for our limited engineering effort, so we would like to hear more feedback and potentially (if you think you can still use Lantern after seeing these complications) work together on your specific case.
Thanks for your follow-up. Unfortunately, I don't think this will work. My problem is that I need to serve ONNX models originally exported from PyTorch inside a Flink pipeline (a large distributed stream-processing engine written in Java/Scala). I'm currently using Java Embedded Python (JEP), but that is causing problems with shared libraries. I'm really looking for a way to load the exported ONNX model natively into Java/Scala or another JVM language and run it without resorting to external REST APIs or embedding Python in Java as JEP does. I've found it hard to find a good solution to this problem, which I find odd given the popularity of big-data frameworks in Java/Scala and the number of production applications still written in Java.
Sorry about that. I totally agree with you on that issue; the ML community is overly centered on the Python environment. Your situation poses an interesting challenge for us to see if we can actually run Lantern from Scala. I will let you know if I dig up more about it.
Not sure if you have found a solution to your problem yet, but I just learned that TensorFlow has a Scala frontend (https://github.com/eaplatanios/tensorflow_scala). Could this help you?
Thanks for the suggestion, but I'm primarily looking at serving PyTorch models, not TensorFlow ones. That is why I'm exporting to ONNX.