feiwang3311 / Lantern


Adding support for TensorFlow Scala #48

Open eaplatanios opened 5 years ago

eaplatanios commented 5 years ago

Thanks for this very interesting project! :)

I am the developer of TensorFlow Scala and I'm wondering how easy it would be to add support for TF Scala in Lantern (as a backend), and whether you'd be interested in that. I've been thinking for a while about better ways to implement auto-differentiation than the current approach, which manually walks the constructed TF graph. I've also been interested in better ways to handle the graph construction vs. execution "modes": by using staging for the graph construction part, the user-facing API could be made more "imperative", similar to TF eager execution, while still benefitting from graph optimization wherever possible. I can provide more details if you're interested; a rough sketch of what I mean is below. :)
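To illustrate the idea (this is my own toy sketch, not TF Scala's or Lantern's actual API): operations on staged values only record graph nodes, so the user-facing code reads imperatively while the recorded graph can still be optimized before execution.

```scala
// Hypothetical sketch: staged values record a graph instead of computing eagerly.
sealed trait Node
case class Const(v: Float)       extends Node
case class Add(a: Node, b: Node) extends Node
case class Mul(a: Node, b: Node) extends Node

class Staged(val node: Node) {
  def +(that: Staged) = new Staged(Add(node, that.node)) // records, doesn't compute
  def *(that: Staged) = new Staged(Mul(node, that.node))
}

// A trivial "session run" over the recorded graph; a real implementation
// would rewrite/optimize the graph first and hand it to the TF runtime.
def run(n: Node): Float = n match {
  case Const(v)  => v
  case Add(a, b) => run(a) + run(b)
  case Mul(a, b) => run(a) * run(b)
}

val x = new Staged(Const(2f))
val y = x * x + x      // reads imperatively, but only builds a graph
println(run(y.node))   // 6.0
```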

Cheers, Anthony

feiwang3311 commented 5 years ago

Hi Anthony,

Thanks for letting me know about TensorFlow Scala. It looks interesting. Just a few days ago I heard that some people are hoping for support for running machine learning models from the JVM.

Based on my brief reading of the README of TensorFlow Scala, it serves as a Python-like front-end (binding) for the TF API calls (which are in C++/CUDA, I believe?). That sounds a little different from Lantern. In fact, Lantern is very similar to what you described having been "thinking for a while" about. Lantern provides a "better" (allow me to say that, with maybe some bias) way to implement AD (via delimited continuations); the core trick is sketched below. It also uses staging (specifically LMS, Lightweight Modular Staging) to generate low-level code (C++/CUDA). Lantern is indeed more imperative (like PyTorch and TF eager), and still benefits from graph optimization. You are welcome to read our NeurIPS paper about it (https://papers.nips.cc/paper/8221-backpropagation-with-callbacks-foundations-for-efficient-and-expressive-differentiable-programming) or a more PL-oriented manuscript we made available here (https://www.cs.purdue.edu/homes/rompf/papers/wang-preprint201811.pdf).
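To give a flavor of the AD approach, here is a minimal hand-written continuation-passing sketch in plain Scala (the real Lantern uses delimited continuations via shift/reset and stages everything with LMS, so this is only illustrative): each operation runs the rest of the computation as a callback, then updates gradients on the way back.

```scala
// NumR carries a value x and a mutable gradient accumulator d.
class NumR(val x: Double, var d: Double) {
  def +(that: NumR)(k: NumR => Unit): Unit = {
    val y = new NumR(this.x + that.x, 0.0)
    k(y)              // run the rest of the computation forward...
    this.d += y.d     // ...then propagate gradients backward
    that.d += y.d
  }
  def *(that: NumR)(k: NumR => Unit): Unit = {
    val y = new NumR(this.x * that.x, 0.0)
    k(y)
    this.d += that.x * y.d
    that.d += this.x * y.d
  }
}

// grad turns a CPS-style function into its derivative at a point.
def grad(f: NumR => (NumR => Unit) => Unit)(x0: Double): Double = {
  val x = new NumR(x0, 0.0)
  f(x) { out => out.d = 1.0 }   // seed the output gradient
  x.d
}

// d/dx (x*x + x) at x = 3.0 is 2*3 + 1 = 7
println(grad(x => k => x.*(x)(y => y.+(x)(k)))(3.0)) // 7.0
```

Everything after each `k(y)` call is effectively the backward pass for that operation, which is how the callbacks in the paper's title implement backpropagation.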

Our next step is more towards graph-level optimization and fast kernel generation. For that, we need a more complex IR layer for optimization and scheduling (like TVM). We could also potentially support TensorFlow as a backend, but we won't use the AD machinery of TF, nor can we directly borrow the graph-level optimizations from TF. So I am not sure how easy (or meaningful) it would be to support TensorFlow Scala in Lantern. Roughly, a backend would only have to supply per-op kernels, as sketched below.
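A hypothetical sketch of what that split could look like (the trait and names are mine, not Lantern's actual interfaces): AD and staging stay on Lantern's side, and a backend, which could wrap a TensorFlow runtime, only executes individual kernels.

```scala
// Hypothetical backend interface: Lantern's generated code would dispatch
// per-op kernels through something like this, keeping AD and graph-level
// rewriting independent of the backend.
trait Backend {
  type Tensor
  def fromArray(data: Array[Float]): Tensor
  def add(a: Tensor, b: Tensor): Tensor
  def mul(a: Tensor, b: Tensor): Tensor
}

// A trivial CPU reference implementation; a TF-based one would instead
// hand these calls to the TensorFlow runtime.
object NaiveCpuBackend extends Backend {
  type Tensor = Array[Float]
  def fromArray(data: Array[Float]): Tensor = data
  def add(a: Tensor, b: Tensor): Tensor = a.zip(b).map { case (u, v) => u + v }
  def mul(a: Tensor, b: Tensor): Tensor = a.zip(b).map { case (u, v) => u * v }
}
```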

If your ideas are different from / orthogonal to Lantern, I am certainly interested in hearing about them :)

Cheers, Fei