breandan / kotlingrad

🧩 Shape-Safe Symbolic Differentiation with Algebraic Data Types
https://breandan.net/public/masters_thesis.pdf#page=49
Apache License 2.0
530 stars, 21 forks

Possible collaboration with Facebook? #27

Open LifeIsStrange opened 2 years ago

LifeIsStrange commented 2 years ago

@breandan Your project seems to be a true state-of-the-art autodiff library! Facebook worked on bringing automatic differentiation to Kotlin last year: https://ai.facebook.com/blog/paving-the-way-for-software-20-with-kotlin/

1) Any news on what happened? They don't seem to have released a library yet.
2) It would be really nice if you joined forces. They might hire you, and your library has a lot of technical merit: from a quick look, Kotlingrad might be the "best" autodiff library out there (although extensive benchmarks against e.g. JAX/XLA are missing).

Such a collaboration could help bring traction to Kotlin for machine learning, especially if Facebook made the revolutionary, disruptive decision to fund GraalPython [0].

[0] Kotlin is interoperable with Python through GraalPython: https://github.com/oracle/graalpython

Finally, you might find this blog post interesting: http://www.stochasticlifestyle.com/engineering-trade-offs-in-automatic-differentiation-from-tensorflow-and-pytorch-to-jax-and-julia/ (although you will probably not learn anything new from it :)

Unrelated, but I wonder which NDArray implementation you use? ND4J is by far the implementation with the most human resources behind it, and it can easily use optimized backends such as OpenBLAS or, better, Intel MKL: https://github.com/eclipse/deeplearning4j/tree/master/nd4j

Finally, a long-term idea for Kotlingrad might be to develop a compiler plugin. To ease this process, Arrow Meta can be used: https://github.com/arrow-kt/arrow-meta

breandan commented 2 years ago

Hi @LifeIsStrange, thank you for your kind words! While this project is currently a research prototype, we recently stabilized support for Kotlin multiplatform, which takes us one step closer to broader usability. We are reluctant to make claims about its superiority; however, Kotlin∇ does innovate in some areas previously overlooked by mainstream AD frameworks. Our primary innovation is bringing compile-time shape safety to differentiable programs, allowing users to track tensor shapes at compile time in vanilla Kotlin, with no custom IDE plugins or compiler engineering required. This is achieved by staged metaprogramming to an embedded DSL in pure Kotlin.
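To make the idea of compile-time shape safety concrete, here is a minimal sketch in plain Kotlin. It is not Kotlin∇'s actual API: the names `Nat`, `D2`, `D3`, and `Vec` are hypothetical and chosen only for illustration. The length of a vector is carried as a phantom type parameter, so mismatched operands are rejected by the type checker.

```kotlin
// Hypothetical type-level tags for vector lengths (illustration only, not Kotlin∇'s types).
sealed interface Nat
object D2 : Nat
object D3 : Nat

// A vector whose length is tracked by the phantom type parameter N.
class Vec<N : Nat> private constructor(val elements: List<Double>) {
    // Addition only type-checks when both operands carry the same length tag.
    operator fun plus(other: Vec<N>): Vec<N> =
        Vec<N>(elements.zip(other.elements) { a, b -> a + b })

    // The dot product likewise requires matching lengths at compile time.
    infix fun dot(other: Vec<N>): Double =
        elements.zip(other.elements) { a, b -> a * b }.sum()

    companion object {
        // The constructor is private, so these factories are the only way
        // to build a Vec and the tag always agrees with the data.
        fun of(a: Double, b: Double): Vec<D2> = Vec<D2>(listOf(a, b))
        fun of(a: Double, b: Double, c: Double): Vec<D3> = Vec<D3>(listOf(a, b, c))
    }
}
```

With these definitions, `Vec.of(1.0, 2.0) + Vec.of(3.0, 4.0)` compiles, while `Vec.of(1.0, 2.0) + Vec.of(1.0, 2.0, 3.0)` is a type error, since `Vec<D3>` is not a `Vec<D2>`.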

As we realized while developing Kotlin∇, it is possible to encode certain type-level operations on the JVM. While Java does admit more general languages [Grigore (2013)], opening the door to type-level programming, this presents a number of challenges as noted by Amin & Tate (2016). The fragment upon which Kotlin's type system is based (nominal subtyping with declaration-site variance) is believed to be decidable following Kennedy and Pierce (2006) and later adapted by Tate (2013). This result does not preclude dependently typed programming, but significantly inhibits its expressiveness. It is known that nominal subtyping with variance can be used to recognize various tree languages [Roth (2021)], a technique we may further leverage to encode bounded type-level arithmetic on array programs.

While feasible in theory, the current implementation of the K2 compiler has practical limitations, some of which have been reported to the Kotlin team (e.g., KT-30040, KT-50466, KT-50533, KT-50553, KT-50617). If these are addressed, we believe type-level programming in Kotlin will be more practical. Regardless, Kotlin∇ is strongly committed to following the Kotlin language specification [Akhin & Belyaev (2021)] to ensure interoperability with other JVM languages. From our discussions with @headinthebox, I believe @facebook is pursuing a different roadmap. While their DSL appears theoretically sound, we are skeptical that forking the Kotlin language is the best approach. Nevertheless, Kotlin∇ welcomes anyone who is interested in collaborating. We feel that native support for machine learning will be a healthy addition to the Kotlin ecosystem and are eager to see where it leads.

In terms of its automatic differentiation features, Kotlin∇ is heavily inspired by prior work, in particular Theano [Bergstra et al. (2010)], Myia [Breuleux et al. (2017)], Elliott (2018), Hasktorch [Huang et al. (2021)], Certigrad [Selsam et al. (2017)] and others. Contrary to prior literature in automatic differentiation (AD), we believe that AD and symbolic differentiation (SD) are equivalent and make no distinction between them. This allows us to take higher-order and higher-rank derivatives on well-defined symbolic expressions. Further discussion about its design and features may be found in Considine et al. (2019) and Considine (2020).
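As a rough illustration of differentiating symbolic expressions, the sketch below defines a small algebraic data type and a derivative function over it. The names `Expr`, `Const`, `Var`, `Sum`, `Prod`, `d`, and `eval` are hypothetical and are not Kotlin∇'s API. Because `d` returns another `Expr`, it can be applied repeatedly to obtain higher-order derivatives.

```kotlin
// Hypothetical expression ADT (illustration only, not Kotlin∇'s types).
sealed class Expr
data class Const(val value: Double) : Expr()
data class Var(val name: String) : Expr()
data class Sum(val left: Expr, val right: Expr) : Expr()
data class Prod(val left: Expr, val right: Expr) : Expr()

// Derivative of e with respect to x. The result is again an Expr,
// so d(d(e, x), x) yields the second derivative.
fun d(e: Expr, x: Var): Expr = when (e) {
    is Const -> Const(0.0)
    is Var -> if (e == x) Const(1.0) else Const(0.0)
    is Sum -> Sum(d(e.left, x), d(e.right, x))
    // Product rule: (fg)' = f'g + fg'
    is Prod -> Sum(Prod(d(e.left, x), e.right), Prod(e.left, d(e.right, x)))
}

// Numerically evaluates an expression given values for its variables.
fun eval(e: Expr, env: Map<String, Double>): Double = when (e) {
    is Const -> e.value
    is Var -> env.getValue(e.name)
    is Sum -> eval(e.left, env) + eval(e.right, env)
    is Prod -> eval(e.left, env) * eval(e.right, env)
}
```

For example, with `x = Var("x")` and `f = Prod(Prod(x, x), x)`, evaluating `d(f, x)` and `d(d(f, x), x)` at `x = 2.0` gives the values of 3x² and 6x. This sketch performs no simplification, so derivative expressions grow quickly.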

From a compiler standpoint, Kotlin∇ views the compilation of array programs as semiring linear algebra [Considine (2020)], and towards this end, we are prototyping a custom IR (cf. #11, #13) for graph compilation. Leveraging hard results in parallel computing and arithmetic circuit complexity, we believe it is feasible to simultaneously realize parallelizability upper bounds [Amdahl (1967)] and circuit lower bounds [Nisan & Wigderson (1997)]. While an ambitious goal, we are convinced this approach to lowering array programs will ultimately outperform ad hoc compiler optimizations. As you suggested, it would be helpful to provide more extensive benchmarks to track our progress with respect to JAX/XLA.
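For readers unfamiliar with semiring linear algebra, the sketch below shows the general idea: one matrix multiplication routine parameterized by a semiring. It does not reflect Kotlin∇'s IR; `Semiring`, `RealField`, `MinPlus`, and `matMul` are hypothetical names. Swapping the ordinary (+, ×) semiring for the tropical (min, +) semiring turns the same routine into a shortest-path step.

```kotlin
// A semiring: an additive and a multiplicative operation with identities.
// Hypothetical interface for illustration only.
interface Semiring<T> {
    val zero: T
    val one: T
    fun plus(a: T, b: T): T
    fun times(a: T, b: T): T
}

// Ordinary arithmetic on Doubles.
object RealField : Semiring<Double> {
    override val zero: Double = 0.0
    override val one: Double = 1.0
    override fun plus(a: Double, b: Double): Double = a + b
    override fun times(a: Double, b: Double): Double = a * b
}

// Tropical (min, +) semiring: "addition" is min, "multiplication" is +.
object MinPlus : Semiring<Double> {
    override val zero: Double = Double.POSITIVE_INFINITY
    override val one: Double = 0.0
    override fun plus(a: Double, b: Double): Double = minOf(a, b)
    override fun times(a: Double, b: Double): Double = a + b
}

// Matrix product over an arbitrary semiring; matrices are lists of rows.
fun <T> matMul(s: Semiring<T>, a: List<List<T>>, b: List<List<T>>): List<List<T>> {
    require(b.isNotEmpty() && a.all { it.size == b.size }) { "inner dimensions must agree" }
    return a.map { row ->
        b[0].indices.map { j ->
            row.indices.fold(s.zero) { acc, k -> s.plus(acc, s.times(row[k], b[k][j])) }
        }
    }
}
```

Under `RealField` this is the usual matrix product. Under `MinPlus`, multiplying a weighted adjacency matrix by itself gives the cheapest paths using at most two edges.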

Kotlin is interoperable with Python through GraalPython

Interesting observation! Are you suggesting we use Python as a backend or frontend? We currently compile on Graal and are open to staging into Python. A simple PoC would be to emit Python code via runtime translation from Kotlin∇ expressions to Python. This is similar to a test we currently have, where we generate and evaluate a random symbolic expression in pure Kotlin. We could take a similar approach to stage into Python. Regarding the front end, Python is starting to incorporate dependent types (cf. python/mypy#3062) and @deepmind recently announced a similar project for tracking tensor shapes at compile time. There is potentially room for Python→Kotlin∇ interop as well, although the use case for this direction is less clear until Kotlin becomes more competitive with the Python ecosystem.
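A proof of concept along these lines might look like the following sketch, which walks an expression tree and emits Python source text. This is an assumption about how such a translation could be structured, not existing Kotlin∇ code; `Term`, `Num`, `Sym`, `Add`, `Mul`, `Pow`, `toPython`, `freeSymbols`, and `toPythonFunction` are all hypothetical names.

```kotlin
// Hypothetical expression tree (illustration only, not Kotlin∇'s types).
sealed class Term
data class Num(val value: Double) : Term()
data class Sym(val name: String) : Term()
data class Add(val left: Term, val right: Term) : Term()
data class Mul(val left: Term, val right: Term) : Term()
data class Pow(val base: Term, val exponent: Int) : Term()

// Emits a fully parenthesized Python expression. Assumes finite constants,
// since NaN and infinities have no Python literal form.
fun toPython(t: Term): String = when (t) {
    // Negative literals are wrapped so that Python's ** precedence cannot rebind the sign.
    is Num -> if (t.value < 0) "(${t.value})" else t.value.toString()
    is Sym -> t.name
    is Add -> "(${toPython(t.left)} + ${toPython(t.right)})"
    is Mul -> "(${toPython(t.left)} * ${toPython(t.right)})"
    is Pow -> "(${toPython(t.base)} ** ${t.exponent})"
}

// Collects the variable names that appear in an expression.
fun freeSymbols(t: Term): Set<String> = when (t) {
    is Num -> emptySet()
    is Sym -> setOf(t.name)
    is Add -> freeSymbols(t.left) + freeSymbols(t.right)
    is Mul -> freeSymbols(t.left) + freeSymbols(t.right)
    is Pow -> freeSymbols(t.base)
}

// Wraps the expression in a Python function whose parameters are the
// free symbols in alphabetical order.
fun toPythonFunction(name: String, t: Term): String {
    val params = freeSymbols(t).sorted().joinToString(", ")
    return "def $name($params):\n    return ${toPython(t)}\n"
}
```

The resulting string could then be handed to a Python interpreter, for instance one embedded via GraalPython. That evaluation step is left out here because it depends on the runtime setup.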

what Ndarray implementation do you use?

Since porting to Kotlin multiplatform in 0.4.7, Kotlin∇ uses its own custom NDArray implementation, but it can easily be switched back to MultiK (pending Kotlin/multik#8) or Viktor (pending JetBrains-Research/viktor#50). We are experimenting with a typeclass-based API which supports ad hoc polymorphism, so in the future it will be possible for users to swap in any NDArray implementation of their choice using just a few lines of code. Previously, we also considered EJML-Kotlin, tensor, and a few other libraries; however, these are JVM-only at this time.
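To give a flavor of what a typeclass-style API could look like, here is a minimal sketch. It is not the API under development; `ArrayOps`, `DoubleArrayOps`, `ListOps`, and `axpy` are hypothetical names. Generic code is written once against an interface of array operations, and a backend is selected by passing a different instance.

```kotlin
// Hypothetical "typeclass" describing the operations a backend must provide
// for its array representation A (illustration only).
interface ArrayOps<A> {
    fun of(vararg values: Double): A
    fun add(a: A, b: A): A
    fun scale(a: A, k: Double): A
    fun toList(a: A): List<Double>
}

// Backend 1: primitive DoubleArray storage.
object DoubleArrayOps : ArrayOps<DoubleArray> {
    override fun of(vararg values: Double): DoubleArray = values.copyOf()
    override fun add(a: DoubleArray, b: DoubleArray): DoubleArray {
        require(a.size == b.size) { "length mismatch" }
        return DoubleArray(a.size) { i -> a[i] + b[i] }
    }
    override fun scale(a: DoubleArray, k: Double): DoubleArray = DoubleArray(a.size) { i -> a[i] * k }
    override fun toList(a: DoubleArray): List<Double> = a.toList()
}

// Backend 2: immutable List<Double> storage.
object ListOps : ArrayOps<List<Double>> {
    override fun of(vararg values: Double): List<Double> = values.toList()
    override fun add(a: List<Double>, b: List<Double>): List<Double> {
        require(a.size == b.size) { "length mismatch" }
        return a.zip(b) { p, q -> p + q }
    }
    override fun scale(a: List<Double>, k: Double): List<Double> = a.map { it * k }
    override fun toList(a: List<Double>): List<Double> = a
}

// Backend-agnostic code: computes k * x + y using whichever instance is supplied.
fun <A> axpy(ops: ArrayOps<A>, k: Double, x: A, y: A): A = ops.add(ops.scale(x, k), y)
```

Switching backends then amounts to replacing `DoubleArrayOps` with `ListOps` (or a wrapper around a third-party library) at the call site, while `axpy` itself stays unchanged.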