deepgram / kur

Descriptive Deep Learning

Are there desirable strengths Kur lacks compared to Keras and PyTorch? #75

Closed EmbraceLife closed 7 years ago

EmbraceLife commented 7 years ago

Kur wraps Keras and PyTorch. Are there any strengths or advantages of Keras or PyTorch that Kur cannot get by wrapping them?

ajsyp commented 7 years ago

In theory, we can write code for Kur that does anything Keras/PyTorch does. Generally, wrapping requires subscribing to one of two philosophies: least common denominator (only support the subset of features that all backends support) or non-compatible options (expose features in one backend that don't necessarily make sense for the others, such as the odd receptive fields of PyTorch convolutions). I prefer the non-compatible approach for Kur, so that we can take advantage of unique, backend-specific features where available. If we took the least-common-denominator approach instead, then no: we wouldn't be able to do everything that Keras/PyTorch does, because, by definition, we would only be doing things that they both support.
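To make the two philosophies concrete, here is a minimal sketch of the non-compatible approach; the `build_conv` function and the `spec` dictionary are invented for illustration and are not Kur's actual internals:

```python
# Hypothetical sketch of the "non-compatible options" philosophy: a single
# declarative spec builds a layer on whichever backend is active, and a
# backend-specific option is passed through only where it actually exists.

def build_conv(spec, backend):
    """Build a convolution layer from a declarative spec (illustrative only)."""
    if backend == 'pytorch':
        import torch.nn as nn
        # PyTorch exposes padding_mode ('zeros', 'reflect', 'replicate',
        # 'circular'); the wrapper can surface it rather than hide it.
        return nn.Conv2d(
            in_channels=spec['in_channels'],
            out_channels=spec['filters'],
            kernel_size=spec['kernel_size'],
            padding=spec.get('padding', 0),
            padding_mode=spec.get('padding_mode', 'zeros'),
        )
    if backend == 'keras':
        from tensorflow import keras
        # Keras Conv2D has no equivalent option, so fail loudly instead of
        # silently ignoring the request.
        if 'padding_mode' in spec:
            raise ValueError('padding_mode is only available on the pytorch backend')
        return keras.layers.Conv2D(spec['filters'], spec['kernel_size'])
    raise ValueError('unknown backend: {}'.format(backend))
```

A least-common-denominator wrapper would instead reject `padding_mode` on every backend, since not all of them support it.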

That's all fine in theory, but in practice, we won't do everything that the backends support. We are time-, talent-, and community-limited in what we can support. If PyTorch offers feature X, then somebody needs to expose that feature via Kur (if it hasn't already been exposed somehow). If nobody has the time or interest in doing so, then it won't make it into Kur.

In that vein, there are low-level operations that are likely to be prohibitively difficult to expose via a high-level API. For example, let's say that researcher R has been playing with a weird tensor operation that does convolutions over a Fibonacci sequence of pixels (i.e., pixels 1, 1, 2, 3, 5, 8, ...). Why would they do that? I have no idea. But implementing it was probably really hard for the researcher. How are we going to get that into Kur? Again, somebody might have the energy and interest to do it, and that's great. But what if they wrote it in TensorFlow? Well, then it probably isn't in the Keras backend, either (so not even the Keras community would be able to use that operation). Even if they wrote Keras-compatible code in pure TensorFlow, it is probably too much work for most people to care about supporting.
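To see why this gets hard, here is a rough sketch of what that hypothetical Fibonacci convolution might look like written directly against PyTorch; everything here (the name, the indexing scheme) is invented for illustration:

```python
# Illustrative only: the hypothetical "Fibonacci convolution" from above,
# written directly against PyTorch rather than through any wrapper.

import torch
import torch.nn.functional as F

def fibonacci_conv1d(x, weight):
    """x: (batch, channels, width); weight: (out_channels, in_channels, k)."""
    # Collect the Fibonacci positions 1, 1, 2, 3, 5, 8, ... that fit in x.
    fib = [1, 1]
    while fib[-1] + fib[-2] <= x.size(-1):
        fib.append(fib[-1] + fib[-2])
    idx = torch.tensor(fib, device=x.device) - 1   # 0-based pixel indices
    sampled = x.index_select(-1, idx)              # gather the Fibonacci pixels
    return F.conv1d(sampled, weight)               # ordinary conv on the gather

x = torch.randn(2, 3, 64)
w = torch.randn(4, 3, 3)
out = fibonacci_conv1d(x, w)   # shape: (2, 4, 8)
```

The operation itself is only a few lines; the hard part is inventing a sensible, declarative, backend-agnostic way to describe it (plus a Keras/TensorFlow counterpart) so that it could live behind a high-level API at all.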

So Kur is a community-defined compromise between "which desirable and useful features do we want to be able to use" and "which incredibly low-level operations do we want to spend time working on". In theory, we could implement anything. In practice, contributors will spend time supporting the features they want or need for their high-level approach to deep learning.

(Aside: this compromise is everywhere in the world, and not just in software, either.)