
Update on Roadmap #426

Closed fchollet closed 8 years ago

fchollet commented 9 years ago

Just an update on where Keras is going in the short term.

Here are the goals for the next release:

Let me know if I'm forgetting anything important.

Regarding visualization tools: these will be developed as a standalone app separate from Keras, in order to be usable with other deep learning frameworks as well. The project is still essentially at the ideation phase. Hualos is a first concrete step.

phreeza commented 9 years ago

Probably more long-term, but at one point you mentioned that you see keras going in a direction where it is theano-agnostic. Do you still see it that way? If so, I think some planning in that direction could be helpful, since it would probably require some kind of middleware, which gets harder to add as the codebase grows.

fchollet commented 9 years ago

Probably more long-term, but at one point you mentioned that you see keras going in a direction where it is theano-agnostic. Do you still see it that way?

Absolutely. This is the only way to ensure the long-term success of the project. Theano will not be forever. The core value that Keras brings (fast prototyping for research purposes) is not Theano-dependent, and so it should be ported to the best platform that will emerge in the future.

As for when we will start working on a switch to an abstract computation backend: as soon as a Theano successor comes along. It might be as soon as this September. We'll see.

it would probably require some kind of middleware, which gets harder to add as the codebase grows.

Yes. As much as possible, the new abstract backend would closely follow the Theano API, so although the conversion would take some effort, it wouldn't be a difficult project. We'll definitely start planning for it in the next couple of months.

tleeuwenburg commented 9 years ago

Having access to a non-Theano backend would significantly speed up test execution. Having the unit tests in place will also greatly help with verifying the new backend. The tests could run against the new backend by default, with a slow-running test occasionally verifying that all is well on Theano.
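A minimal sketch of that test matrix, using pytest parametrization — not actual Keras code; the two "backends" here are numpy stand-ins, and the `slow` marker lets the default run (`pytest -m "not slow"`) skip the Theano pass:

```python
import numpy as np
import pytest

# Stand-ins: in reality these would call the new backend and Theano.
def new_backend_relu(x):
    return np.maximum(x, 0.0)

def theano_relu(x):  # placeholder for the slower Theano path
    return np.maximum(x, 0.0)

BACKENDS = [
    pytest.param(new_backend_relu, id="new-backend"),
    pytest.param(theano_relu, id="theano", marks=pytest.mark.slow),
]

@pytest.mark.parametrize("relu", BACKENDS)
def test_relu_matches_reference(relu):
    # Same fixture data on every backend; results must agree.
    x = np.array([-1.0, 0.0, 2.5])
    np.testing.assert_allclose(relu(x), [0.0, 0.0, 2.5])
```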


futurely commented 9 years ago

What does the "Theano successor" mean exactly?

NervanaSystems's backend was the fastest according to this benchmark.

fchollet commented 9 years ago

What does the "Theano successor" mean exactly?

  • faster runtime
  • no compilation or faster compilation
  • supports autodifferentiation

NervanaSystems's backend was the fastest according to this benchmark.

For convolutions, yes. For now they don't support autodiff, though.

harpone commented 9 years ago

@fchollet:

As for when we will start working on a switch to an abstract computation backend: as soon as a Theano successor comes along. It might be as soon as this September. We'll see.

So what exactly is the successor? (I'm just struggling with theano.scan and was just sort of wondering...)

pranv commented 9 years ago

@harpone, scroll up

harpone commented 9 years ago

scrolling, scrolling, scrolling...

pranv commented 9 years ago

What does the "Theano successor" mean exactly?

  • faster runtime
  • no compilation or faster compilation
  • supports autodifferentiation

harpone commented 9 years ago

@pranv yes, I can read. That doesn't answer my question, but nevermind...

elanmart commented 9 years ago

http://joschu.github.io/ ?

pranv commented 9 years ago

@elanmart, looks great! No proper GPU support yet, though, and the code base is very young. But it's definitely a strong candidate for an alternative backend.

fchollet commented 9 years ago

@elanmart It looks very nice; we will definitely be keeping tabs on it. Real GPU support is apparently coming soon.

Once GPU support is here and once the library has been a little bit battle-tested, we'll consider whether to port Keras to CGT.

futurely commented 9 years ago

Nervana is still the fastest. https://github.com/soumith/convnet-benchmarks/issues/56 https://github.com/soumith/convnet-benchmarks

scott-gray commented 9 years ago

We now have working autodiff in the new neon/nervanagpu code base. It's not quite as slick as CGT yet, but we're taking a close look at that and see no reason why the next version can't be just as functional.

pranv commented 9 years ago

Many options for a Keras backend. The future is looking good!

Great for the Python ecosystem, too.

fchollet commented 9 years ago

@scott-gray Awesome, that's exciting! Do you have some documentation and code examples for the autodiff feature? How would you compare the Neon computation backend to Theano in terms of feature coverage?

I know of 3 potential options for a post-Theano Keras backend, and it looks like we will probably pick one within the next 2-3 months. It's great to see that things are moving fast in the Python-tensor space : )

scott-gray commented 9 years ago

I'd hold out another week or so for the fully refactored neon release. The documentation should be fairly complete.

I think I'm covering most everything that theano can do in my backend. The new conv kernels use a dimshuffle on the filter to speed things up. I still need to generalize that code for full transpose(axes=()) support, but that won't be too hard.
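(For readers unfamiliar with the term: a dimshuffle is just an axis permutation, like numpy's transpose. The layouts below are purely illustrative, not Neon's actual ones.)

```python
import numpy as np

# Filters in a common layout: (K output maps, C channels, R rows, S cols)
filters = np.zeros((64, 3, 5, 5), dtype=np.float32)

# "Dimshuffle" the filter: permute axes, e.g. moving the K axis last
shuffled = filters.transpose(1, 2, 3, 0)
print(shuffled.shape)  # (3, 5, 5, 64)
```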

I also want to fully support tensors of any dimensionality in reduction and broadcast operations (elementwise already works), but that might take me a bit of time to get to. Right now those operations only work on 1- or 2-dimensional tensors.

At the moment, I'm just fleshing out all my conv kernel tiling sizes so performance will be good for a wide range of params. I'm trying to make it as hard as possible for the FFT guys to overtake me... but I suppose it might be inevitable. Then again, smaller parallel layer configs (a la GoogLeNet) seem to be the fashion now. It's hard to see FFT catching up there: there's not enough dimensionality to hide your latencies, and the filters are all small.

In another overlooked area, my gemm kernels are killing cublas on RNN networks. The new neon will have lots of examples of those with benchmarks.

We're also playing with nvrtc to see if we can't speed up pycuda compilation times some, though that overhead is already pretty minimal. You generally have a few seconds to wait the very first time you run something, but after that everything runs instantly. Autodiff works at runtime with no overhead (after being cached).

pranv commented 9 years ago

@fchollet, what do you think about having a separate backend module in keras, from which relevant stuff can be imported elsewhere? That way, multiple backends can be supported.

elanmart commented 9 years ago

@fchollet Just out of curiosity, what are those 3 potential backends (cgt, nervana, ?)?

fchollet commented 9 years ago

@pranv that's exactly how it will be done (Theano will still be an option for backwards compatibility, so what we are aiming at is the ability to plug different backends). And the backend module in Keras will have an API that basically follows Theano's (maybe a tad syntactically friendlier).
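A rough sketch of how such a pluggable backend module might look (my illustration, not the actual design; the `KERAS_BACKEND` environment variable is hypothetical, and the `cgt` calls assume CGT mirrors numpy-style `dot`/`tanh` names):

```python
import os

_BACKEND = os.environ.get("KERAS_BACKEND", "theano")

# Pick the tensor library once, at import time.
if _BACKEND == "theano":
    import theano.tensor as _t
elif _BACKEND == "cgt":
    import cgt as _t
else:
    raise ValueError("Unknown backend: %s" % _BACKEND)

# Thin, Theano-flavored wrappers; layer code would call these
# instead of importing theano.tensor directly.
def dot(a, b):
    return _t.dot(a, b)

def tanh(x):
    return _t.tanh(x)
```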

pranv commented 9 years ago

@fchollet, I was trying to get some of Keras to work with CGT. In some places it's really simple: replacing T with cgt suffices.

However, I found some things still missing, for example clip and round. Those might require more effort. @joschu
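Until such ops land upstream, missing primitives can sometimes be expressed in terms of existing ones. A hypothetical, untested shim, assuming CGT exposes numpy-style `minimum`, `maximum`, and `floor`:

```python
import cgt

def clip(x, lo, hi):
    # clip(x, lo, hi) == minimum(maximum(x, lo), hi)
    return cgt.minimum(cgt.maximum(x, lo), hi)

def round_half_up(x):
    # Round to nearest via floor(x + 0.5); note this differs from
    # numpy's round-half-to-even at exact .5 boundaries.
    return cgt.floor(x + 0.5)
```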

Regardless, the future of Keras with such flexibility looks really exciting. It can really become the default DL library, both for the masses and for organizations.

NickShahML commented 8 years ago

Regardless, the future of Keras with such flexibility looks really exciting. It can really become the default DL library, both for the masses and for organizations.

Couldn't agree more

The people I've talked to on reddit and elsewhere frequently recommend neon, as it's faster than Theano for RNNs, including LSTMs. I think they are coming out with an update to support many more features, which could handle sequence-to-sequence learning. https://redd.it/3q2w7g

scott-gray commented 8 years ago

It's pretty high on my list of priorities to finish an embeddings lookup table kernel for bprop. This allows you to sum the gradient over unique words. I should be able to avoid atomics and do it deterministically. Hope to have that sometime this week. I believe that's the only major thing we're missing for RNNs.
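A small numpy sketch of the general idea (my illustration of deterministic segment-summing, not Nervana's kernel): sort the lookups by word id so duplicates are contiguous, then sum each segment in a fixed order instead of relying on atomic adds, whose ordering varies run to run.

```python
import numpy as np

def embedding_grad(word_ids, upstream_grad, vocab_size):
    """word_ids: (n,) int indices; upstream_grad: (n, dim) per-lookup grads."""
    dim = upstream_grad.shape[1]
    grad_table = np.zeros((vocab_size, dim), dtype=upstream_grad.dtype)
    # Stable sort brings duplicate word ids together.
    order = np.argsort(word_ids, kind="stable")
    sorted_ids = word_ids[order]
    sorted_grads = upstream_grad[order]
    # Segment boundaries: first occurrence of each unique id.
    uniq, starts = np.unique(sorted_ids, return_index=True)
    for word, seg in zip(uniq, np.split(sorted_grads, starts[1:])):
        # Fixed summation order -> bit-for-bit reproducible result.
        grad_table[word] = seg.sum(axis=0)
    return grad_table
```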