google-deepmind / dnc

A TensorFlow implementation of the Differentiable Neural Computer.
Apache License 2.0

Simple comparisons with other rnn core models #8

Closed AjayTalati closed 7 years ago

AjayTalati commented 7 years ago

Hi Jack @dm-jrae,

On the front page it says:

More generally, the DNC class found within dnc.py can be used as a standard TensorFlow rnn core and unrolled with TensorFlow rnn ops, such as tf.nn.dynamic_rnn on any sequential task.

It'd be fun if you could demonstrate this for something very simple, like the basic TF RNN tutorial. I guess the quickest way to start using the DNC is by dropping it into simple, familiar applications.

Thanks, Aj

dm-jrae commented 7 years ago

Hi Aj, there are no current plans to add a suite of demos. However, as you say, the DNC core adheres to the TF RNN interface, so you should be able to pick up any task's training script and get going, or pick up other RNNs and compare them to the DNC. Contributions are accepted if you want to add a comparison script for a task of interest.
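For concreteness, here's a minimal sketch of that drop-in usage, modelled loosely on this repo's train.py (the hyperparameter values are illustrative only):

```python
import tensorflow as tf
import dnc  # dnc.py from this repository

# Illustrative configs; tune these for your task.
access_config = {"memory_size": 16, "word_size": 16,
                 "num_reads": 4, "num_writes": 1}
controller_config = {"hidden_size": 64}

# The DNC is a standard RNN core, so it plugs into tf.nn.dynamic_rnn.
dnc_core = dnc.DNC(access_config, controller_config,
                   output_size=10, clip_value=20)
initial_state = dnc_core.initial_state(batch_size=32)

# Inputs are [time, batch, features], as for any time-major rnn core.
inputs = tf.placeholder(tf.float32, [None, 32, 10])
outputs, final_state = tf.nn.dynamic_rnn(
    cell=dnc_core,
    inputs=inputs,
    time_major=True,
    initial_state=initial_state)
```

From there, the loss, optimizer, and training loop are the same as for any other RNN.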

AjayTalati commented 7 years ago

Hi Jack @dm-jrae,

I think I'm starting to get a feel for how to use the DNC; I've done a simpler, more basic implementation :).

Now I'd like to try to reproduce the parts of the paper that learnt graph representations; in particular:

In the paper, we showed that a DNC can learn on its own to write down a description of an arbitrary graph

Can you recommend a simple dataset to start with, please?

It doesn't have to be the London Underground or a family tree; anything small, open-source, and in Python that you guys know the DNC can learn would be a really massive help :+1:

If there's nothing out of the can in Python, I could try generating sample networks in R and then porting them over to numpy. Sorry for the odd question; I'd be happy to contribute this task if I can get it to work.

Thanks a lot,

Ajay

AjayTalati commented 7 years ago

Hi @jingweiz,

I wonder if you're interested in reproducing the graph representation tasks (and the subsequent querying) from the paper?

AjayTalati commented 7 years ago

Oh dear, I made a bit of a boob :-1:

On page 9 of the paper, in the section Graph Task Descriptions -> Random Graph Generation, it tells you how to generate the planar graphs.

I think the same method is used in a Google Brain paper I read on combinatorial optimization, so I guess you DM/Google guys use this as one of your standard task generators?
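If I'm reading that section right, the recipe is roughly: sample points uniformly in the unit square, then connect each node to its k nearest neighbours. A quick numpy sketch of that idea (my paraphrase of the paper, not the authors' code):

```python
import numpy as np

def random_graph(num_nodes=20, k=3, seed=0):
    """Sample 2-D points in the unit square and link each node to
    its k nearest neighbours (paraphrasing the paper's Random
    Graph Generation section)."""
    rng = np.random.RandomState(seed)
    points = rng.uniform(size=(num_nodes, 2))
    # Pairwise Euclidean distances, with self-distances masked out.
    dists = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    np.fill_diagonal(dists, np.inf)
    edges = set()
    for i in range(num_nodes):
        for j in np.argsort(dists[i])[:k]:
            edges.add((i, int(j)))
    return points, sorted(edges)

points, edges = random_graph()
print(len(edges), "directed edges, e.g.", edges[:5])
```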

jingweiz commented 7 years ago

@AjayTalati Oh hey, sorry I just saw your message! I am indeed currently implementing the NTM and DNC in pytorch: I'm done with the NTM, am finishing up the DNC, and currently only have the copy and repeat_copy tasks. I'll make the code public very soon. I'm very excited about the external-memory idea, would definitely like to have more tasks, and would be very happy to cooperate :) So you said you also have an implementation already, right? Which framework do you use?
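For anyone following along, the copy task is just "echo back a random bit sequence after a delimiter". A toy numpy generator in that spirit (a rough sketch of the data format, not released code):

```python
import numpy as np

def copy_task_batch(batch_size=16, seq_len=8, bits=6, seed=0):
    """Copy task: the input is a random bit sequence followed by a
    delimiter flag; the target is the same sequence, to be emitted
    after the delimiter."""
    rng = np.random.RandomState(seed)
    seq = rng.randint(2, size=(batch_size, seq_len, bits)).astype(np.float32)
    total_len = 2 * seq_len + 1
    inputs = np.zeros((batch_size, total_len, bits + 1), dtype=np.float32)
    targets = np.zeros((batch_size, total_len, bits), dtype=np.float32)
    inputs[:, :seq_len, :bits] = seq   # present the sequence
    inputs[:, seq_len, bits] = 1.0     # delimiter channel fires once
    targets[:, seq_len + 1:, :] = seq  # reproduce it afterwards
    return inputs, targets
```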

AjayTalati commented 7 years ago

Hi @jingweiz

Thanks for the offer of cooperation, that's really cool :)

My implementation in pytorch is a simplified version of

https://github.com/ypxie/pytorch-NeuCom

So far I'm really just getting used to how it works. It doesn't seem to scale too well to large inputs, but I guess I need to implement sparse reads and writes? I think external memory is very interesting too; the capacity seems very promising, and I hope it will learn faster than an LSTM.
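To make the scaling worry concrete: a dense content-based read touches every memory slot each step, so the per-step cost grows linearly with the number of slots N. A minimal pytorch sketch of such a read (my own illustration, not code from either implementation):

```python
import torch
import torch.nn.functional as F

def content_read(memory, key, beta):
    """Dense content-based read: cosine similarity against every
    slot, so cost is O(N * W) per step for an N x W memory."""
    # memory: [batch, N, W]; key: [batch, W]; beta: [batch, 1]
    sim = F.cosine_similarity(memory, key.unsqueeze(1), dim=-1)  # [batch, N]
    weights = F.softmax(beta * sim, dim=-1)                      # [batch, N]
    return torch.bmm(weights.unsqueeze(1), memory).squeeze(1)    # [batch, W]
```

Sparse variants keep only the top-k weights per step, which is presumably what a large memory would need.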

To be honest, I'm only working on very simple applications at the moment, like basic time-series sequence prediction, but if I get promising results I'll be happy to move on to the more complicated tasks and can offer you assistance :)

Thanks a lot for the reply :+1:

Cheers,

Ajay

jingweiz commented 7 years ago

Hey @AjayTalati, their Figure 4 shows that sparsity doesn't seem to make much of a difference performance-wise. As for dealing with large inputs I'm not sure, but maybe it's worth a try! As for the LSTM, I think the NTM paper already pretty much shows that external memory performs better and learns faster. And thanks for the reply! Good luck and have fun with all the implementations :D

AjayTalati commented 7 years ago

Thanks @jingweiz,

I'm running some reasonably large time-series and language-model experiments; I'll update you when I get some conclusive results.

Looking forward to your implementation. I think the DNC shows a lot of promise for RL :+1: Best of luck :1st_place_medal: