tristandeleu / ntm-one-shot

One-shot Learning with Memory-Augmented Neural Networks

Why does ww_t have nb_reads components? #4

Closed · markpwoodward closed this issue 7 years ago

markpwoodward commented 7 years ago

Thank you @tristandeleu for this library; it has helped me better understand the paper. I am implementing a TensorFlow version of the ntm-lrua model, and I have a question about your implementation.

Why do W_add, b_add, a_t, sigma_t, and ww_t all have nb_reads components? The paper seems to have only one "write head", as far as I can gather from the text and Figure 7.

The paper does explicitly say that wlu_tm1 contains nb_reads ones, which would mean the single a_t is written identically to nb_reads locations. That doesn't seem to make sense.

Any thoughts would be greatly appreciated. Thank you
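For concreteness, here is a rough NumPy sketch of how I currently read the paper's addressing equations with a single write head. The sizes and variable names are just placeholders of mine, not your code:

```python
import numpy as np

nb_reads, memory_size, word_size = 4, 128, 40
gamma = 0.95  # usage decay from the paper

M_tm1 = np.random.randn(memory_size, word_size)                    # memory M_{t-1}
wr_tm1 = np.random.dirichlet(np.ones(memory_size), size=nb_reads)  # read weights w^r_{t-1}, one row per read head
wu_tm1 = np.random.rand(memory_size)                               # usage weights w^u_{t-1}

# Least-used weights: 1 at the nb_reads smallest usage weights, 0 elsewhere,
# so wlu_tm1 contains nb_reads ones, as mentioned above.
wlu_tm1 = np.zeros(memory_size)
wlu_tm1[np.argsort(wu_tm1)[:nb_reads]] = 1.0

# Write weights: w^w_t = sigma(alpha_t) * w^r_{t-1} + (1 - sigma(alpha_t)) * w^{lu}_{t-1}.
# With a single write head, sigma(alpha_t) is a scalar, but w^r_{t-1} has shape
# (nb_reads, memory_size), so the result keeps an nb_reads axis unless the read
# weights are somehow reduced over heads first -- which is exactly my question.
sigma_t = 1.0 / (1.0 + np.exp(-0.3))                 # scalar gate sigma(alpha_t), value arbitrary
ww_t = sigma_t * wr_tm1 + (1.0 - sigma_t) * wlu_tm1  # shape (nb_reads, memory_size)

# Write: M_t(i) = M_{t-1}(i) + w^w_t(i) * a_t; here the heads are summed so that
# the single add vector a_t (the key k_t in the paper) is written once.
a_t = np.random.randn(word_size)
M_t = M_tm1 + np.outer(ww_t.sum(axis=0), a_t)

# Usage update: w^u_t = gamma * w^u_{t-1} + sum_i w^r_t(i) + sum_i w^w_t(i)
# (reusing wr_tm1 as a stand-in for the current read weights, for brevity).
wu_t = gamma * wu_tm1 + wr_tm1.sum(axis=0) + ww_t.sum(axis=0)
```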

tristandeleu commented 7 years ago

Hey Mark, I'm glad the code helped you! Indeed, I may have taken some liberties with the paper:

Hope it makes more sense!

markpwoodward commented 7 years ago

Hi Tristan, thanks again. That all makes sense, and your choices seem like the best fit to the paper to me.
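For anyone who finds this thread later, here is an illustrative NumPy sketch of the per-head write I was asking about above, i.e. W_add, b_add, a_t, sigma_t, and ww_t each carrying an nb_reads axis. The names and sizes are my own placeholders, not the Theano code in this repo:

```python
import numpy as np

nb_reads, memory_size, word_size, hidden_size = 4, 128, 40, 200

h_t = np.random.randn(hidden_size)                           # controller output at step t
W_add = 0.01 * np.random.randn(nb_reads, hidden_size, word_size)
b_add = np.zeros((nb_reads, word_size))
W_sigma = 0.01 * np.random.randn(nb_reads, hidden_size)
b_sigma = np.zeros(nb_reads)

wr_tm1 = np.random.dirichlet(np.ones(memory_size), size=nb_reads)  # previous read weights, one row per head
wu_tm1 = np.random.rand(memory_size)
wlu_tm1 = np.zeros(memory_size)
wlu_tm1[np.argsort(wu_tm1)[:nb_reads]] = 1.0                 # shared least-used weights

# One add vector and one interpolation gate per head.
a_t = np.einsum('h,rhw->rw', h_t, W_add) + b_add             # (nb_reads, word_size)
sigma_t = 1.0 / (1.0 + np.exp(-(W_sigma @ h_t + b_sigma)))   # (nb_reads,)

# Each head interpolates its own previous read weights with the shared
# least-used weights, which is what gives ww_t its nb_reads axis.
ww_t = sigma_t[:, None] * wr_tm1 + (1.0 - sigma_t[:, None]) * wlu_tm1

# Each head writes its own add vector; the contributions sum into memory.
M_tm1 = np.random.randn(memory_size, word_size)
M_t = M_tm1 + ww_t.T @ a_t                                   # (memory_size, word_size)
```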

snitchjinx commented 7 years ago

Hi @tristandeleu. I'm trying to understand the one-shot learning work by Google and found your project to be a nice learning resource, but I'm a little confused by the usage of omniglot.py and test_model.py. It seems the controller of the MANN first has to be trained the hard way; does the training in omniglot.py correspond to this pre-training stage? I also don't understand how test_batch_size() and test_shape() relate to the description in the paper. Would you please give some instructions on that, or add more comments in the code? Thanks in advance!

DavidZhang88 commented 7 years ago

Hi @markpwoodward, were you able to run this program? I ran into two errors (described in my issue); could you offer me some help? Thank you so much. 😭