markpwoodward closed this issue 7 years ago
Hey Mark, I'm glad the code helped you! Indeed I may have taken some liberties with the paper:

- Each `wr_tm1` has a corresponding `ww_t`. Similarly, I considered that `wlu_tm1` was not a single vector with `nb_reads` 1's but `nb_reads` one-hot vectors, to ensure that each `ww_t` remains a proper distribution (sums to 1).
- `sigma_t` has `nb_reads` elements as well, one for each write head.
- I kept `k_t` and `a_t` separate to match the NTM paper, which distinguishes the key used to query the memory from what is added to the memory. Again, these have `nb_reads` elements as well, one for each read/write head.

Hope it makes more sense!
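Here's a minimal NumPy sketch of what I mean, with hypothetical names and shapes (this is just the idea, not the code from the repo; `usage` stands in for the decayed usage weights w^u from the paper):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

nb_reads, memory_size = 4, 128

# Previous read weightings: one distribution per read head (rows sum to 1).
wr_tm1 = np.random.dirichlet(np.ones(memory_size), size=nb_reads)

# Least-used weightings: nb_reads one-hot vectors (one least-used slot
# per head) rather than a single vector containing nb_reads 1's.
usage = np.random.rand(memory_size)
wlu_tm1 = np.zeros((nb_reads, memory_size))
wlu_tm1[np.arange(nb_reads), np.argsort(usage)[:nb_reads]] = 1.0

# One interpolation gate per write head.
alpha = np.random.randn(nb_reads)
sigma_t = sigmoid(alpha)[:, None]

# Per-head convex combination: every row of ww_t is still a proper
# distribution (sums to 1).
ww_t = sigma_t * wr_tm1 + (1.0 - sigma_t) * wlu_tm1
assert np.allclose(ww_t.sum(axis=1), 1.0)
```

The per-head one-hot `wlu_tm1` is what keeps each row of `ww_t` a valid distribution; with a single vector containing `nb_reads` 1's, the combination would sum to more than 1.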
Hi Tristan, Thanks again. That all makes sense. Your choices seem like the best fit to the paper to me.
Hi @tristandeleu. I'm trying to understand Google's work on one-shot learning and found your project to be nice learning material. But I'm a little confused by the usage of omniglot.py and test_model.py. It seems the controller of the MANN first has to be trained the hard way. Does the training in omniglot.py correspond to this pre-training stage? I also don't understand how test_batch_size() and test_shape() relate to the description in the paper. Would you please give some instructions on that, or add more comments to the code? Thanks in advance!
Hi @markpwoodward, were you able to run this program? I ran into two errors (see my issue); could you offer me some help? Thank you so much!
Thank you @tristandeleu for this library; it has helped me better understand the paper. I am implementing a TensorFlow version of the NTM-LRUA model, and I have a question about your implementation.
Why do `W_add`, `b_add`, `a_t`, `sigma_t`, and `ww_t` all have `nb_reads` elements? The paper seems to have only one "write head", as I gathered from the text and Figure 7.

The paper does explicitly talk about `wlu_tm1` containing `nb_reads` 1's, which would mean we write the single `a_t` identically to `nb_reads` locations. That doesn't seem to make sense.

Any thoughts would be greatly appreciated. Thank you!
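To make the point concrete, here is a minimal NumPy sketch of that literal reading (names and the `usage` proxy are hypothetical; summing the read weightings into a single `wr_tm1` is just my guess at how a single write head would use them):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

nb_reads, memory_size, word_size = 4, 128, 40

# Combine the nb_reads read weightings into one vector for the single
# write head -- summing them is a guess, which is part of my confusion.
wr_tm1 = np.random.dirichlet(np.ones(memory_size), size=nb_reads).sum(axis=0)

# Literal reading: wlu_tm1 is a single vector with nb_reads 1's at the
# least-used locations ("usage" stands in for the paper's w^u).
usage = np.random.rand(memory_size)
wlu_tm1 = np.zeros(memory_size)
wlu_tm1[np.argsort(usage)[:nb_reads]] = 1.0

# Single gate, single write weighting.
alpha = np.random.randn()
ww_t = sigmoid(alpha) * wr_tm1 + (1.0 - sigmoid(alpha)) * wlu_tm1

# The same a_t gets written (scaled by ww_t) to all nb_reads least-used
# slots -- the behaviour that doesn't seem to make sense to me.
a_t = np.random.randn(word_size)
M_tm1 = np.zeros((memory_size, word_size))
M_t = M_tm1 + np.outer(ww_t, a_t)
```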