Some questions about the paper and code details

youngornever commented 5 years ago

Thanks for your excellent work and code release. I have several questions:

a) I think it is not reasonable to use the test data to build the vocabulary, though this will not have an impact on the accuracy.

b) I wonder why you did not report the OOV results in the paper, given that they are listed in the README.md.

c) About the hyperparameters for the memory network (see Table 14 on page 15 of your paper): I cannot find where "margin" and "negative candidates" are used in the code. I also notice that the same two hyperparameters appear in Table 9 on page 14 of the paper "Learning End-To-End Goal-Oriented Dialog". I think they are only used in the supervised embedding setting. Did I miss something?
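To make point a) concrete, here is a minimal sketch of building the vocabulary from the training split only and mapping unseen test tokens to an unknown-word index. It is not taken from the repository: the names `build_vocab`, `encode`, `PAD`, and `UNK`, the whitespace tokenization, and the example sentences are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical special tokens; the repository may handle padding/OOV differently.
PAD, UNK = "<pad>", "<unk>"


def build_vocab(train_sentences, min_count=1):
    """Build a token -> index map from training sentences only."""
    counts = Counter(
        tok for sentence in train_sentences for tok in sentence.lower().split()
    )
    vocab = {PAD: 0, UNK: 1}
    # Sort for a deterministic index assignment.
    for tok, count in sorted(counts.items()):
        if count >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def encode(sentence, vocab):
    """Map tokens to indices, sending unseen tokens to the UNK index."""
    return [vocab.get(tok, vocab[UNK]) for tok in sentence.lower().split()]


train = ["i would like italian food", "book a table for two"]
vocab = build_vocab(train)

# "korean" never appears in the training split, so it is encoded as UNK.
ids = encode("i would like korean food", vocab)
```

With this setup, test-time words that were never seen in training share a single index instead of receiving their own embedding rows.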

youngornever commented 5 years ago

d) I notice that the paper https://arxiv.org/pdf/1605.07683.pdf gives the update $q_{k+1} = q_k + R \sum_i p_i m_i$, while you implemented it as $q_{k+1} = H q_k + \sum_i p_i m_i$ (H and R are the same thing with different names), which is the layer-wise weight tying from the paper https://arxiv.org/pdf/1503.08895.pdf. Why did you prefer the latter?
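The difference between the two update rules in d) can be sketched in a few lines of NumPy. This is an illustration of the formulas as written in the question, not the repository's TensorFlow code. It assumes a single memory matrix `m` (one row per memory) used for both addressing and output, and attention weights `p = softmax(m q)`. The function names are hypothetical.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def hop_output(q, m):
    """Attention-weighted memory sum: sum_i p_i m_i, with p = softmax(m q)."""
    p = softmax(m @ q)
    return p @ m


def update_goal_oriented(q, m, R):
    """q_{k+1} = q_k + R * sum_i p_i m_i (matrix applied to the memory output)."""
    return q + R @ hop_output(q, m)


def update_layerwise(q, m, H):
    """q_{k+1} = H q_k + sum_i p_i m_i (matrix applied to the previous query)."""
    return H @ q + hop_output(q, m)


rng = np.random.default_rng(0)
q = rng.normal(size=4)        # current query state, dimension d = 4
m = rng.normal(size=(3, 4))   # three memory vectors of dimension d
W = 2.0 * np.eye(4)           # stand-in for R or H

o = hop_output(q, m)
q_goal = update_goal_oriented(q, m, W)   # equals q + 2 * o
q_layer = update_layerwise(q, m, W)      # equals 2 * q + o
```

The two rules coincide when the matrix is the identity. Otherwise the first transforms the retrieved memory and the second transforms the carried-over query.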

chaitjo commented 5 years ago

Hi @youngornever and thanks for your interest! Apologies for the late response. As this work was done a couple of years ago, I'll try to answer your questions to the best of my memory.

a) In retrospect, I agree...

b) I honestly do not remember why. Maybe it was because we wanted to focus our analysis on other aspects of the model/problem, such as personalization.

c) Yes, I don't think these two hyperparameters are used in the Memory Network model. I think we provided them in the table because we were confused about this too, and decided to follow the style in Table 9 from Learning End-to-end Goal-Oriented Dialog.

d) We were using an approved TF re-implementation of FAIR's original Lua code for Memory Networks. At that time, we did not focus much on tuning details like this to make the model work as well as it could. Our plan was to use the Memory Network model as a way to explore the question of personalization, because it is easy to visualize how the model reasons.

chaitjo / personalized-dialog
