OK, so we now have an integrated knowledge_encoder that works with the MultipleChoiceMemory network by passing a flag indicating whether the BiDirectionalGRU should be wrapped in TimeDistributed, so that the BiGRU part runs over the background knowledge for each answer option. I think this is correct even though we don't TimeDistribute the function for the MCMN: I believe that's because we have already TimeDistributed the sentence encoder by overriding it, but perhaps I have missed something here.
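For concreteness, here's a minimal sketch (not the actual code in this PR) of how such a flag might control the wrapping; the function name, argument name, and shapes are made up for illustration:

```python
from keras.layers import GRU, Bidirectional, TimeDistributed


def build_knowledge_encoder(encoding_dim, wrap_for_answer_options=False):
    """Sketch: a BiGRU that encodes a bag of background sentences.

    Plain case: input is (batch, num_background, sentence_dim).
    Multiple-choice case: there is one bag of background per answer option,
    so the input gains an extra axis, (batch, num_options, num_background,
    sentence_dim), and the BiGRU is TimeDistributed over the options axis.
    """
    bigru = Bidirectional(GRU(encoding_dim, return_sequences=True))
    if wrap_for_answer_options:
        # Apply the same BiGRU independently to each answer option's background.
        return TimeDistributed(bigru)
    return bigru
```

If the sentence encoder upstream has already been TimeDistributed, the tensors reaching the knowledge encoder may already be in the plain 3D shape, which is the intuition behind not wrapping it again for the MCMN; that's the part worth double-checking.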
Fixed all the formatting and changed the KnowledgeEncoder to use a class variable. Also, I've left the BiGRU using 'sum' to combine the forward and backward outputs, as that's what the DMN+ paper does. I can make this configurable if you want, as I generally agree that concat is better than sum.
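For reference, Keras's Bidirectional wrapper already exposes this choice via its merge_mode argument (default 'concat'), so making it configurable is just a matter of plumbing a parameter through; a rough sketch:

```python
from keras.layers import GRU, Bidirectional


def build_bigru(encoding_dim, merge_mode='sum'):
    # 'sum' matches the DMN+ setup; pass 'concat' to concatenate the two
    # directions instead (note the output dimension doubles in that case).
    return Bidirectional(GRU(encoding_dim, return_sequences=True),
                         merge_mode=merge_mode)
```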
Why don't you just add a TODO to make the BiGRU configurable? No need to worry about it now, if that's what the DMN+ paper does.
This is ready for review; the main thing to check is whether my reasoning for why we can't TimeDistribute the KnowledgeEncoder in MCMemoryNetwork.py makes sense.