OK, so we now have an integrated knowledge_encoder that works with the MultipleChoiceMemory network by passing a flag indicating whether the BiDirectionalGRU should be wrapped in TimeDistributed, so that the BiGRU part runs over the background knowledge for each answer option. I think this is correct even though we don't TimeDistribute the function for the MCMN: I believe that's because we have already TimeDistributed the sentence encoder by overriding it, but perhaps I have missed something here.
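For concreteness, here's a minimal sketch (not the actual code in this PR) of how such a flag might control the wrapping; the function name, argument name, and shapes are made up for illustration:

```python
from keras.layers import GRU, Bidirectional, TimeDistributed


def build_knowledge_encoder(encoding_dim, wrap_for_answer_options=False):
    """Sketch: a BiGRU that encodes a bag of background sentences.

    Plain case: input is (batch, num_background, sentence_dim).
    Multiple-choice case: there is one bag of background per answer option,
    so the input gains an extra axis, (batch, num_options, num_background,
    sentence_dim), and the BiGRU is TimeDistributed over the options axis.
    """
    bigru = Bidirectional(GRU(encoding_dim, return_sequences=True))
    if wrap_for_answer_options:
        # Apply the same BiGRU independently to each answer option's background.
        return TimeDistributed(bigru)
    return bigru
```

If the sentence encoder upstream has already been TimeDistributed, the tensors reaching the knowledge encoder may already be in the plain 3D shape, which is the intuition behind not wrapping it again for the MCMN; that's the part worth double-checking.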
Fixed all the formatting and changed the KnowledgeEncoder to use a class variable. Also, I've left the BiGRU using 'sum' to combine the forward and backward outputs, as that's what the DMN+ paper does. I can make this configurable if you want, as I generally agree that concat is better than sum.
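For reference, Keras's Bidirectional wrapper already exposes this choice via its merge_mode argument (default 'concat'), so making it configurable is just a matter of plumbing a parameter through; a rough sketch:

```python
from keras.layers import GRU, Bidirectional


def build_bigru(encoding_dim, merge_mode='sum'):
    # 'sum' matches the DMN+ setup; pass 'concat' to concatenate the two
    # directions instead (note the output dimension doubles in that case).
    return Bidirectional(GRU(encoding_dim, return_sequences=True),
                         merge_mode=merge_mode)
```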
Why don't you just add a TODO to make the BiGRU configurable? No need to worry about it now, if that's what the DMN+ paper does.
This is ready for review; the main thing to check is whether my reasoning for why we can't TimeDistribute the KnowledgeEncoder in MCMemoryNetwork.py makes sense.