allenai / deep_qa

A deep NLP library, based on Keras / tf, focused on question answering (but useful for other NLP too)
Apache License 2.0

Babi Dataset Reader #89

Closed. DeNeutoy closed this 8 years ago

DeNeutoy commented 8 years ago

A reader that reads any of the bAbI datasets into a Dataset of BackgroundInstances.
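For readers unfamiliar with the bAbI file layout, here is a minimal Python sketch of what such a reader has to do. It is not the code in this PR: `BabiExample` and `read_babi` are hypothetical names, and the sketch assumes the standard bAbI text format, where every line starts with an ID that restarts at 1 for each new story and question lines carry tab-separated question, answer, and supporting-fact IDs. Tasks with several answers list them comma-separated, so the answer is kept as a list.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class BabiExample:
    """Hypothetical container; not the BackgroundInstance class from the PR."""
    background: List[str]      # story sentences seen before the question
    question: str
    answers: List[str]         # more than one entry for multi-answer tasks
    supporting_ids: List[int]  # line IDs of the supporting facts


def read_babi(lines: Iterable[str]) -> List[BabiExample]:
    """Turn raw bAbI lines into one example per question."""
    examples: List[BabiExample] = []
    story: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        line_id, _, text = line.partition(" ")
        if int(line_id) == 1:
            story = []  # IDs restart at 1 when a new story begins
        if "\t" in text:
            # Question line: question <TAB> answer(s) <TAB> supporting IDs
            question, answer, support = text.split("\t")
            examples.append(BabiExample(
                background=list(story),
                question=question.strip(),
                answers=answer.split(","),
                supporting_ids=[int(i) for i in support.split()],
            ))
        else:
            story.append(text)
    return examples


sample = [
    "1 Mary moved to the bathroom.",
    "2 John went to the hallway.",
    "3 Where is Mary? \tbathroom\t1",
    "1 Daniel picked up the apple.",
    "2 Daniel picked up the milk.",
    "3 What is Daniel carrying? \tapple,milk\t1 2",
]
examples = read_babi(sample)
```

Copying the story into each example keeps the background independent of later sentences in the same story, at the cost of some duplication.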

matt-gardner commented 8 years ago

I said the label should be Option[List[Int]], but I really meant Option[Seq[Int]], or maybe even Option[Set[Int]]. I'd probably just go with Seq, though.

DeNeutoy commented 8 years ago

@matt-gardner - I've cleaned everything up and done the optimisations you suggested (thanks for those, by the way - Scala is getting nicer to write by the day, for me). I've pushed the instance as a new class but, as you suggested, we could replace them. If you want to do this, I'll change the Python TextDataset reader methods too.

In the paper introducing bAbI, they use an RNN to generate multiple outputs for the tasks which require them. They also specifically state that the original MN paper does so badly on the tasks requiring multiple outputs because it only outputs a single word.

matt-gardner commented 8 years ago

Convert bAbI dataset into a format that's usable with the QuestionAnswerMemoryNetworkSolver