[New Model] Better QA - Neural QA

koustuvsinha commented 7 years ago

We have a follow-up questions model, but it only generates very basic one-liner questions ("what", "why", "huh"). Since we have documents and articles, I propose we use something like this: Neural QA (accompanying paper). The original code, again, is in Torch7. Would it be useful to use it as is, or port it quickly to PyTorch?

Breakend commented 7 years ago

Quick hack: rather than porting it, can we run it from the command line with a Python wrapper?
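A minimal sketch of what such a wrapper could look like, assuming the Torch7 code exposes a command-line script that reads sentences from a file and writes one question per line. The script name `translate.lua` and the `-model`, `-src` and `-output` flags are assumptions for illustration, not confirmed from the repo; only `th` (the Torch7 launcher) is standard.

```python
import subprocess
from typing import List


def build_command(model_path: str, src_path: str, out_path: str) -> List[str]:
    """Assemble the argv list for the (assumed) Torch7 generation script."""
    # Hypothetical script name and flags; replace with whatever the repo documents.
    return [
        "th", "translate.lua",
        "-model", model_path,
        "-src", src_path,
        "-output", out_path,
    ]


def generate_questions(model_path: str, src_path: str, out_path: str,
                       timeout_s: float = 300.0) -> List[str]:
    """Run the external script and return the non-empty lines it wrote."""
    cmd = build_command(model_path, src_path, out_path)
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run(cmd, check=True, timeout=timeout_s)
    with open(out_path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]
```

Passing the command as a list (rather than a shell string) avoids quoting problems with file paths.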

koustuvsinha commented 7 years ago

Yeah, provided the dependencies can be installed as is.

koustuvsinha commented 7 years ago

Trained models are not provided though 😞

Breakend commented 7 years ago

darn. that sucks. it's a nice model though.

koustuvsinha commented 7 years ago

On reading the paper, it seems to me it's nothing fancy, just a simple attention-based encoder-decoder where the input is the source sentence(s) the question is about and the output is the actual question from SQuAD. Some technical details:

Bidirectional LSTM attention encoder (hidden size: 600, layers: 2)
LSTM decoder (hidden size: 600, layers: 2)
Pretrained word embeddings: GloVe (300-dim)
SGD optimization with initial LR 1.0 (halve LR at epoch 8)
Dropout: 0.3
Beam search: beam size 3

Total training takes 2 hours. I guess replicating this setup wouldn't be much of an issue.
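The settings listed above could be captured in a small config object, sketched below in plain Python so it does not depend on any particular framework. The field names are made up for illustration. The schedule assumes the LR is halved at every epoch from epoch 8 onward; the note above only says "halve LR at epoch 8", so a single halving is an equally plausible reading.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Seq2SeqConfig:
    # Values taken from the list above; field names are illustrative.
    hidden_size: int = 600
    num_layers: int = 2
    embedding_dim: int = 300
    dropout: float = 0.3
    beam_size: int = 3
    initial_lr: float = 1.0
    lr_decay_start_epoch: int = 8
    lr_decay_factor: float = 0.5


def learning_rate(cfg: Seq2SeqConfig, epoch: int) -> float:
    """LR for a 1-indexed epoch.

    Assumption: the LR is multiplied by the decay factor once per epoch,
    starting at lr_decay_start_epoch.
    """
    if epoch < cfg.lr_decay_start_epoch:
        return cfg.initial_lr
    halvings = epoch - cfg.lr_decay_start_epoch + 1
    return cfg.initial_lr * cfg.lr_decay_factor ** halvings
```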

Breakend commented 7 years ago

true, we actually probably have enough PyTorch code to just replicate it without too much hassle (hopefully)

koustuvsinha commented 7 years ago

yes! just starting by using the basic Seq2Seq with attention would be enough in this case

koustuvsinha commented 7 years ago

Update: trained the model with my own PyTorch implementation (will update the code tomorrow). Loss converged to 5.xx. With beam search, the outputs weren't that good.
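To compare a training loss with a perplexity figure, the two can be converted into each other, assuming the loss is the mean per-token cross-entropy in nats (this depends on how the loss is averaged in the implementation, which is not stated here). Under that assumption a loss of 5.0 corresponds to a perplexity of roughly 148, and a perplexity of 20 to a loss of roughly 3.0.

```python
import math


def perplexity_from_loss(mean_token_nll: float) -> float:
    """Perplexity for a mean per-token negative log-likelihood in nats."""
    return math.exp(mean_token_nll)


def loss_from_perplexity(perplexity: float) -> float:
    """Inverse conversion: mean per-token loss (nats) for a given perplexity."""
    return math.log(perplexity)
```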

So I installed Torch7 and trained the model with their preset configuration. Training reached a perplexity of 20, as mentioned. A few samples from the evaluation results are given below:

SENT 10504: through the work of leading theoretical physicists , a new theory of electromagnetism was developed using quantum mechanics .
PRED 10504: what theory was developed using quantum mechanics ?
PRED SCORE: -5.3748

SENT 10505: this final modification to electromagnetic theory ultimately led to quantum electrodynamics -lrb- or qed -rrb- , which fully describes all electromagnetic phenomena as being mediated by wave -- particles known as photons .
PRED 10505: what is another term for the final amendment to electromagnetic theory ?
PRED SCORE: -6.7339

SENT 10506: this final modification to electromagnetic theory ultimately led to quantum electrodynamics -lrb- or qed -rrb- , which fully describes all electromagnetic phenomena as being mediated by wave -- particles known as photons .
PRED 10506: what is another term for the final amendment to electromagnetic theory ?
PRED SCORE: -6.7339

SENT 10507: this final modification to electromagnetic theory ultimately led to quantum electrodynamics -lrb- or qed -rrb- , which fully describes all electromagnetic phenomena as being mediated by wave -- particles known as photons .
PRED 10507: what is another term for the final amendment to electromagnetic theory ?
PRED SCORE: -6.7339

SENT 10508: it is a common misconception to ascribe the stiffness and rigidity of solid matter to the repulsion of like charges under the influence of the electromagnetic force .
PRED 10508: what is a common reason to ascribe the stiffness of the stiffness and rigidity of solid matter to the repulsion of like ?
PRED SCORE: -15.2773

SENT 10509: however , these characteristics actually result from the pauli exclusion principle .
PRED 10509: what principle did the characteristics of the characteristics of the characteristics of the characteristics of a species have ?
PRED SCORE: -21.5661

SENT 10510: this means that it takes energy to pack them together .
PRED 10510: what does it take to do with the energy that takes energy to get together together ?
PRED SCORE: -15.8889

So, as you can see, the model results are pretty good! It also provides a nice PRED SCORE, which can be used to prune the questions. Will write a wrapper tomorrow and include it in our Docker setup.

P.S. I would still like to perfect my PyTorch code, probably after the competition!
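A rough sketch of the pruning idea, assuming the evaluation output keeps the SENT / PRED / PRED SCORE layout shown above and that a higher score (closer to zero) means a better question, which is what the samples suggest. The threshold of -10.0 is a made-up starting point, not a tuned value.

```python
import re
from dataclasses import dataclass
from typing import List

SENT_RE = re.compile(r"^SENT (\d+): (.*)$")
PRED_RE = re.compile(r"^PRED (\d+): (.*)$")
SCORE_RE = re.compile(r"^PRED SCORE: (-?\d+(?:\.\d+)?)$")


@dataclass
class Prediction:
    sent_id: int
    sentence: str
    question: str
    score: float


def parse_predictions(log_text: str) -> List[Prediction]:
    """Parse SENT / PRED / PRED SCORE triples; other lines are skipped."""
    results: List[Prediction] = []
    sent_id, sentence, question = None, "", ""
    for raw in log_text.splitlines():
        line = raw.strip()
        m = SENT_RE.match(line)
        if m:
            sent_id, sentence = int(m.group(1)), m.group(2)
            continue
        m = PRED_RE.match(line)
        if m:
            question = m.group(2)
            continue
        m = SCORE_RE.match(line)
        if m and sent_id is not None:
            # A score line closes the current triple.
            results.append(Prediction(sent_id, sentence, question, float(m.group(1))))
            sent_id = None
    return results


def prune(predictions: List[Prediction], min_score: float = -10.0) -> List[Prediction]:
    """Keep only predictions whose score is at least min_score (hypothetical cutoff)."""
    return [p for p in predictions if p.score >= min_score]
```

On the samples above, a cutoff of -10.0 would keep 10504 through 10507 and drop the repetitive outputs for 10508 through 10510.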

koustuvsinha commented 7 years ago

[x] Figure out how to run on CPU
[x] Create wrapper

koustuvsinha commented 7 years ago

Idea 💡 - how can we handle user responses: correct / wrong answers?

koustuvsinha commented 7 years ago

The code can be run on CPU, although this involves running a server in Lua. Uggh. ~Trying to use Luvit~ Used another simple REST interface instead.
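On the Python side, talking to such a server could look roughly like the sketch below. The URL, port, route and JSON field name are all placeholders; the actual interface of the Lua server is not described in this thread.

```python
import json
import urllib.request


def build_request(sentence: str,
                  url: str = "http://localhost:8080/generate") -> urllib.request.Request:
    """Build a JSON POST request for the (hypothetical) question-generation endpoint."""
    # The payload shape {"sentence": ...} is an assumption for illustration.
    payload = json.dumps({"sentence": sentence}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_server(sentence: str, timeout_s: float = 10.0) -> dict:
    """Send one sentence to the server and decode its JSON reply."""
    with urllib.request.urlopen(build_request(sentence), timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))
```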

NicolasAG commented 6 years ago

solved with this commit

noseworm / convai

[New Model] Better QA - Neural QA #10