avisingh599 / visual-qa

[Reimplementation Antol et al 2015] Keras-based LSTM/CNN models for Visual Question Answering
https://avisingh599.github.io/deeplearning/visual-qa/
MIT License

Issue with evaluation code #8

Closed avisingh599 closed 8 years ago

avisingh599 commented 8 years ago

There's an issue with the evaluation code.

My old definition of accuracy: If the Neural Net-generated answer matches at least three human answers, then the accuracy of that answer is 1, else 0.

Actual definition of accuracy in the VQA challenge: Let n be the number of human answers that exactly match the neural net answer. Then acc = min(n/3, 1). This gives a score of about 0.33 if exactly one human answer matches the neural net answer, about 0.67 if exactly two match, and 1 if three or more match.
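
To make the difference concrete, here is a minimal sketch of both definitions side by side. The function names `old_accuracy` and `vqa_accuracy` and the list-of-strings input are assumptions made for illustration, not the repo's actual evaluation code, and "match" is taken to mean exact string equality.

```python
def count_matches(predicted, human_answers):
    """Number of human answers that exactly equal the predicted answer."""
    return sum(1 for answer in human_answers if answer == predicted)


def old_accuracy(predicted, human_answers):
    """Old definition: 1 if at least three human answers match, else 0."""
    return 1.0 if count_matches(predicted, human_answers) >= 3 else 0.0


def vqa_accuracy(predicted, human_answers):
    """VQA challenge definition: acc = min(n/3, 1), n = number of matches."""
    n = count_matches(predicted, human_answers)
    return min(n / 3.0, 1.0)


# Hypothetical example: 2 of the 10 human answers agree with the prediction.
human = ["yes", "yes", "no", "no", "no", "no", "no", "no", "no", "no"]
old_score = old_accuracy("yes", human)   # no credit under the old rule
new_score = vqa_accuracy("yes", human)   # partial credit of 2/3
```

For any single answer the second function returns a score at least as high as the first, so the two only differ when there are exactly one or two matches.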

I will be fixing this and updating the results soon. The fix should raise the validation set performance numbers that I reported earlier, since answers with one or two matches now get partial credit instead of 0.