AndreiBarsan / visualqa

Visual Question Answering project for ETHZ's 2016/17 Deep Learning class.
Apache License 2.0

Image CNN #4

Open AndreiBarsan opened 7 years ago

AndreiBarsan commented 7 years ago

A simple baseline is described here: https://arxiv.org/pdf/1512.02167.pdf (Simple Baseline for Visual Question Answering)

We have almost all of this already, except for the image CNN. Images are currently mapped directly to pre-computed feature vectors, so it would be nice to add the CNN "arm" of the network in order to have a "proper" baseline.
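For reference, here is a rough sketch of how I understand the baseline's final stage: the image feature vector is concatenated with a bag-of-words encoding of the question, and a single linear layer with softmax scores the candidate answers. This is not code from this repo. The names `bag_of_words` and `predict_answer`, the toy vocabulary, the answer list, and the random weights are all made up for illustration, and the paper's learned word-embedding layer is folded into the one linear layer for brevity.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bag_of_words(question, vocab):
    """Count-based bag-of-words vector over a fixed vocabulary (hypothetical helper)."""
    vec = [0.0] * len(vocab)
    for token in question.lower().split():
        if token in vocab:
            vec[vocab[token]] += 1.0
    return vec


def predict_answer(image_features, question, vocab, weights, bias):
    """Concatenate image and question features, then apply linear layer + softmax."""
    x = list(image_features) + bag_of_words(question, vocab)
    logits = [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Toy setup: every value below is a placeholder, not real data.
rng = random.Random(0)
vocab = {"what": 0, "color": 1, "is": 2, "the": 3, "cat": 4}
answers = ["black", "white", "yes"]

# Stand-in for the pre-computed feature vector (or the future CNN output).
image_features = [rng.random() for _ in range(8)]

input_dim = len(image_features) + len(vocab)
weights = [[rng.uniform(-0.1, 0.1) for _ in range(input_dim)] for _ in answers]
bias = [0.0] * len(answers)

probs = predict_answer(image_features, "What color is the cat", vocab, weights, bias)
best_answer = answers[probs.index(max(probs))]
```

With this layout, adding the CNN "arm" would only change where `image_features` comes from; the concatenation and the classifier stay the same.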

AndreiBarsan commented 7 years ago

Okay, I'll try to get this going, starting with a basic VGGNet or something similar.

AndreiBarsan commented 7 years ago

I will skip this and jump straight into a fancier model, so that we can make sure we also include a technique that was not covered in the lecture.