rowanz / r2c

Recognition to Cognition Networks (code for the model in "From Recognition to Cognition: Visual Commonsense Reasoning", CVPR 2019)
https://visualcommonsense.com
MIT License

Baseline for Q->AR #6

Closed viswanathgs closed 5 years ago

viswanathgs commented 5 years ago

First up, thanks for the great work and for releasing the code!

I'm trying to reproduce the baselines from the code, and it works like a charm for the Q->A and QA->R tasks, but I don't see any code for the Q->AR task. Could you please share some details on how this is computed?

Is the baseline validation accuracy of 43.1 mentioned in the paper for the Q->AR task obtained by first running the Q->A task and then running QA->R conditioned on those predicted answers? If so, I believe this would mean the bert_da embeddings for ctx_rationale<i> need to be recomputed based on (question + predicted answer), as opposed to what's been precomputed (question + ground-truth answer). To avoid having to pretrain the bert_da embeddings as described in https://github.com/rowanz/r2c/blob/master/data/get_bert_embeddings/README.md, would you be able to share the init_checkpoint file that I could use in extract_features.py?

Thank you!
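
For readers following along, the chained evaluation asked about above can be sketched as follows. This is only an illustration, not code from the r2c repo: the function names and the nested-list score layout are hypothetical, and it assumes a Q->AR prediction counts as correct only when both the chosen answer and the chosen rationale match the labels, with the rationale picked from scores conditioned on the predicted answer.

```python
from typing import Sequence


def argmax(scores: Sequence[float]) -> int:
    """Return the index of the highest score (first index wins on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def q2ar_accuracy(
    answer_scores: Sequence[Sequence[float]],
    rationale_scores: Sequence[Sequence[Sequence[float]]],
    answer_labels: Sequence[int],
    rationale_labels: Sequence[int],
) -> float:
    """Joint accuracy with rationales conditioned on the predicted answer.

    answer_scores[n][a]       : score of answer a for example n
    rationale_scores[n][a][r] : score of rationale r for example n when the
                                context is (question + answer a)
    Assumption: an example is a hit only if answer AND rationale are right.
    """
    if len(answer_labels) == 0:
        raise ValueError("need at least one example")
    hits = 0
    for a_scores, r_scores, a_gold, r_gold in zip(
        answer_scores, rationale_scores, answer_labels, rationale_labels
    ):
        a_pred = argmax(a_scores)
        # Use the rationale scores that were computed for the predicted answer,
        # not for the ground-truth answer.
        r_pred = argmax(r_scores[a_pred])
        hits += int(a_pred == a_gold and r_pred == r_gold)
    return hits / len(answer_labels)


# Toy data: two examples, four answers, four rationales per answer.
uniform = [0.25, 0.25, 0.25, 0.25]
toy_answer_scores = [[0.1, 0.7, 0.1, 0.1], [0.6, 0.2, 0.1, 0.1]]
toy_rationale_scores = [
    [uniform, [0.2, 0.2, 0.5, 0.1], uniform, uniform],
    [[0.9, 0.05, 0.03, 0.02], uniform, uniform, uniform],
]
toy_answer_labels = [1, 1]
toy_rationale_labels = [2, 0]
toy_accuracy = q2ar_accuracy(
    toy_answer_scores, toy_rationale_scores, toy_answer_labels, toy_rationale_labels
)
```

In the toy data the first example gets both the answer and the rationale right, while the second picks the wrong answer, so it counts as a miss no matter which rationale is chosen.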

rowanz commented 5 years ago

thanks 😄

You're exactly right, and yeah, there was a bug in the code for the QA->R test embeddings. (The bug is just for this repo as it was introduced due to me hiding the test labels - it doesn't affect any of my results that I reported in the paper.) It's fixed now, and so you should be able to download the BERT embeddings from the S3 website again without having to recompute everything.

this should fix things, but let me know if it doesn't!

viswanathgs commented 5 years ago

Hey @rowanz, thanks for the response and the latest fixes in https://github.com/rowanz/r2c/commit/f7b2374e9e1501f1d0a6cb8400a9702b692371c6. I did notice the bug in data_iter_test that resulted in contextual embeddings not being generated correctly and I'm glad it's fixed now.

What I meant, though, was that to obtain Q->AR results on the val set conditioned on the Q->A output, we also need to rely on the conditioned_answer_choice mechanism you recently introduced. So the val split also needs to generate all 16 rationale embeddings (using data_iter_test instead of data_iter), right?

But it's not a problem for me anymore, as I've been able to run the sequence of steps (create_pretraining_data.py, pretrain_on_vcr.py and extract_features.py) to generate such embeddings for the val split, so all good now.
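
To make the "16 embeddings" count above concrete: with four answer choices and four rationale choices, conditioning on every possible answer yields 4 x 4 = 16 (context, rationale) inputs per question. The sketch below only enumerates those pairs. It is not the repo's data_iter_test; the function name and the dictionary layout are hypothetical, and only the key name conditioned_answer_choice is borrowed from the thread.

```python
from typing import Dict, List


def enumerate_rationale_inputs(
    question: List[str],
    answers: List[List[str]],
    rationales: List[List[str]],
) -> List[Dict]:
    """List every (question + answer, rationale) pair that needs an embedding.

    Hypothetical layout: one dict per pair, ordered by answer choice first
    and rationale choice second.
    """
    inputs = []
    for a_idx, answer in enumerate(answers):
        # The context is the question followed by the answer being conditioned on.
        context = question + answer
        for r_idx, rationale in enumerate(rationales):
            inputs.append(
                {
                    "conditioned_answer_choice": a_idx,
                    "rationale_index": r_idx,
                    "context": context,
                    "rationale": rationale,
                }
            )
    return inputs


# Toy tokens; real VCR items are tokenized sentences with object tags.
toy_question = ["why", "is", "person1", "smiling", "?"]
toy_answers = [["answer", str(i)] for i in range(4)]
toy_rationales = [["rationale", str(i)] for i in range(4)]
toy_inputs = enumerate_rationale_inputs(toy_question, toy_answers, toy_rationales)
```

At evaluation time, only the four entries whose conditioned_answer_choice matches the Q->A prediction would be scored, but all 16 have to be embedded up front because the prediction is not known in advance.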