viswanathgs closed this issue 5 years ago
thanks 😄
You're exactly right, and yeah, there was a bug in the code for the QA->R test embeddings. (The bug only affects this repo, since I introduced it while hiding the test labels; it doesn't affect any of the results I reported in the paper.) It's fixed now, so you should be able to download the BERT embeddings from the S3 website again without having to recompute everything.
This should fix things, but let me know if it doesn't!
Hey @rowanz, thanks for the response and the latest fixes in https://github.com/rowanz/r2c/commit/f7b2374e9e1501f1d0a6cb8400a9702b692371c6. I did notice the bug in data_iter_test that resulted in contextual embeddings not being generated correctly, and I'm glad it's fixed now.
What I mentioned, though, was that to obtain Q->AR results on the val set conditioned on the Q->A output, we need to rely on the conditioned_answer_choice mechanism you recently introduced as well. So the val split also needs to generate all 16 embeddings for rationales (data_iter_test instead of data_iter), right?
But it's not a problem for me anymore, as I've been able to run the sequence of steps (create_pretraining_data.py, pretrain_on_vcr.py and extract_features.py) to generate such embeddings for the val split, so all good now.
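For anyone following along, here is a minimal sketch of the conditioned selection discussed above: out of the 16 rationale contexts (4 answer choices x 4 rationales), keep the 4 that belong to one answer choice. The function name `select_rationale_contexts` and the answer-major ordering of the 16 entries are assumptions made for illustration; they are not taken from the r2c code.

```python
from typing import List, Sequence

NUM_ANSWERS = 4      # answer choices per question
NUM_RATIONALES = 4   # rationale choices per question


def select_rationale_contexts(all_contexts: Sequence, conditioned_answer_choice: int) -> List:
    """Return the 4 rationale contexts paired with one answer choice.

    Assumption (hypothetical layout): `all_contexts` holds 16 entries in
    answer-major order, i.e. entry a * 4 + r pairs answer a with rationale r.
    """
    if len(all_contexts) != NUM_ANSWERS * NUM_RATIONALES:
        raise ValueError("expected 16 rationale contexts")
    if not 0 <= conditioned_answer_choice < NUM_ANSWERS:
        raise ValueError("conditioned_answer_choice must be in 0..3")
    start = conditioned_answer_choice * NUM_RATIONALES
    return list(all_contexts[start:start + NUM_RATIONALES])


# Toy stand-ins for the 16 embeddings, labelled by (answer, rationale) index.
contexts = [f"a{a}_r{r}" for a in range(NUM_ANSWERS) for r in range(NUM_RATIONALES)]
picked = select_rationale_contexts(contexts, conditioned_answer_choice=2)
```

With real embeddings, the strings would be replaced by per-rationale arrays; only the indexing is the point here.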
First up, thanks for the great work and releasing the code!
I'm trying to repro the baselines from the code and it works like a charm for the Q->A and QA->R tasks, but I don't see any code for the Q->AR task. Could you please share some details on how this is computed?
Is the baseline validation accuracy of 43.1 mentioned in the paper for the Q->AR task obtained by first running the Q->A task and then, conditioned on those predicted answers, running QA->R? If so, I believe this would mean the bert_da embeddings for ctx_rationale<i> need to be recomputed based on the (question + predicted answer), as opposed to what's been precomputed (question + ground-truth answer). To avoid having to pretrain the bert_da embeddings as mentioned in https://github.com/rowanz/r2c/blob/master/data/get_bert_embeddings/README.md, would you be able to share the init_checkpoint file that I could use in extract_features.py?
Thank you!
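To make the question concrete, here is a rough sketch of how a joint Q->AR score could be computed from separate Q->A and QA->R predictions. It assumes an example counts as correct only when both the answer and the rationale are right; the function name `joint_accuracy` and the toy predictions are hypothetical and not taken from the repo.

```python
from typing import Sequence


def joint_accuracy(
    answer_preds: Sequence[int],
    answer_labels: Sequence[int],
    rationale_preds: Sequence[int],
    rationale_labels: Sequence[int],
) -> float:
    """Fraction of examples where both the answer and the rationale are correct.

    Assumption: this is the Q->AR scoring rule; `rationale_preds` are the
    rationale choices made after conditioning on an answer.
    """
    n = len(answer_labels)
    if n == 0:
        raise ValueError("no examples")
    if not (len(answer_preds) == len(rationale_preds) == len(rationale_labels) == n):
        raise ValueError("all inputs must have the same length")
    hits = sum(
        1
        for ap, al, rp, rl in zip(answer_preds, answer_labels, rationale_preds, rationale_labels)
        if ap == al and rp == rl
    )
    return hits / n


# Toy data: examples 0 and 3 get both parts right, 1 misses the rationale,
# and 2 misses the answer.
answer_preds = [0, 2, 1, 3]
answer_labels = [0, 2, 3, 3]
rationale_preds = [1, 0, 2, 2]
rationale_labels = [1, 3, 2, 2]
q_ar = joint_accuracy(answer_preds, answer_labels, rationale_preds, rationale_labels)
```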