Closed Faye3321 closed 4 years ago
Hi, thank you for the interest in our work!
Could you possibly share the args you ran the experiments on and the results you are getting? Thanks!
For training, I was using your suggested settings. For your provided trained model, the joint coherence on CUB I got is around 0.1 (vs. 0.263 in Table 4).
Thanks for bringing the issues to our attention. We've now tracked down the reason for the discrepancies between the released code and the reported results -- in the effort to clean up and publish our code, there were a couple of minor things we missed.
We have fixed the above in the most recent commit. Along with the code update, we have also uploaded new pretrained models for both MNIST-SVHN and CUB datasets that will reproduce results similar to what's reported in Table 2 and Table 4 of our paper -- see README for more details.
We do apologise for the confusion this inconsistency caused, and thank you again for bringing this forward.
To expand a bit on (2), for measuring cross/joint coherence on CUB, we use off-the-shelf ResNets for the cub-image data, but train FastText embeddings for cub-sentence data -- since its vocabulary is quite different from what is typically used to train FastText.
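As a rough illustration of the sentence-embedding stage this implies, one common approach is to mean-pool per-word FastText vectors into a single sentence vector. The pooling choice, the `sentence_embedding` helper, and the 300-dimensional size are assumptions for illustration, not necessarily what the repo's code does:

```python
import numpy as np

def sentence_embedding(sentence, word_vecs, dim=300):
    # word_vecs: dict mapping a token to its (hypothetical) trained
    # FastText vector; out-of-vocabulary tokens are simply skipped here.
    vecs = [word_vecs[w] for w in sentence.lower().split() if w in word_vecs]
    if not vecs:
        return np.zeros(dim)
    return np.mean(vecs, axis=0)
```

These pooled sentence vectors, alongside the ResNet image features, are what the CCA step below would consume.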
We then use these embeddings (ResNet, trained FastText) to compute CCA on the ground-truth image and sentence training data, and use the learnt embeddings to compute the correlation for generated samples from our model.
The learnt embeddings, however, can vary quite a bit due to the limited dataset size. The embeddings we used to report results in the paper were not saved with our models, so re-computing them as part of the analyses can result in different numeric values, including for the baseline. Note that the relative performance of our model against the baseline remains the same; it is just that the numbers can be different.
We have done a quick search for the FastText embeddings that produce the same results on the baseline as reported in the paper, and re-computed the CCA and cross/joint coherence scores for our model with them. To produce similar results to what's reported in our paper, download the zip file here and do the following:
1. Move `cub.all`, `cub.emb`, `cub.pc` to under `data/cub/oc:3_sl:32_s:300_w:3/`, and `emb_mean.pt`, `emb_proj.pt`, `images_mean.pt`, `im_proj.pt` to `path/to/trained/model/folder/`;
2. Set the `RESET` variable in `src/report/analyse_cub.py` (line 21) to `False`.

With these two fixes, the results from the code match those in the paper (with even improved cross-coherence scores on CUB).
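The file-placement part of the fix could be scripted along these lines (a sketch: `place_embedding_files` and the unzip location are hypothetical; the destination paths are the ones listed above):

```python
import shutil
from pathlib import Path

DATA_FILES = ["cub.all", "cub.emb", "cub.pc"]
MODEL_FILES = ["emb_mean.pt", "emb_proj.pt", "images_mean.pt", "im_proj.pt"]

def place_embedding_files(zip_dir, data_dir, model_dir):
    # Copy the unpacked embedding files into the locations the
    # analysis code expects; creates the target folders if needed.
    zip_dir, data_dir, model_dir = Path(zip_dir), Path(data_dir), Path(model_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    model_dir.mkdir(parents=True, exist_ok=True)
    for name in DATA_FILES:
        shutil.copy(zip_dir / name, data_dir / name)
    for name in MODEL_FILES:
        shutil.copy(zip_dir / name, model_dir / name)
```

For example: `place_embedding_files("unzipped", "data/cub/oc:3_sl:32_s:300_w:3", "path/to/trained/model/folder")`, substituting your own unzip and model directories.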
Thanks for your clarifications
Hi! Thanks for sharing this great project! I trained the model with your suggested settings and also evaluated your provided trained model, but either way I can't reproduce the results reported in Table 2 and Table 4, especially joint coherence. Can you give any hints? Thanks.