Closed · pritamqu closed this issue 1 year ago
Hi! Sorry for the late reply, and thanks for your interest in my work 🙂 The beauty of word2vec is that it is straightforward: one word, one encoding, no long sentences. Maybe BERT is too complex for this model. I personally used word2vec only to stay compatible with previous work, and because improving the language model was not in the scope of my work; the goal was to improve the computer vision model while keeping everything else as in the previous protocol. However, I have seen follow-up work that uses more complex language models; unfortunately, I cannot find the paper right now.
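To illustrate the "one word, one encoding" point, here is a minimal sketch of how class-label embeddings are commonly looked up with a pretrained word2vec model via gensim. The checkpoint name, the example labels, and the choice to average word vectors for multi-word labels are illustrative assumptions on my part, not details taken from this repository.

```python
import numpy as np
import gensim.downloader as api

# Load pretrained word2vec vectors (large download on first use).
w2v = api.load("word2vec-google-news-300")

def class_embedding(label: str) -> np.ndarray:
    """One fixed 300-d vector per class: average the word vectors of the
    label's words (a common convention for multi-word class names)."""
    words = [w for w in label.lower().split() if w in w2v]
    if not words:
        raise KeyError(f"No word2vec entry for any word in {label!r}")
    vec = np.mean([w2v[w] for w in words], axis=0)
    return vec / np.linalg.norm(vec)  # unit-normalize for cosine similarity

print(class_embedding("basketball").shape)      # (300,)
print(class_embedding("playing guitar").shape)  # (300,)
```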
Thanks for your comment @bbrattoli. I tried both word2vec and stronger models like BERT, and word2vec seems to work better.
Hi, I follow your work and it is great: very simple and effective :) I am wondering whether you tried, or know of, similar training with BERT or a similar transformer model. I am trying something like that, but the loss remains fairly steady and the model does not learn anything; the same framework works fine with word2vec. Do you know why this might happen? Any intuitive thoughts? @bbrattoli
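For comparison, here is a minimal sketch of the kind of BERT-based class embedding the question describes, using Hugging Face `transformers`. The checkpoint, the masked mean pooling, and the normalization are my assumptions, not details from the thread; one common culprit when swapping embedding sources is a scale mismatch, which is why this sketch unit-normalizes the output just like the word2vec vectors above.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased").eval()

@torch.no_grad()
def bert_class_embedding(label: str) -> torch.Tensor:
    """One 768-d vector per class: mean-pool the last hidden states over
    the label's tokens (padding excluded via the attention mask)."""
    enc = tok(label, return_tensors="pt")
    hidden = bert(**enc).last_hidden_state        # (1, seq_len, 768)
    mask = enc["attention_mask"].unsqueeze(-1)    # (1, seq_len, 1)
    vec = (hidden * mask).sum(1) / mask.sum(1)    # masked mean pooling
    return torch.nn.functional.normalize(vec, dim=-1).squeeze(0)

print(bert_class_embedding("playing guitar").shape)  # torch.Size([768])
```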