Open JoanFM opened 4 years ago
Hi, I'm wondering whether you tried it and have any insights to share? I'm interested in the same thing.
Hey @rom1504, I did not get any feedback, so I did not proceed with this paper. I found another paper for Caption-Based Image Retrieval, https://github.com/fartashf/vsepp, with a really nice and easy implementation. (The results claimed in the paper are not so nice, but it is good for a first implementation of such a system...)
@JoanFM @rom1504
Hi guys,
I wrote a step-by-step tutorial on how to set up the environment and how to train and test this model. I also added a section on extracting the visiolinguistic embeddings from the image-text data: https://naserian-elahe.medium.com/vilbert-a-model-for-learning-joint-representations-of-image-content-and-natural-language-47f56a313a79
I would very much appreciate any comments or suggestions.
@enaserianhanzaei I followed your tutorial for Image Retrieval, but I am getting very low final recall values. Do you have any idea what could have gone wrong here?
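For anyone sanity-checking low recall numbers, here is a minimal sketch of how recall@K is commonly computed for caption-based image retrieval. It is not taken from the ViLBERT code or from the tutorial: the function name `recall_at_k`, the `scores` matrix layout (one row per caption, one column per candidate image), and the `gt_index` array are all assumptions made for illustration.

```python
import numpy as np

def recall_at_k(scores, gt_index, ks=(1, 5, 10)):
    """Recall@K for caption-to-image retrieval (illustrative helper, not a ViLBERT API).

    scores:   (num_captions, num_images) array, higher means a better match.
    gt_index: (num_captions,) array with the column index of the correct image.
    """
    scores = np.asarray(scores, dtype=float)
    gt_index = np.asarray(gt_index)
    # Score assigned to the ground-truth image of each caption.
    gt_scores = scores[np.arange(scores.shape[0]), gt_index]
    # Rank = number of images scored strictly higher than the ground truth (0 is best).
    ranks = (scores > gt_scores[:, None]).sum(axis=1)
    # Fraction of captions whose correct image lands in the top k.
    return {k: float((ranks < k).mean()) for k in ks}

# Toy example: 3 captions, 3 candidate images.
toy_scores = [[0.9, 0.1, 0.0],
              [0.2, 0.3, 0.8],
              [0.5, 0.4, 0.1]]
toy_gt = [0, 1, 2]
print(recall_at_k(toy_scores, toy_gt, ks=(1, 2, 3)))
```

If the values come out close to chance level (roughly k divided by the number of candidate images), one thing worth checking is whether the caption order and the ground-truth image indices are still aligned after batching or shuffling.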
I would like to know if the pre-trained model given by this link (https://dl.fbaipublicfiles.com/vilbert-multi-task/pretrained_model.bin) can be used for Caption-Based Image Retrieval.
My first guess is that I can load the model using (not sure if the configuration file is the proper one):
Afterwards, digging in the code, I saw that running the inner BERT model should give the sequence outputs for text and for image.
I have several questions:
`image_loc` parameter?
I hope I made myself clear.
Thank you very much