Hi, I have a few questions:
For the Image<->Text retrieval results (Table 1), did you count only the exact ground-truth report as a match, or did you count every retrieved report that contains the same classes as correct, as is done in the other works in that table?
You say you trained the other models in that table. Why did you do that when, for example, GLoRIA's weights are available to download?
Are you planning to release your trained model weights?
Thank you :)