Closed seanzhuh closed 2 years ago
Hi @sean-zhuh,
The overlapped images for RefCOCO (unc) and RefCOCO+ (unc+) are the same, so detr-r50-unc is the pretrained model for both datasets. In addition, we merge the overlapped images from the RefCOCOg g-split (gref) and umd-split (gref_umd) when pretraining, so detr-r50-gref is the pretrained model for both of those splits too.
When reproducing the results in this repository, we found that using a larger backbone makes little difference. That's why we only release the ResNet-50 pretrained parameters. The repository is still under development, and I'll update you if support for anything else is added.
Glad to hear! Thanks for clarifying!
@djiajunustc is there a checkpoint for Flickr30K dataset?
The checkpoint for Flickr30K is the same as that for ReferItGame, since neither of these two datasets has images that overlap with MSCOCO.
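To summarize the dataset-to-checkpoint mapping described in this thread, here is a minimal sketch in Python. The split names (unc, unc+, gref, gref_umd) and the two checkpoint filenames come from the discussion. The `referit` and `flickr` keys, the `checkpoint_for` helper, and the placeholder filename for the shared ReferItGame/Flickr30K checkpoint are assumptions for illustration only, since that filename is not stated here.

```python
# Sketch of the dataset -> pretrained DETR checkpoint mapping from this thread.
# The dictionary and helper are illustrative; they are not part of the repository.

# Placeholder: the actual filename of the ReferItGame/Flickr30K checkpoint
# is not given in this thread.
REFERIT_FLICKR_CHECKPOINT = "<referit-flickr-checkpoint>.pth"

PRETRAINED_CHECKPOINTS = {
    "unc": "detr-r50-unc.pth",        # RefCOCO
    "unc+": "detr-r50-unc.pth",       # RefCOCO+ (same overlapped images as RefCOCO)
    "gref": "detr-r50-gref.pth",      # RefCOCOg g-split
    "gref_umd": "detr-r50-gref.pth",  # RefCOCOg umd-split (merged with g-split)
    "referit": REFERIT_FLICKR_CHECKPOINT,  # assumed key for ReferItGame
    "flickr": REFERIT_FLICKR_CHECKPOINT,   # assumed key for Flickr30K
}


def checkpoint_for(dataset: str) -> str:
    """Return the pretrained checkpoint filename for a dataset split."""
    try:
        return PRETRAINED_CHECKPOINTS[dataset]
    except KeyError:
        known = ", ".join(sorted(PRETRAINED_CHECKPOINTS))
        raise ValueError(f"Unknown dataset {dataset!r}; expected one of: {known}")


if __name__ == "__main__":
    for name in PRETRAINED_CHECKPOINTS:
        print(f"{name:10s} -> {checkpoint_for(name)}")
```

Only ResNet-50 checkpoints appear in the table, which matches the statement above that no larger-backbone parameters are released.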
Thanks~
Hi again, I've noticed that the performance in this repository contradicts the numbers in your paper: on the RefCOCO testB set, the paper claims that TransVG reaches ~78, but in this repo it only achieves ~75. Overall, thanks for your great work. I do hope this result can be updated in your paper (arXiv). The mismatched results between the repository and the paper may mislead readers and followers in the community.
Hello, in the Pretrained Checkpoints Gdrive there are only 3 checkpoints, where detr-r50-gref.pth means resnet50-refcocog-google and detr-r50-unc.pth means resnet50-refcoco-unc; correct me if I'm wrong. Could you please release the pretrained resnet50 checkpoints for refcoco+/refcocog-umd, and also the resnet101 ones?