djiajunustc / TransVG

pretrained checkpoints #10

Closed seanzhuh closed 2 years ago

seanzhuh commented 2 years ago

Hello, in the Pretrained Checkpoints Gdrive there are only 3 checkpoints, where detr-r50-gref.pth means resnet50-refcocog-google and detr-r50-unc.pth means resnet50-refcoco-unc; correct me if I'm wrong. Could you please release the pretrained resnet50 checkpoints for refcoco+/refcocog-umd, and also the resnet101 ones?

djiajunustc commented 2 years ago

Hi @sean-zhuh,

The overlapped images for RefCOCO (unc) and RefCOCO+ (unc+) are the same. Thus, detr-r50-unc is the pretrained model for both datasets. In addition, we merge the overlapped images from the RefCOCOg g-split (gref) and umd-split (gref_umd) when performing pretraining, so detr-r50-gref is the pretrained model for both of those splits too.
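For reference, the mapping described above can be sketched as a small lookup table. This is only an illustration: `PRETRAINED_DETR` and `checkpoint_for` are hypothetical names, not part of the repository, and only the split names and the two checkpoint file names come from this thread.

```python
# Hypothetical lookup table (not part of the TransVG code base).
# Keys are the dataset split names used in this thread; values are the
# released checkpoint files mentioned in this thread.
PRETRAINED_DETR = {
    "unc": "detr-r50-unc.pth",        # RefCOCO
    "unc+": "detr-r50-unc.pth",       # RefCOCO+ (same overlapped images as RefCOCO)
    "gref": "detr-r50-gref.pth",      # RefCOCOg g-split
    "gref_umd": "detr-r50-gref.pth",  # RefCOCOg umd-split (merged with gref for pretraining)
}


def checkpoint_for(dataset: str) -> str:
    """Return the pretrained checkpoint file name for a dataset split."""
    try:
        return PRETRAINED_DETR[dataset]
    except KeyError:
        known = ", ".join(sorted(PRETRAINED_DETR))
        raise ValueError(f"unknown dataset {dataset!r}; expected one of: {known}") from None


if __name__ == "__main__":
    for name in ("unc", "unc+", "gref", "gref_umd"):
        print(name, "->", checkpoint_for(name))
```

The table is limited to the four splits whose checkpoints are named in this thread; other datasets would need their own entries.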

We found that using a larger backbone makes little difference when we tried to reproduce the results in this repository. That's the reason why we only release the resnet50 pretrained parameters. The repository is still under development, and I'll update you if support for anything else is added.

seanzhuh commented 2 years ago

Glad to hear! Thanks for clarifying!

jianghaojun commented 2 years ago

@djiajunustc is there a checkpoint for Flickr30K dataset?

djiajunustc commented 2 years ago

The checkpoint for Flickr30K is the same as that for ReferItGame, since these two datasets have no images overlapped with MSCOCO.

jianghaojun commented 2 years ago

Thanks~

seanzhuh commented 2 years ago

Hi again, I've noticed that the performance in this repository is inconsistent with that in your paper: on the RefCOCO testB set, you claim in the paper that TransVG reaches ~78, but in this repo it only achieves ~75. Overall, thanks for your great work. I do hope this result can be updated in your paper (arXiv). The mismatched results between the repository and your paper may mislead readers and followers in the community.