Open joeyy5588 opened 3 years ago
For nocaps, the image features are generated by the model trained in the BUTD paper. The object detector trained on Open Images is described here: https://storage.googleapis.com/openimages/challenge_2019/challenge_results/objdet_5thplace.pdf
Hello, would you mind sharing more details about the tag selection strategy for the Open Images detector? VIVO only briefly mentions that the maximum number of tags is 30 (fine-tuning) and 15 (pre-training), and that they are composed of tags produced by the OI detector and ground-truth tags (?). Thanks a bunch!
Thanks for your amazing work!
I've checked the description in DOWNLOAD.md, but I can't find the feature files for novel object captioning (nocaps). Would it be possible for you to release the data or the detector pretrained on Open Images for nocaps?
I'd be grateful if you could let me know how to obtain the object tags and image features for nocaps, and I'd also appreciate any details on reproducing the nocaps results.
Thanks in advance!