Closed hunterheiden closed 2 months ago
For sure, happy to provide these results and details of my setup. I ran this with torch 2.1.2 + CUDA 12.4 on an Azure VM (Standard NC96ads A100 v4: 96 vCPUs, 880 GiB memory, 4x A100 80GB).
Just as a short reference, these are the results for ACC@IoU=0.5:
Dataset | Split | liuhaotian/llava-v1.5-7b |
---|---|---|
RefCOCO | val | 56.2 |
RefCOCO | test | 58.1 |
RefCOCO | testA | 64.4 |
RefCOCO | testB | 47.5 |
RefCOCO+ | val | 50.0 |
RefCOCO+ | testA | 59.2 |
RefCOCO+ | testB | 39.0 |
RefCOCOg | val | 48.8 |
RefCOCOg | test | 48.4 |
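For readers unfamiliar with the metric, ACC@IoU=0.5 counts a prediction as correct when its intersection-over-union with the ground-truth box reaches 0.5. Below is a minimal sketch of that computation, not the code from this PR. The function names `iou` and `acc_at_iou`, the `(x1, y1, x2, y2)` box layout, and the use of `>=` at the threshold are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def acc_at_iou(preds, gts, threshold=0.5):
    """Percentage of predictions whose IoU with the ground truth meets the threshold."""
    if not preds:
        return 0.0
    # Assumption: a prediction exactly at the threshold counts as a hit.
    hits = sum(iou(p, g) >= threshold for p, g in zip(preds, gts))
    return 100.0 * hits / len(preds)
```

Both boxes must be in the same coordinate system (both in pixels or both normalized) for the IoU to be meaningful.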
Additionally, I'll attach the result summaries (as .txt files) for the different versions of COCO:
Let me know if you want me to directly screenshot / image the results, or if this is sufficient information! I have some more REC tasks I'd like to contribute, as well as some evaluations on screen-based benchmarks (ScreenSpot, RICO tasks, etc.).
I'm also running some of these on the v1.5-13b model to see the shift, so I'll circle back with those results when I have them.
Thanks for your PR! We are reviewing it and will discuss the changes to the main pipeline (the files inside the apis folder). Once the 13b results are ready, you can post them here as well, and we will try to finish this PR asap.
Sounds good. I should hopefully have benchmarked the 13B model in the next day or two.
Regarding the core file changes, if there's another way to achieve similar functionality that's more in line with how you all manage this package, I'm happy to change approaches. The main issue was that I needed the width and height of each image in order to normalize bounding boxes. I also wanted to avoid re-loading the datasets multiple times, especially for splits that aren't needed.
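To illustrate why the image width and height are needed, here is a minimal sketch of normalizing a pixel-space box to the [0, 1] range. It is not the PR's implementation; the helper name `normalize_bbox` and the `(x1, y1, x2, y2)` pixel layout are assumptions.

```python
def normalize_bbox(bbox_xyxy, width, height):
    """Scale a pixel-space (x1, y1, x2, y2) box to [0, 1] using the image size."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    x1, y1, x2, y2 = bbox_xyxy
    # x coordinates are divided by the width, y coordinates by the height.
    return [x1 / width, y1 / height, x2 / width, y2 / height]
```

For example, `normalize_bbox([64, 48, 320, 240], 640, 480)` scales the box by the 640x480 image size, so the same box can be compared against model outputs that are expressed as fractions of the image.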
@kcz358 @jzhang38 Please help check this, thanks!
This PR adds REC evaluations, assuming that bounding boxes will be output as raw square brackets and coordinates. It adds REC eval for all RefCOCO sets (RefCOCO, RefCOCO+, RefCOCOg). Specifically:
- `.yaml` files are added, both for the defaults and specifics for splits
- `util_rec.py` for each set (identical across RefCOCO sets)
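As an illustration of the "raw square brackets and coordinates" assumption, the sketch below pulls the first `[x1, y1, x2, y2]` group out of a model response. It is a hypothetical example rather than the contents of `util_rec.py`; the name `parse_bbox` and the exact regular expression are assumptions.

```python
import re

# One plain decimal number, e.g. "0.25", ".5", "12" or "-3.0".
_NUM = r"-?\d*\.?\d+"
# Four comma-separated numbers inside square brackets.
_BBOX_RE = re.compile(
    r"\[\s*(" + _NUM + r")\s*,\s*(" + _NUM + r")\s*,\s*("
    + _NUM + r")\s*,\s*(" + _NUM + r")\s*\]"
)


def parse_bbox(text):
    """Return the first bracketed 4-number box in `text` as floats, or None."""
    match = _BBOX_RE.search(text)
    if match is None:
        return None
    return [float(group) for group in match.groups()]
```

Returning `None` when no box is found lets the caller score the sample as a miss instead of failing on malformed output.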