Karine-Huang / T2I-CompBench

[NeurIPS 2023] T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation
https://arxiv.org/pdf/2307.06350.pdf
MIT License

Question about the parameter "np_num" in BLIP_vqa.py #9

Closed · TianyunYoung closed this issue 10 months ago

TianyunYoung commented 10 months ago

Hi~ I am asking how "np_num" is set when calculating the VQA results in Table 2 and Table 4 for Color, Shape, Texture, and Complex. The number of noun phrases in a prompt is usually 2, but the default np_num is 8. Should I use a smaller np_num?

Karine-Huang commented 10 months ago

Hi!

For the Color, Shape, and Texture categories, the prompts in the first 80% usually contain 2 noun phrases, since they follow the fixed sentence template “a {adj} {noun} and a {adj} {noun}”. The prompts in the last 20% are constructed without a predefined sentence template and contain more noun phrases.

In the Complex category, the earlier part involves "two objects" and the later part involves "multiple objects", with potentially more noun phrases.

In this context, a larger np_num is appropriate for the calculation. We used np_num set to 8 for both Table 2 and Table 4.
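
In case it helps, here is a minimal sketch of how noun phrases could be extracted from a prompt and capped at np_num. It assumes spaCy noun-chunk extraction; `extract_noun_phrases` is a hypothetical helper for illustration, not the exact code in BLIP_vqa.py:

```python
# Hypothetical sketch: extract noun phrases from a prompt, capped at np_num.
# Assumes spaCy noun chunks; the actual BLIP_vqa.py logic may differ.
import spacy

nlp = spacy.load("en_core_web_sm")  # requires: python -m spacy download en_core_web_sm

def extract_noun_phrases(prompt: str, np_num: int = 8) -> list[str]:
    """Return up to np_num noun phrases found in the prompt."""
    doc = nlp(prompt)
    phrases = [chunk.text for chunk in doc.noun_chunks]
    return phrases[:np_num]

# Fixed-template prompts yield about 2 noun phrases, so np_num=8 simply
# leaves headroom for the free-form 20% and the multi-object Complex prompts.
print(extract_noun_phrases("a red apple and a green banana"))
# -> ['a red apple', 'a green banana']
```

Because np_num is only an upper bound on the number of noun phrases considered, setting it to 8 does not change the score for 2-phrase prompts.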

TianyunYoung commented 10 months ago

I get it! Thanks for your response.