@JoaoF2025 I saw that you committed that code with "image flip". Did you check anything about object identification?
In the future we could implement something for VQA, like "what's the colour of the T-shirt I'm wearing" or "how many fingers do I have on my hand".