UX-Decoder / LLaVA-Grounding

Apache License 2.0
Evaluation on refcoco datasets. #18

Open: juxingyiwan opened this issue 5 months ago

juxingyiwan commented 5 months ago

Dear authors, I'm really interested in your work. We are reproducing strong visual grounding methods on the RefCOCO family of datasets. Could you please release the evaluation scripts for these datasets?

liujunzhuo commented 4 months ago

Hi, I have a similar question about visual grounding evaluation. I've been using prompts from the paper, such as "Please segment the man. (with grounding)", but the output sometimes contains multiple targets, and sometimes there's no \. Neither case is ideal for evaluation. Have you come across any tips or solutions?

I'm also having issues with batch inference: the results differ when the batch size is not 1. Do you have any insight into this? I appreciate your help!
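Until an official script is available, here is a minimal sketch of the box-level accuracy commonly reported on RefCOCO-style benchmarks (a prediction counts as correct when its IoU with the ground-truth box is at least 0.5). It is not the authors' evaluation code. The names `box_iou` and `accuracy_at_iou` are hypothetical, and the handling of the two cases mentioned above is an assumption: when several boxes come back only the first is scored, and an empty output counts as a miss. Mask-based metrics are not covered.

```python
from typing import Sequence

Box = Sequence[float]  # (x1, y1, x2, y2) in pixels, with x2 >= x1 and y2 >= y1


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def accuracy_at_iou(
    preds: Sequence[Sequence[Box]],
    gts: Sequence[Box],
    thresh: float = 0.5,
) -> float:
    """Fraction of expressions whose scored box reaches the IoU threshold.

    preds[i] holds every box the model returned for expression i.
    Assumed policy (not from the paper): score only the first box,
    and treat an empty list as a miss.
    """
    if len(preds) != len(gts):
        raise ValueError("preds and gts must have the same length")
    hits = 0
    for boxes, gt in zip(preds, gts):
        if not boxes:
            continue  # no grounded output: counted as a miss
        if box_iou(boxes[0], gt) >= thresh:
            hits += 1
    return hits / len(gts) if gts else 0.0


if __name__ == "__main__":
    preds = [
        [(0, 0, 10, 10)],                    # exact match
        [],                                  # no box returned
        [(0, 0, 10, 10), (50, 50, 60, 60)],  # multiple targets, first is wrong
    ]
    gts = [(0, 0, 10, 10), (0, 0, 5, 5), (50, 50, 60, 60)]
    print(accuracy_at_iou(preds, gts))
```

Scoring only the first box is strict: in the third example the correct box is present but ignored. Other policies, such as picking the largest box, would change the reported number, so whichever one is used should be stated alongside the results.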

ApolloRay commented 2 days ago

Same question.