Could you please consider making the detailed production process of the GVC-R dataset public?

Your work on building the GVC dataset is wonderful, but I can't tell from the paper how to prompt GPT-4 for this task. Could you share more details about the prompt?
The paper describes this part of the process as follows, but I can't work out how to reproduce it: "We task GPT-4 with matching noun phrases from the sentence to the GT instances. Once noun phrases are successfully grounded by GPT-4, we mark them with special start tokens, ⟨gs⟩ and ⟨ge⟩, followed by a token, ⟨seg⟩, which corresponds to the output feature used by the grounding model to segment the grounded region."
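
To make the question concrete, this is roughly how I imagine the marking step once GPT-4 has already returned its matches. It is only a sketch under my own assumptions: the function name `mark_grounded_phrases`, the `matches` mapping from noun phrase to GT instance index, and the ASCII token spellings `<g_s>`, `<g_e>`, `<seg>` are hypothetical and not taken from the paper or the repository. The GPT-4 prompt that produces `matches` is the part I am missing.

```python
# Assumed ASCII spellings of the paper's ⟨gs⟩, ⟨ge⟩ and ⟨seg⟩ tokens.
G_START, G_END, SEG = "<g_s>", "<g_e>", "<seg>"


def mark_grounded_phrases(sentence, matches):
    """Wrap each grounded noun phrase in the sentence with grounding tokens.

    `matches` is a hypothetical dict {noun phrase: GT instance index},
    standing in for whatever GPT-4 returns after the matching step.
    Returns the marked sentence and the instance index for each <seg>,
    in the order the <seg> tokens appear.
    """
    spans = []
    for phrase, instance_id in matches.items():
        start = sentence.find(phrase)  # first occurrence only
        if start == -1:
            continue  # phrase not found verbatim: leave it ungrounded
        spans.append((start, start + len(phrase), instance_id))
    spans.sort()

    pieces, seg_instances, cursor = [], [], 0
    for start, end, instance_id in spans:
        if start < cursor:
            continue  # skip phrases overlapping an already marked span
        pieces.append(sentence[cursor:start])
        pieces.append(f"{G_START} {sentence[start:end]} {G_END} {SEG}")
        seg_instances.append(instance_id)
        cursor = end
    pieces.append(sentence[cursor:])
    return "".join(pieces), seg_instances


marked, seg_instances = mark_grounded_phrases(
    "A dog chases a ball on the grass.",
    {"A dog": 0, "a ball": 2},
)
print(marked)
print(seg_instances)
```

If this matches the intended output format, then what I still need is the prompt (and any post-processing) that turns a caption plus the GT instance list into something like `matches`.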

I would be grateful if you could provide more details about that part! Thanks again!
