XueFuzhao closed this issue 7 months ago
Hi, we are not planning to release this prompt, as we are still iterating on it for our internal projects.
In general, the prompt should include guidelines on what the caption must cover, such as object attributes, spatial relationships, actions, and any visible text (OCR) if possible.
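For anyone who wants a starting point, here is a minimal sketch of how guidelines like these could be assembled into a captioning prompt. It is not the unreleased prompt: the function name `build_caption_prompt`, the opening sentence, and the wording of each guideline are assumptions made up for illustration, and only the four guideline categories come from the reply above.

```python
# Hypothetical sketch, NOT the authors' unreleased prompt.
# Only the four category names come from the maintainer's reply;
# all wording below is an illustrative assumption.
GUIDELINES = {
    "Attributes": "Describe visible attributes of the main objects and people, "
                  "such as color, size, and clothing.",
    "Spatial relationships": "State where objects are relative to each other "
                             "and to the scene.",
    "Actions": "Describe what happens across the frames, in temporal order.",
    "OCR": "If any on-screen text is legible, transcribe it; otherwise skip this.",
}


def build_caption_prompt(num_frames: int, guidelines: dict = GUIDELINES) -> str:
    """Return a text prompt asking for one detailed caption of a video."""
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    lines = [
        f"You are given {num_frames} frames sampled in order from one video.",
        "Write a single detailed caption for the video, following these guidelines:",
    ]
    # Number each guideline so the model can address them one by one.
    for index, (name, text) in enumerate(guidelines.items(), start=1):
        lines.append(f"{index}. {name}: {text}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_caption_prompt(num_frames=8))
```

The resulting string would be sent together with the sampled frames to whichever vision-language model you use; the guideline text can then be iterated on independently of the code.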
Thanks for the great work! May I know the template used to generate captions with GPT-4V?
In the figure of the paper, we can see "Imagining yourself as a customer service agent overseeing an uploaded video. The video comprises a sequence of frames..."
But it is not complete. Could you please provide the complete version?