I have a few questions that came up while reimplementing LLaVA-Med:
In the "GPT-4 Assisted Data Generation -- Generate visual instruct tuning conversations using GPT-4" process, an image caption file named "llava_med_instruct_fig_captions.json" is used, but I could not find it in the corresponding files. Could you provide it, or describe how it is organized, so that we can prepare a caption file for our own data?
Also in the "GPT-4 Assisted Data Generation" process, what does "inline mentions (IM)" mean? Does it refer to the sentences in the related papers that mention the corresponding figure?