Open mmderakhshani opened 1 week ago
Hi, you can find the JourneyDB annotation here and questions.json in the directory ./training. For laion12m, you can recaption it using off-the-shelf MLLMs such as the Qwen series or ShareGPT4V.
Perfect, thanks a lot for this. Could you please let me know what your prompt is for recaptioning?
Hi, maybe you can try "Describe this image and its style in a very detailed manner" or "Describe this image in as much detail as possible".
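For anyone wiring this up, here is a minimal sketch of a recaptioning loop built around the two prompts above. It is not the repository's actual pipeline. The names `recaption`, `save_annotations`, and `caption_fn` are hypothetical, and so is the output record layout (`image`, `prompt`, `caption`), since the real annotation schema is not shown in this thread. `caption_fn` stands in for whatever MLLM wrapper you use; it is assumed to take an image path and a prompt and return a string.

```python
import json
from pathlib import Path
from typing import Callable, Dict, Iterable, List

# The two prompts suggested in this thread.
PROMPTS = [
    "Describe this image and its style in a very detailed manner",
    "Describe this image in as much detail as possible",
]


def recaption(
    image_paths: Iterable[str],
    caption_fn: Callable[[str, str], str],
    prompt: str = PROMPTS[0],
) -> List[Dict[str, str]]:
    """Run caption_fn(image_path, prompt) on every image and collect records.

    caption_fn is a hypothetical wrapper around your MLLM of choice.
    Images for which it returns an empty caption are skipped.
    """
    records = []
    for path in image_paths:
        caption = caption_fn(str(path), prompt).strip()
        if not caption:
            continue  # skip images the model failed to describe
        # Hypothetical record layout; adapt it to the schema your training code expects.
        records.append({"image": str(path), "prompt": prompt, "caption": caption})
    return records


def save_annotations(records: List[Dict[str, str]], out_file: str) -> None:
    """Write the collected records to a JSON file."""
    Path(out_file).write_text(
        json.dumps(records, indent=2, ensure_ascii=False), encoding="utf-8"
    )
```

To use it, pass a function that calls your model, for example `recaption(paths, my_model_caption)`, then `save_annotations(records, "recaptions.json")`. Keeping the model call behind a plain callable means the same loop works whichever MLLM you pick.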
Perfect. Thanks for this. I will try and get back to you if you do not mind.
Feel free to ask :)
Hi there,
Thank you for sharing this excellent GitHub repository.
The Laion-aesthetic-12m and JourneyDB datasets have been recaptioned using the ShareGPT4V model in both the second and third stages of training.
We are working on reproducing your results and have successfully completed the first stage. To continue with the training, we would like to request the following three annotations:
and
/mnt/bn/vgfm2/test_dit/LlmDiffuser_phi1.5/LlmDiffuser/questions.json
Could you please share these items with us as they are blocking our reproduction of your GitHub repo?
If sharing these files is not possible, could you at least provide the code to regenerate them? This way, we can handle the recaptioning internally. Much appreciated.