dvlab-research / MGM

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
Apache License 2.0

Could you release the code used to generate the generation_pure_text data? #109

Closed: pennypengpm closed this issue 4 months ago

pennypengpm commented 4 months ago

Could you release the code used to generate the generation_pure_text data? Thanks.

berry-ding commented 4 months ago

+1

JulianJuaner commented 4 months ago

Generating the data involves handling several data sources and formats (for example, filtering for English data and keeping the format consistent), so the overall process is quite cumbersome. However, we have already provided the GPT4 prompt we used in Figure 7 of the paper's Appendix. Here is some related information for your reference:

SD Prompts (example captions): https://www.gigasheet.com/sample-data/stable-diffusion-prompts

Some example queries:

[
"Show a serene lakeside at dawn.",
"Depict an astronaut with Earth in the background.",
"Generate a medieval market.",
"Generate a neon-lit futuristic city at night.",
"Portray a lone oak tree in autumn.",
"Create an image of a dragon on a mountain.",
"Generate a 1920s jazz club scene.",
"Visualize a quiet library with old books.",
"Present a meadow with wildflowers and a rainbow.",
"Draw a snowy Christmas village with playing children.",
"Show a Japanese garden in cherry blossom season.",
"Depict a lighthouse in a stormy sea.",
...
]
(You can generate more by using GPT4.)

Prompt (Appendix Fig. 7):

[Screenshot of the prompt, 2024-05-11]
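
For readers who want to reproduce the preprocessing mentioned above (filtering English data and keeping the format consistent), here is a minimal sketch of one way it could be done. This is not the code used for MGM. The ASCII-ratio heuristic, the 0.9 threshold, the function names, and the `source`/`caption` record layout are all assumptions made for illustration.

```python
def is_mostly_english(text: str, threshold: float = 0.9) -> bool:
    """Heuristic (assumption): text is English if most of its letters are ASCII."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    ascii_letters = sum(1 for ch in letters if ch.isascii())
    return ascii_letters / len(letters) >= threshold


def normalize_caption(text: str) -> str:
    """Collapse runs of whitespace so every caption shares one format."""
    return " ".join(text.split())


def build_records(captions, source):
    """Normalize, drop non-English and duplicate captions, and tag each with its source.

    The record layout here is hypothetical, not the schema of the released data.
    """
    records = []
    seen = set()
    for raw in captions:
        caption = normalize_caption(raw)
        key = caption.lower()
        if not caption or key in seen:
            continue
        if not is_mostly_english(caption):
            continue
        seen.add(key)
        records.append({"source": source, "caption": caption})
    return records


if __name__ == "__main__":
    sample = ["  a serene   lake at dawn ", "宁静的湖边黎明", "A serene lake at dawn", ""]
    print(build_records(sample, "sd_prompts"))
```

The filtered captions could then be combined with the example queries and the Fig. 7 prompt when calling GPT4. A language-identification library would likely be more reliable than the ASCII ratio for mixed-language sources.
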
JulianJuaner commented 4 months ago

Our generation-related data: https://huggingface.co/datasets/YanweiLi/MGM-Instruction/blob/main/mgm_generation_pure_text.json
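
To look at the released file before writing your own generation script, a small sketch like the following may help. It makes no assumption about the JSON schema and only reports what it finds. The helper names are hypothetical. The download step assumes the `huggingface_hub` package is installed and uses its `hf_hub_download` function with the repository and file name from the link above.

```python
import json
from collections import Counter


def summarize_records(data):
    """Report container type, size, and key frequencies without assuming a schema."""
    summary = {"type": type(data).__name__, "size": len(data)}
    if isinstance(data, list):
        keys = Counter()
        for item in data:
            if isinstance(item, dict):
                keys.update(item.keys())
        summary["keys"] = dict(keys)
    return summary


def load_json(path):
    """Load a local JSON file as UTF-8."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def download_generation_data():
    """Fetch the file from the Hugging Face Hub (requires network access)."""
    from huggingface_hub import hf_hub_download  # assumed to be installed

    return hf_hub_download(
        repo_id="YanweiLi/MGM-Instruction",
        filename="mgm_generation_pure_text.json",
        repo_type="dataset",
    )


if __name__ == "__main__":
    path = download_generation_data()
    print(summarize_records(load_json(path)))
```

Checking which keys actually appear in the records is a reasonable first step before trying to match the format with newly generated samples.
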

pennypengpm commented 4 months ago

Thanks