apapiu / transformer_latent_diffusion

Text to Image Latent Diffusion using a Transformer core
MIT License

Data source? #20

Open cloneofsimo opened 4 months ago

cloneofsimo commented 4 months ago

Hi there! I'm trying to make minRF, and there was a pointer here. I was wondering what dataset you used for this. Thanks!

apapiu commented 4 months ago

Hey! Copying my answer from a previous issue: "The data - this is a big one - the full GRIT data might contain a lot of low-quality images and/or prompts. Most of the data I used was either synthetic or filtered by CLIP aesthetic score. Try mj_latents.npy and mj_text_emb.npy from https://huggingface.co/apapiu/small_ldt/tree/main - this is higher-quality synthetic data, about 600k examples if I remember correctly." Alternatively, you can use the data processing code in this repo to download any dataset with image/caption pairs from Hugging Face.
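
For anyone wanting to try those files, here is a minimal sketch of how they could be pulled down and inspected. It assumes they are plain NumPy `.npy` arrays hosted in the `apapiu/small_ldt` repo as linked above; the comments about their contents reflect the description in this thread, not verified shapes.

```python
# Minimal sketch: download the precomputed latents / text embeddings and inspect them.
# Assumes the files are standard NumPy arrays; adjust as needed for your setup.
import numpy as np
from huggingface_hub import hf_hub_download

repo_id = "apapiu/small_ldt"

latents_path = hf_hub_download(repo_id=repo_id, filename="mj_latents.npy")
text_emb_path = hf_hub_download(repo_id=repo_id, filename="mj_text_emb.npy")

latents = np.load(latents_path)    # image latents (per the thread, synthetic MJ-style data)
text_emb = np.load(text_emb_path)  # corresponding text embeddings for the captions

print(latents.shape, text_emb.shape)
```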