tencent-ailab / IP-Adapter

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.
Apache License 2.0
5.08k stars, 331 forks

Question about training IP-Adapter on LAION-2B #337

Open SadAngelF opened 5 months ago

SadAngelF commented 5 months ago

Hello, thank you for your excellent work.

In the paper, the IP-Adapter is trained on LAION-2B, and it says: "During training, we resize the shortest side of the image to 512 and then center crop the image with 512 × 512 resolution."
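For reference, the resize-then-center-crop step quoted above can be sketched as plain geometry. This is only an illustrative sketch, not the authors' training code: the function names are hypothetical, and the rounding of the longer side is an assumption (image libraries differ on rounding vs. truncation).

```python
def resized_size(width, height, target=512):
    """Scale (width, height) so the shorter side equals `target`."""
    scale = target / min(width, height)
    # Assumption: round to the nearest pixel; max() guards against
    # floating-point error leaving a side one pixel short of `target`.
    new_w = max(target, round(width * scale))
    new_h = max(target, round(height * scale))
    return new_w, new_h


def center_crop_box(width, height, target=512):
    """Return (left, top, right, bottom) of a centered target x target crop."""
    left = (width - target) // 2
    top = (height - target) // 2
    return left, top, left + target, top + target


def preprocess_geometry(width, height, target=512):
    """Return the resized size and the crop box applied to the resized image."""
    new_w, new_h = resized_size(width, height, target)
    return (new_w, new_h), center_crop_box(new_w, new_h, target)


# Example: a 1024 x 768 landscape image.
size, box = preprocess_geometry(1024, 768)
print(size, box)
```

The resized size and crop box could then be passed to whatever image library is in use (for example a resize call followed by a crop call).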

As far as I know, LAION-2B can be downloaded at different image sizes (240TB at 384, 80TB at 224, according to https://laion.ai/blog/laion-5b/).

I am worried about the storage on my machines, so I want to know what image_size you used for the raw LAION-2B data. 384? 224? Or 512, as the paper says? And how much storage space does the whole dataset take at that image_size?

Thank you very much~ Hope for your reply~~

xiaohu2015 commented 5 months ago

Hi, I filtered for images with a short side greater than 1024 for training. I think you can use https://huggingface.co/datasets/laion/laion-high-resolution or https://huggingface.co/datasets/laion/datacomp-hq
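A minimal sketch of that kind of filter, applied to metadata records before downloading anything. The `WIDTH` and `HEIGHT` field names and the sample records are assumptions for illustration; check the column names in the metadata you actually use. The helper names are hypothetical.

```python
def short_side(record):
    """Return the shorter of the two image dimensions."""
    return min(record["WIDTH"], record["HEIGHT"])


def filter_high_res(records, min_short_side=1024):
    """Keep records whose short side is strictly greater than the threshold."""
    return [r for r in records if short_side(r) > min_short_side]


# Hypothetical metadata rows (URLs are placeholders).
records = [
    {"URL": "https://example.com/a.jpg", "WIDTH": 2048, "HEIGHT": 1536},
    {"URL": "https://example.com/b.jpg", "WIDTH": 1024, "HEIGHT": 4000},
    {"URL": "https://example.com/c.jpg", "WIDTH": 640, "HEIGHT": 480},
]

kept = filter_high_res(records)
print([r["URL"] for r in kept])
```

Filtering on metadata first means only the images that pass the size check need to be downloaded and stored.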

SadAngelF commented 5 months ago

Thanks very much for replying. The high-resolution dataset is easy to understand. I didn't know about datacomp-hq; is it a subset of DataComp-1b?

xiaohu2015 commented 5 months ago

yes

SadAngelF commented 5 months ago

Thanks very very much! Let me have a try with that.

SnailForce commented 5 months ago

I want to know how to download the LAION dataset now. Do you have any way?

SadAngelF commented 5 months ago

Maybe from here: https://the-eye.eu/public/AI/cah/laion5b/ or use this subset: https://huggingface.co/datasets/limingcv/LAION_Aesthetics_1024

garychan22 commented 3 weeks ago

Hi, how many training iterations should it take before you see something, e.g. the generated image being somewhat related to the input reference image?