Quick Question: what is the actual training dataset volume?

TencentARC / PhotoMaker

PhotoMaker

https://photo-maker.github.io/

Quick Question: what is the actual training dataset volume? #81

Open bsun0802 opened 5 months ago

bsun0802 commented 5 months ago

Hi Author,

Thanks for the great work. I've been curious about the "ID-Oriented Human Data Construction" described in the paper: after ID verification and the other filtering steps, what is the final training dataset volume?

I'd like to know roughly how much training data is needed to give a diffusion model ID preservation, e.g., 10K, 100K, or 1M images. I would love to see an ablation showing how the model improves as the amount of training data grows by orders of magnitude.

Thanks!

Paper99 commented 5 months ago

Hi, for the model reported in the paper, we trained on 110K filtered images. We have not explored in depth how changing the scale of the training dataset affects the results.