buxiangzhiren / DDCap

MIT License
83 stars, 11 forks

Can I train this model on Flickr30k? #23

Open vinevix opened 1 year ago

buxiangzhiren commented 1 year ago

You can train the model on any dataset as long as it contains image-text pairs. For Flickr30k, you will probably need to compute the average length of all captions first, and then adjust the number of diffusion steps according to that average length.
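A rough sketch of the first step (measuring the average caption length) could look like the following. It assumes the captions are already loaded as a list of strings and counts whitespace-separated tokens; the helper names `caption_lengths` and `average_caption_length` are hypothetical and not part of DDCap. The model's own tokenizer would likely give different counts, so the measurement should be repeated with it before changing any setting.

```python
from statistics import mean


def caption_lengths(captions):
    """Return the number of whitespace-separated tokens in each caption.

    Assumption: whitespace splitting is only a stand-in for whatever
    tokenizer the captioning model actually uses.
    """
    return [len(caption.split()) for caption in captions]


def average_caption_length(captions):
    """Return the mean token count over all captions."""
    lengths = caption_lengths(captions)
    if not lengths:
        raise ValueError("no captions given")
    return mean(lengths)


# Toy captions for illustration only; replace with the real Flickr30k captions.
captions = [
    "Two dogs run across a grassy field .",
    "A man in a red shirt climbs a rock wall .",
]

avg = average_caption_length(captions)
print(f"average caption length: {avg} tokens")
```

The resulting average is only the input to the second step. How it maps to the number of diffusion steps depends on the training configuration, which this thread does not spell out.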