gligen / GLIGEN

Open-Set Grounded Text-to-Image Generation
MIT License

Coco based training weights. #41

Open saksham-s opened 1 year ago

saksham-s commented 1 year ago

Thanks for sharing your work, which is very helpful and interesting. Could you share the COCO-trained weights, i.e. the ones trained without any of the large-scale training on the bigger datasets? The paper discusses the COCO-trained results, but I could not find the weights in the GitHub repository.

Naidala commented 1 year ago

Did you receive any reply?

cats-food commented 1 year ago

@saksham-s Hi, have you trained the model yourself? I am trying to reproduce the text-box grounded generation on COCO2014, but even after training for 200k iters, the results still do not follow the bbox. I do not know what is going wrong. Is it because the COCO2014 dataset is not big enough?

maluyazilation commented 11 months ago

@cats-food My model successfully follows the bbox, but the FID score does not drop to 5.8 as the paper reports. Are you still following this work?

cats-food commented 11 months ago

@maluyazilation Thanks for the reply! What batch size did you use, and after how many iters on COCO2014 did your model start to follow the bbox?

I have not been following the work recently, since in my experiments the results do not even follow the bbox, and I still do not know what is going on.

maluyazilation commented 11 months ago

@cats-food Your reply is really quick, thank you. I don't know if my settings are suitable. For the coco2014cd-ldm setting with batch size 64 (bz64), bbox following starts at around 30,000 to 40,000 iters. I also trained Stable Diffusion on the Flickr dataset with batch size 32 (bz32); it seems to take longer, about 160,000 to 180,000 iters.

maluyazilation commented 11 months ago

@cats-food If you are interested in reproducing this paper in the future, I would be glad to communicate more by email, WeChat, or something else, thanks~ For the COCO setting, I think there are still many details omitted in the paper, and I want to find them out. : ) email: xiaobo123@stu.xjtu.edu.cn

cats-food commented 11 months ago

Thanks for sharing your settings. I think my problem is that my batch size is too small: I only used 4.

Anyway, for now I am not planning to reproduce the paper, but I am happy to discuss more.
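
For anyone comparing the runs mentioned above, here is a rough sketch of the arithmetic, using only the figures quoted in this thread (batch size 64 at about 30,000 iters versus batch size 4 at 200k iters). The helper names `samples_seen` and `accumulation_steps` are made up for illustration and are not part of the GLIGEN code. Whether the total sample count or the per-step batch size is what matters for bbox following is not settled here, so treat this as a sanity check, not a diagnosis.

```python
def samples_seen(batch_size: int, iters: int) -> int:
    """Total training samples processed, counting repeats across epochs."""
    return batch_size * iters


def accumulation_steps(target_batch: int, micro_batch: int) -> int:
    """Micro-batches to accumulate per optimizer step to emulate target_batch."""
    if target_batch % micro_batch != 0:
        raise ValueError("target batch must be a multiple of the micro-batch")
    return target_batch // micro_batch


# Figures quoted in this thread (not verified against the paper).
reported = samples_seen(batch_size=64, iters=30_000)   # run where bbox following started
small_bz = samples_seen(batch_size=4, iters=200_000)   # run that did not follow the bbox

print(reported)   # 1920000
print(small_bz)   # 800000
print(accumulation_steps(target_batch=64, micro_batch=4))  # 16
```

So the batch-size-4 run saw fewer samples in 200k iters than the batch-size-64 run had seen by 30,000 iters. If GPU memory is the limit, accumulating gradients over 16 micro-batches of 4 before each optimizer step is one common way to approximate a batch of 64, assuming the training loop is modified to support it.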

maluyazilation commented 11 months ago

@cats-food Ok, that's fine. Best wishes.

Hui-88 commented 1 month ago

@cats-food Hello! Have you successfully reproduced it now? I am facing the same problem: my batch size is 2, I have trained for 200,000 iters, and the results do not even follow the bbox. Is it because the batch size is too small?