gligen / GLIGEN

Open-Set Grounded Text-to-Image Generation
MIT License

The sampling question #85

Open Emberss opened 3 weeks ago

Emberss commented 3 weeks ago

In the paper, I see this sampling trick: "we show that using the full model (all layers) in the first half of the sampling steps and only using the original layers (without the gated Transformer layers) in the latter half can lead to generation results that accurately reflect the grounding conditions while also having high image quality." But this trick does not seem to be implemented in the inference code. Could you please explain why?

iLori-Jiang commented 1 week ago

That should correspond to the 'alpha_type = [0.3, 0.0, 0.7]' setting in the inference code.
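
To make the connection to the paper concrete, here is a minimal sketch of how a three-value setting like this could be expanded into one alpha per sampling step. It assumes the three values are fractions of the sampling steps spent with the gated layers fully on (alpha = 1), linearly decaying, and fully off (alpha = 0). That reading is an assumption, and the helper name `alpha_schedule` is hypothetical; it is not taken from the repository.

```python
def alpha_schedule(num_steps, alpha_type):
    """Expand [full, decay, off] fractions into one alpha value per sampling step.

    Assumed meaning (not confirmed by the repository):
      alpha = 1 -> gated layers fully active (full model)
      alpha = 0 -> gated layers disabled (original layers only)
    """
    full_frac, decay_frac, off_frac = alpha_type
    assert abs(full_frac + decay_frac + off_frac - 1.0) < 1e-6, "fractions must sum to 1"

    # round() avoids losing a step to floating-point truncation
    n_full = min(round(full_frac * num_steps), num_steps)
    n_decay = min(round(decay_frac * num_steps), num_steps - n_full)
    n_off = num_steps - n_full - n_decay

    # Linear decay from 1 toward 0 over the middle stage (empty if n_decay == 0)
    decay = [1.0 - i / n_decay for i in range(n_decay)]

    return [1.0] * n_full + decay + [0.0] * n_off


# Example: 50 sampling steps with the setting quoted above
schedule = alpha_schedule(50, [0.3, 0.0, 0.7])
num_full_steps = sum(1 for a in schedule if a == 1.0)
```

Under that assumption, [0.3, 0.0, 0.7] would keep the gated layers on for the first 30% of the steps and switch them off for the remaining 70%, with no decay stage in between. That is the same two-phase idea as in the paper, but with the switch earlier than the halfway point the paper describes.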