FrozenBurning / Text2Light

[SIGGRAPH Asia 2022] Text2Light: Zero-Shot Text-Driven HDR Panorama Generation
https://frozenburning.github.io/projects/text2light/

About the contrastive loss #10

Closed ikuinen closed 1 year ago

ikuinen commented 1 year ago

It seems that the contrastive loss produces no gradient for the network. "gen_img_emb" is computed by the frozen CLIP model inside torch.no_grad(), while "psed_emb" is pre-computed in the dataloader:

with torch.no_grad():
    x_sample_nopix = self.decode_to_img(index_sample, [index_sample.shape[0], 256, 8, 16]) #hack
    preprocess = _transform(224)
    gen_img_emb = self.clip.encode_image(preprocess(x_sample_nopix))
    gen_img_emb /= gen_img_emb.norm(dim=-1, keepdim=True)

    psed_emb = batch['psed_emb']
    sim = torch.cosine_similarity(gen_img_emb.unsqueeze(1), psed_emb.unsqueeze(0), dim=-1)

Looking forward to your reply.
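The concern above can be reproduced in isolation. The following is a minimal sketch, not the repository's code: `generator` and `frozen_encoder` are hypothetical stand-ins for the trainable network and the frozen CLIP image encoder, and `target_emb` plays the role of `psed_emb`. It suggests that the frozen encoder alone does not block gradients, whereas wrapping the forward pass in `torch.no_grad()` does.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-ins (assumptions, not names from the repository).
generator = torch.nn.Linear(4, 8)        # trainable network
frozen_encoder = torch.nn.Linear(8, 3)   # plays the frozen CLIP image encoder
for p in frozen_encoder.parameters():
    p.requires_grad_(False)

z = torch.randn(2, 4)
# Plays the pre-computed psed_emb from the dataloader.
target_emb = torch.nn.functional.normalize(torch.randn(2, 3), dim=-1)


def embed(inputs):
    """Encode generator outputs and L2-normalize the embeddings."""
    emb = frozen_encoder(generator(inputs))
    return emb / emb.norm(dim=-1, keepdim=True)


# Case 1: forward pass under no_grad. The result has no autograd history,
# so a loss built from it cannot update the generator.
with torch.no_grad():
    emb_no_grad = embed(z)
sim_no_grad = torch.cosine_similarity(
    emb_no_grad.unsqueeze(1), target_emb.unsqueeze(0), dim=-1
)
blocked = sim_no_grad.requires_grad  # False

# Case 2: same computation without no_grad. The encoder weights stay frozen,
# but gradients still pass through it to the generator.
emb = embed(z)
sim = torch.cosine_similarity(emb.unsqueeze(1), target_emb.unsqueeze(0), dim=-1)
loss = -sim.diagonal().mean()
loss.backward()
has_grad = generator.weight.grad is not None
```

In this sketch, freezing the encoder's parameters only stops the encoder from being updated; it is the `no_grad` context that detaches the similarity from the generator.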

FrozenBurning commented 1 year ago

Thanks for your interest in our work! Indeed, the #hack line should be removed so that the gradient can flow back to the transformer, since we aim to update the global sampler. I've updated the code: https://github.com/FrozenBurning/Text2Light/blob/af8ec40412777c13d7f1739da0d9ca1de00bcc1f/taming/models/global_sampler.py#L167

Thanks for your feedback! 🍻

ikuinen commented 1 year ago

Thanks for your reply! But the transformer is an autoregressive model that generates indices step by step, and the generated indices will stop the gradient from reaching the transformer. Does this loss term actually allow the gradient to flow back to the transformer?

Looking forward to your reply.
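The point about generated indices can be illustrated with a small sketch. Everything here is an assumption for illustration: `logits` stands in for the transformer's output over a 5-entry codebook, and `codebook` for a frozen embedding table. Hard sampling returns integer indices with no autograd history, so nothing downstream of them can reach the logits. The second half shows one common workaround, a straight-through Gumbel-softmax, which is not confirmed to be what the repository uses.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-ins (assumptions, not names from the repository).
logits = torch.randn(2, 5, requires_grad=True)  # plays the transformer output
codebook = torch.randn(5, 3)                    # plays a frozen codebook

# Hard sampling: multinomial yields integer indices, which are not
# differentiable, so the looked-up vectors are cut off from `logits`.
probs = logits.softmax(dim=-1)
idx = torch.multinomial(probs, num_samples=1).squeeze(-1)
hard_out = codebook[idx]
hard_requires_grad = hard_out.requires_grad  # False

# One common workaround (an assumption here): straight-through
# Gumbel-softmax. The forward pass uses a one-hot sample, while the
# backward pass uses the gradient of the soft relaxation.
one_hot = torch.nn.functional.gumbel_softmax(logits, tau=1.0, hard=True)
soft_out = one_hot @ codebook
soft_out.sum().backward()
soft_has_grad = logits.grad is not None
```

Other estimators, such as score-function (REINFORCE-style) gradients, are also used for this kind of problem; the sketch only shows why plain index sampling blocks backpropagation.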

FrozenBurning commented 1 year ago

I see your point. I've dug into our original implementation and updated the code accordingly.

Thanks for your feedback! 🍻

FrozenBurning commented 1 year ago

Closed due to inactivity. Feel free to reopen it 🙌