Text Conditioning Dropout

pesser / stable-diffusion


Text Conditioning Dropout #21

Closed fugokidi closed 1 year ago

fugokidi commented 1 year ago

Thank you for this repo. It includes more training-related code, so I can try training on my own. Could you please point me to where the 10% text-conditioning dropout happens? I'm afraid I would apply dropout twice if I added it myself. Thank you again. LDM is really awesome.

XavierXiao commented 1 year ago

Same question. Did you find it?

fugokidi commented 1 year ago

@XavierXiao I didn't find it. This is probably not the latest code for the released Stable Diffusion. Anyway, we can do it ourselves. If you want to apply text dropout, one way is to filter the captions in the get_input method in ddpm.py. The text conditions are a list of captions; for example, with a batch size of 3: ['caption one', 'caption two', 'caption three']. I think it is easiest to do in NumPy.

import numpy as np

captions = ['caption one', 'caption two', 'caption three']
null_labels = [""] * len(captions)
# Keep each caption with probability 0.9; otherwise replace it with "".
prob = np.random.rand(len(captions))
filtered_captions = np.where(prob > 0.1, captions, null_labels).tolist()

I'm a newbie in diffusion. Sorry, I am not very familiar with the whole codebase, so I might be missing something.
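
The filtering idea above can also be wrapped in a small helper. This is only a minimal sketch of per-caption dropout, not the code used in this repo: the function name drop_captions and its parameters p, null_caption, and rng are hypothetical names chosen for illustration, and the empty string as the null caption is an assumption carried over from the snippet above.

```python
import numpy as np


def drop_captions(captions, p=0.1, null_caption="", rng=None):
    """Replace each caption with `null_caption` independently with probability `p`.

    Hypothetical helper for illustration; not part of the repository.
    """
    if rng is None:
        rng = np.random.default_rng()
    # One uniform sample in [0, 1) per caption; a value below p marks a drop.
    drop = rng.random(len(captions)) < p
    return [null_caption if d else c for c, d in zip(captions, drop)]


captions = ["caption one", "caption two", "caption three"]
# Passing a seeded generator makes the dropout pattern reproducible.
filtered = drop_captions(captions, p=0.1, rng=np.random.default_rng(0))
print(filtered)
```

A plain list comprehension is used instead of np.where so the captions stay ordinary Python strings and are never converted to fixed-width NumPy string arrays. If the training code already drops captions somewhere else, calling a helper like this as well would apply dropout twice, so it should be used in only one place.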

justinpinkney commented 1 year ago

If you're still looking, it's here:

https://github.com/pesser/stable-diffusion/blob/57eea7dfc2cdd8cadae77ab1c391f956d46f69bd/ldm/models/diffusion/ddpm.py#L396-L403

fugokidi commented 1 year ago

@justinpinkney Thank you so much. I'm clear now. Let me close this issue.

XavierXiao commented 1 year ago

Interesting. This block does not exist in the official CompVis release...

fugokidi commented 1 year ago

@XavierXiao I think the official CompVis release is mainly for inference. Pesser is really kind to share the development repo here. Justin (@justinpinkney) also shares all the training details of his image variation fork (https://github.com/justinpinkney/stable-diffusion). I am really grateful to both of them. I can't understand things unless I try them myself, and playing around with this repo helps me understand how Stable Diffusion works.