Closed fugokidi closed 1 year ago
Same question. Did you find it?
@XavierXiao I didn't find it. It's probably not the latest code for the released Stable Diffusion. Anyway, we can do it on our own. If you want text dropout, one way is to filter it in the get_input
method in ddpm.py
. The text condition is a list of captions; for a batch size of 3, for example, ['caption one', 'caption two', 'caption three']
. I think it is easier to do in numpy.
import numpy as np

captions = ['caption one', 'caption two', 'caption three']
null_labels = [""] * len(captions)
# keep each caption with probability 0.9, replace with "" otherwise
prob = np.random.rand(len(captions))
filtered_captions = np.where(prob > 0.1, captions, null_labels).tolist()
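The same idea can also be written without mixing numpy into the caption handling. Here is a minimal sketch using only the standard library; the function name drop_captions and the rng parameter are my own, not from the codebase:

```python
import random

def drop_captions(captions, drop_prob=0.1, rng=random):
    # Replace each caption with the empty string with probability drop_prob.
    # Dropping the text condition like this is the usual trick for training
    # with classifier-free guidance.
    return ["" if rng.random() < drop_prob else c for c in captions]

captions = ['caption one', 'caption two', 'caption three']
filtered = drop_captions(captions, drop_prob=0.1)
```

A helper like this could be called from get_input on the caption batch before the text encoder sees it, so the dropout rate lives in one place.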
I'm a newbie to diffusion models and I'm not very familiar with the whole codebase, so I might have missed something.
If you're still looking it's here:
@justinpinkney Thank you so much. I'm clear now. Let me close this issue.
Interesting. This block does not exist in the official CompVis release...
@XavierXiao I think the official CompVis release is mainly for inference. Pesser was really kind to share the development repo here, and Justin (@justinpinkney) shares all the training details in his image-variation fork (https://github.com/justinpinkney/stable-diffusion). I am really grateful to both of them. I can't understand things unless I try them, so playing around with this repo helps me better understand how Stable Diffusion works.
Thank you for this repo. It has more training-related stuff, so I can try it on my own. Can you please point me to where the 10% text-conditioning dropout happens? I'm afraid I'll drop it out twice if I also do it on my own. Thank you again. LDM is really awesome.