yupeng1111 closed this issue 1 year ago
Besides, I calculated the mean (0.0327) and std (0.9966) of `learnable_vector`,
and it looks more like a random Gaussian sample, so I doubt whether its value changed at all during training.
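The check above can be sketched with a quick mean/std diagnostic. This is a minimal stand-alone sketch: plain Python floats stand in for the tensor, and the name `learnable_vector` is only illustrative (in the repo you would use something like `model.learnable_vector.detach().flatten().tolist()`).

```python
import random
import statistics

random.seed(0)
# Hypothetical stand-in for the learnable vector's entries; in the
# actual repo these would come from the trained checkpoint.
learnable_vector = [random.gauss(0.0, 1.0) for _ in range(768)]

mean = statistics.fmean(learnable_vector)
std = statistics.pstdev(learnable_vector)

# A mean near 0 and std near 1 suggest the vector still looks like
# its N(0, 1) initialization, i.e. it barely moved during training.
print(f"mean={mean:.4f} std={std:.4f}")
```

Note that matching N(0, 1) statistics alone is only suggestive; comparing against the saved initial values is the conclusive test.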
We appreciate your interest in our work. I apologize for accidentally introducing this bug during the code cleanup; it has now been fixed. As for the learnable vector resembling Gaussian noise, I believe it is caused by the two factors below. 1) It is initialized from a Gaussian distribution. 2) Because the learning rate is small, it does not deviate far from its initial value.
We have also retrained the network and tracked the value and gradient of the learnable vector during training. The gradient of the learnable vector turns out to be quite small.
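The two factors above (Gaussian init plus a small learning rate acting on small gradients) can be illustrated with a toy simulation. This is not the repo's training loop; it is a sketch using plain gradient descent, with made-up gradient magnitudes, to show how little a vector drifts from its initialization under these conditions.

```python
import random

random.seed(1)
lr = 1e-5  # small learning rate, as in the discussion

# Toy "learnable vector" initialized from N(0, 1).
vec = [random.gauss(0.0, 1.0) for _ in range(16)]
init = list(vec)

for step in range(1000):
    # Hypothetical tiny gradients (the reply reports the real ones
    # are also quite small).
    grads = [random.gauss(0.0, 0.01) for _ in vec]
    vec = [v - lr * g for v, g in zip(vec, grads)]

# Maximum per-coordinate deviation from the initial value.
drift = max(abs(v - v0) for v, v0 in zip(vec, init))
print(f"max drift after 1000 steps: {drift:.6f}")
```

With these magnitudes the drift stays orders of magnitude below the init scale, so post-training statistics remain indistinguishable from N(0, 1).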
Thanks!
I also have a similar question, about the parameters of `to_k` and `to_q` in the cross-attention module. I trained the network for many iterations, but the parameters of these two parts did not change at all. My guess is that this is caused by only a one-dimensional class vector being selected as `cond`. https://github.com/Fantasy-Studio/Paint-by-Example/issues/ Have you retrained the code, and did you encounter the same problem? Looking forward to hearing from you!
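One way to verify the claim above, rather than eyeballing weights, is to snapshot the parameters before training and diff them after a few steps. This is a minimal sketch with plain lists standing in for tensors; the parameter names and the zero-gradient assumption for `to_k` are hypothetical, purely to show the diffing technique.

```python
import copy

# Hypothetical parameter dict; in the repo you would snapshot
# {name: p.detach().clone() for name, p in model.named_parameters()}.
params = {"to_k.weight": [0.1, -0.2], "to_q.weight": [0.3, 0.0]}
before = copy.deepcopy(params)

def train_step(p):
    # Assumed scenario for illustration: to_k receives a zero
    # gradient (e.g. the conditioning collapses to a constant),
    # while to_q receives a nonzero one.
    p["to_q.weight"] = [w - 1e-2 for w in p["to_q.weight"]]

train_step(params)

# Report exactly which parameters actually moved.
changed = [name for name in params if params[name] != before[name]]
print(changed)  # → ['to_q.weight']
```

Checking the stored gradients directly after a backward pass (they should be nonzero if the parameter is trainable and reached by the loss) gives the same answer one step earlier.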
Nice work! I'm wondering how you optimize the unconditional vector in https://github.com/Fantasy-Studio/Paint-by-Example/blob/c435d8a8d7014d58b5bb1f2af69acd04acb01969/ldm/models/diffusion/ddpm.py#L1437
I see the intention that the unconditional input `self.learnable_vector` is optimized via `self.opt.params = self.params_with_white` in line 927, but how does it work? I haven't found any documentation for a parameter `params` in the class `torch.optim.AdamW`.
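For context, `params` is not specific to `AdamW`: it is the first positional argument of every `torch.optim.Optimizer`, and it accepts an iterable of parameters or an iterable of dicts defining parameter groups. Appending the extra tensor to that iterable is what makes the optimizer update it alongside the model weights. Here is a torch-free sketch of that interface; `ToyOptimizer` and the string placeholders are inventions for illustration, not real PyTorch API.

```python
class ToyOptimizer:
    """Mimics how torch.optim.Optimizer normalizes its `params` arg."""

    def __init__(self, params, lr):
        params = list(params)
        # A flat iterable of parameters becomes a single param group;
        # an iterable of dicts is taken as explicit param groups.
        if not isinstance(params[0], dict):
            params = [{"params": params}]
        for group in params:
            group.setdefault("lr", lr)  # fill in per-group defaults
        self.param_groups = params

# Strings stand in for tensors here.
unet_params = ["unet.w1", "unet.w2"]
learnable_vector = "learnable_vector"

# Mirrors the idea in the repo: hand the optimizer the model
# parameters plus the extra learnable tensor in one list.
opt = ToyOptimizer(unet_params + [learnable_vector], lr=1e-5)
print(opt.param_groups[0]["params"])  # → ['unet.w1', 'unet.w2', 'learnable_vector']
```

So the documentation to look for is on the base class `torch.optim.Optimizer` (and the "per-parameter options" section of the torch.optim docs), not on `AdamW` itself.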