Issue closed by ryancll 5 months ago.
Hi @ryancll,
There might be some misunderstanding: conv_in.bias is also trainable. You can print the weights in the pretrained stable-diffusion-v1-5 and PIA checkpoints for comparison.
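To check this concretely, one can diff the parameter between the two checkpoints. A minimal sketch, where `param_differs` and the toy state dicts are illustrative; in practice the state dicts would come from `torch.load` on the actual stable-diffusion-v1-5 and PIA UNet checkpoints:

```python
import torch

def param_differs(state_a, state_b, key, atol=1e-6):
    """True if the parameter `key` differs between two state dicts."""
    return not torch.allclose(state_a[key], state_b[key], atol=atol)

# Toy stand-ins for the real checkpoints (assumption: real ones are
# loaded from disk, e.g. via torch.load on the UNet weight files).
sd15 = {"conv_in.bias": torch.zeros(4)}
pia = {"conv_in.bias": torch.tensor([0.0, 0.1, 0.0, 0.0])}
print(param_differs(sd15, pia, "conv_in.bias"))  # True -> bias was updated
```

If the printed value is `True`, the bias was indeed updated during PIA training.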
@ymzhang0319 When loading DreamBooth weights, you use the conv_in.bias
from DreamBooth weights instead of the PIA weights, right?
As for the difference in weights between the pretrained stable-diffusion-v1-5 and PIA, this can also be one of the reasons.
@Tianhao-Qi, we introduced our training method in Section 3.3. Following the training strategy of AnimateDiff, we first train a domain adapter on WebVid. As AnimateDiff has not released the weights for the LoRA version of its domain adapter, we directly fine-tune the entire UNet, turning it into a 'domain adapter' for WebVid. Originally posted by @LeoXing1996 in https://github.com/open-mmlab/PIA/issues/32#issuecomment-1877016187
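For reference, "fine-tuning the entire UNet" simply means leaving every parameter trainable during the WebVid stage, rather than inserting LoRA layers. A hedged sketch (the function name and the toy module are mine, not PIA's actual training code):

```python
import torch.nn as nn

def unfreeze_all(model: nn.Module) -> int:
    """Mark every parameter trainable and return the trainable count."""
    n = 0
    for p in model.parameters():
        p.requires_grad = True
        n += p.numel()
    return n

# Toy module standing in for the UNet (assumption: the real code would
# pass the diffusers UNet here).
toy = nn.Linear(4, 2)  # 4*2 weights + 2 biases = 10 parameters
print(unfreeze_all(toy))  # 10
```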
According to your paper and code, you only update conv_in.weight and the temporal layers. Is there any solid reason or ablation experiment showing that keeping conv_in.bias frozen achieves better performance?
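For context, the freezing scheme being asked about can be sketched as selecting conv_in.weight plus the temporal layers while leaving conv_in.bias frozen. The toy module and the name-matching rule below are assumptions for illustration, not PIA's actual code:

```python
import torch.nn as nn

def select_trainable(model: nn.Module):
    """Unfreeze only conv_in.weight and parameters of 'temporal' modules."""
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = ("temporal" in name) or (name == "conv_in.weight")
        if param.requires_grad:
            trainable.append(name)
    return trainable

class ToyUNet(nn.Module):
    """Hypothetical stand-in for the real UNet."""
    def __init__(self):
        super().__init__()
        self.conv_in = nn.Conv2d(4, 8, 3)     # weight trained, bias frozen
        self.temporal_attn = nn.Linear(8, 8)  # temporal layer: fully trained

print(select_trainable(ToyUNet()))
# ['conv_in.weight', 'temporal_attn.weight', 'temporal_attn.bias']
```

Note that conv_in.bias never appears in the trainable list, which is exactly the behavior the question is about.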