First, thank you for the great implementation!

https://github.com/yuanzhi-zhu/prolific_dreamer2d/blob/main/model_utils.py#L242

Could you explain why the LoRA scale "0" depends on lora_v?
Sorry for the confusion. When setting lora_v == True, we use two separate models for the $\theta$ model and the $\phi$ model. The unet_cross_attention_kwargs argument is only meant to be used when a single model serves as both the $\theta$ and $\phi$ models (see comment: https://github.com/yuanzhi-zhu/prolific_dreamer2d/blob/main/model_utils.py#L245).
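For context, here is a minimal sketch of the two-scale trick the linked comment describes, assuming a diffusers UNet2DConditionModel that already has LoRA attention processors attached (the attachment step is omitted; the model ID and tensor shapes are illustrative, not the repo's exact code). Passing cross_attention_kwargs={"scale": 0.0} zeroes the LoRA contribution so the shared network behaves as the pretrained $\theta$ model, while scale 1.0 gives the LoRA-fine-tuned $\phi$ model:

```python
import torch
from diffusers import UNet2DConditionModel

# One UNet plays both roles when LoRA layers are attached to it.
unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
)
# ... LoRA attention processors are assumed to be attached here ...

noisy_latents = torch.randn(1, 4, 64, 64)   # SD 1.5 latent for a 512px image
timestep = torch.tensor([500])
text_emb = torch.randn(1, 77, 768)          # placeholder text embedding

with torch.no_grad():
    # scale 0.0 -> LoRA contribution zeroed: pretrained theta prediction
    eps_theta = unet(
        noisy_latents, timestep, encoder_hidden_states=text_emb,
        cross_attention_kwargs={"scale": 0.0},
    ).sample

# scale 1.0 -> full LoRA contribution: fine-tuned phi prediction
eps_phi = unet(
    noisy_latents, timestep, encoder_hidden_states=text_emb,
    cross_attention_kwargs={"scale": 1.0},
).sample
```

The point of the reply is that this scale switch only makes sense when one UNet carries both roles; with lora_v == True the repo keeps two separate models, so unet_cross_attention_kwargs is not used on that path.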