llstela opened 2 months ago
I have the same problem. It seems that the problem arises when use_image_cross_attention=True is passed to UNet2DConditionModel.from_pretrained. It is not used in some inference scripts. Can someone clarify whether it is a mandatory setting for training?
I tried to override the UNet2DConditionModel.from_pretrained function. That works, but I'm not sure it's the right way to solve this problem.
May I ask whether you get a mismatch between the output SR image and the input LR image at the beginning of training?
Meanwhile, I specified use_image_cross_attention=True, --pretrained_model_name_or_path, and --unet_model_name_or_path, and I did not hit the error.
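For reference, a minimal sketch of what that override can look like, assuming the repo's UNet2DConditionModel variant exposes use_image_cross_attention in its config. The function and argument names below are placeholders; only the stock SD weights are copied over, and the new image-attention layers stay randomly initialized:

```python
from diffusers import UNet2DConditionModel as BaseUNet

def load_unet_with_image_attn(pretrained_path, custom_unet_cls):
    """Copy stock SD weights into a UNet variant that adds image cross-attention."""
    # Load the stock SD UNet to get its weights and config.
    base = BaseUNet.from_pretrained(pretrained_path, subfolder="unet")
    config = dict(base.config)
    config["use_image_cross_attention"] = True  # repo-specific flag (assumption)

    # Build the custom UNet and load only the overlapping weights;
    # strict=False leaves the new image-attention layers randomly initialized.
    unet = custom_unet_cls.from_config(config)
    missing, unexpected = unet.load_state_dict(base.state_dict(), strict=False)
    print(f"{len(missing)} keys randomly initialized, {len(unexpected)} keys ignored")
    return unet

# Usage (custom class name is whatever this repo defines):
# unet = load_unet_with_image_attn("stabilityai/stable-diffusion-2-base", UNet2DConditionModel)
```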
@llstela @Zebraside
The error occurs when only --pretrained_model_name_or_path is set (the "# resume from pretrained SD" branch). I'm using stable-diffusion-2-base from HuggingFace as the base model. While loading UNet2DConditionModel from the pretrained SD-2 weights, a ValueError appears because the parameter weights related to the image_attentions are missing.
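If this is the standard diffusers error about missing keys, one possible workaround (a sketch, assuming the repo's UNet variant keeps the usual diffusers from_pretrained signature and the import path below is hypothetical) is to pass low_cpu_mem_usage=False and device_map=None so diffusers randomly initializes the missing image-attention weights instead of raising:

```python
# Sketch only: use_image_cross_attention comes from this thread; low_cpu_mem_usage
# and device_map are standard diffusers from_pretrained arguments. With the default
# low_cpu_mem_usage=True, keys missing from the checkpoint raise a ValueError;
# with low_cpu_mem_usage=False they are randomly initialized (with a warning).
from my_project.unet_2d_condition import UNet2DConditionModel  # hypothetical import path

unet = UNet2DConditionModel.from_pretrained(
    "stabilityai/stable-diffusion-2-base",
    subfolder="unet",
    use_image_cross_attention=True,  # repo-specific flag from this thread
    low_cpu_mem_usage=False,         # randomly init the missing image-attention weights
    device_map=None,
)
```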