pkuliyi2015 / sd-webui-stablesr

StableSR for Stable Diffusion WebUI - Ultra High-quality Image Upscaler
https://iceclear.github.io/projects/stablesr/

Error fixing color in A1111 v1.6.0 #49

Open rafaelfabres opened 1 year ago

rafaelfabres commented 1 year ago

Since the 1.6 update of A1111, color correction no longer works. This error occurs:

[StableSR] Error fixing color with default method: Given groups=3, weight of size [3, 1, 3, 3], expected input[1, 4, 2474, 3722] to have 3 channels, but got 4 channels instead
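
To decode the message: a grouped convolution expects `groups * weight.shape[1]` input channels, so a weight of size `[3, 1, 3, 3]` with `groups=3` wants 3 channels, while the input here has 4. The plain-Python sketch below only redoes that arithmetic with the shapes from the error; `expected_in_channels` is a hypothetical helper for illustration, not part of StableSR or PyTorch.

```python
def expected_in_channels(weight_shape, groups):
    """Input channels a grouped 2D convolution expects.

    weight_shape is (out_channels, in_channels_per_group, kH, kW).
    """
    return weight_shape[1] * groups


# Shapes copied from the error message above.
weight_shape = (3, 1, 3, 3)
input_shape = (1, 4, 2474, 3722)  # (batch, channels, height, width)

expected = expected_in_channels(weight_shape, groups=3)
actual = input_shape[1]
print(f"expected {expected} channels, got {actual}")
# prints: expected 3 channels, got 4
```

So the color fix itself is probably fine; it is the image handed to it that carries one channel too many.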

FusionDraw9257 commented 1 year ago

I rolled back to version 1.5.1: open CMD in the SD folder, run "git checkout tags/v1.5.1", delete the venv folder inside the SD folder, then restart.

WSJUSA commented 1 year ago

I confirm using Automatic1111 ver 1.5.1 resolves the color issue.

I did this with StabilityMatrix, so I did not have to roll back the 1.6 version or flip between git checkouts. You can install both versions side by side under Packages.

A small annoyance is that you have to copy the StableSR model webui_768v_139.ckpt into the extension folder in both copies of the SR plugin.

WSJUSA commented 11 months ago

init_img, which serves as the color-correction reference for colorfix, is now being passed to colorfix with 4 channels. This appears to be due to a change in how p (StableDiffusionProcessingImg2Img) prepares the source image.

A possible fix is to check every tensor coming into colorfix and reduce it to 3 channels.

Keep in mind that I have no idea what I am really doing, but this seems to work. I added the following to the top of colorfix.py:

def channel_four_to_three(image: Tensor):
    # if tensor has 4 channels reduce to 3
    if image.shape[1] > 3:
        image = image[:, :3, :, :]
    return image

def adain_color_fix(target: Image, source: Image):
    # Convert images to tensors
    to_tensor = ToTensor()
    target_tensor = channel_four_to_three(to_tensor(target).unsqueeze(0))
    source_tensor = channel_four_to_three(to_tensor(source).unsqueeze(0))

    # Apply adaptive instance normalization
    result_tensor = adaptive_instance_normalization(target_tensor, source_tensor)

    # Convert tensor back to image
    to_image = ToPILImage()
    result_image = to_image(result_tensor.squeeze(0).clamp_(0.0, 1.0))

    return result_image

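def wavelet_color_fix(target: Image, source: Image):
    # Convert images to tensors
    to_tensor = ToTensor()
    target_tensor = channel_four_to_three(to_tensor(target).unsqueeze(0))
    source_tensor = channel_four_to_three(to_tensor(source).unsqueeze(0))

    # Apply wavelet reconstruction
    result_tensor = wavelet_reconstruction(target_tensor, source_tensor)

    # Convert tensor back to image
    to_image = ToPILImage()
    result_image = to_image(result_tensor.squeeze(0).clamp_(0.0, 1.0))

    return result_image
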
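
For context, `adaptive_instance_normalization` (called above but defined further down in colorfix.py) shifts and scales each channel of the target so that its mean and standard deviation match the source. Here is a minimal single-channel sketch in plain Python. `match_mean_std` is a hypothetical name, and flat lists stand in for tensors, so this only illustrates the idea and is not the extension's code.

```python
from math import sqrt
from statistics import fmean, variance


def match_mean_std(content, style, eps=1e-5):
    """Rescale one channel of `content` to the mean/std of `style`."""
    c_mean, s_mean = fmean(content), fmean(style)
    # eps keeps the division safe when a channel is completely flat
    c_std = sqrt(variance(content) + eps)
    s_std = sqrt(variance(style) + eps)
    return [(x - c_mean) / c_std * s_std + s_mean for x in content]


upscaled_channel = [0.1, 0.2, 0.3, 0.4]   # "target": detail we want to keep
original_channel = [0.5, 0.6, 0.9, 1.0]   # "source": colors we want to match
fixed_channel = match_mean_std(upscaled_channel, original_channel)
```

Because the statistics are computed per channel, a stray fourth channel on either image gives the two tensors different shapes, which is why trimming both inputs to 3 channels first matters.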
snoopytl commented 8 months ago

Building on the code above, I was still getting an error, so I modified it further. With these changes it works correctly on both 1.6 and 1.7. The code is below; it can directly replace the contents of colorfix.py:

import torch
from PIL import Image
from torch import Tensor
from torch.nn import functional as F

from torchvision.transforms import ToTensor, ToPILImage

def channel_four_to_three(image: Tensor):
    # if tensor has 4 channels reduce to 3
    if image.shape[1] > 3:
        image = image[:, :3, :, :]
    return image

def adain_color_fix(target: Image, source: Image):
    # Convert images to tensors
    to_tensor = ToTensor()
    target_tensor = channel_four_to_three(to_tensor(target).unsqueeze(0))
    source_tensor = channel_four_to_three(to_tensor(source).unsqueeze(0))

    # Apply adaptive instance normalization
    result_tensor = adaptive_instance_normalization(target_tensor, source_tensor)

    # Convert tensor back to image
    to_image = ToPILImage()
    result_image = to_image(result_tensor.squeeze(0).clamp_(0.0, 1.0))

    return result_image

def wavelet_color_fix(target: Image, source: Image):
    # Convert images to tensors
    to_tensor = ToTensor()
    target_tensor = channel_four_to_three(to_tensor(target).unsqueeze(0))
    source_tensor = channel_four_to_three(to_tensor(source).unsqueeze(0))

    # --- added ---
    # Apply wavelet reconstruction
    result_tensor = wavelet_reconstruction(target_tensor, source_tensor)

    # Convert tensor back to image
    to_image = ToPILImage()
    result_image = to_image(result_tensor.squeeze(0).clamp_(0.0, 1.0))

    return result_image
    # --- end of added code ---

# Original versions, left commented out:
#
# def adain_color_fix(target: Image, source: Image):
#     # Convert images to tensors
#     to_tensor = ToTensor()
#     target_tensor = to_tensor(target).unsqueeze(0)
#     source_tensor = to_tensor(source).unsqueeze(0)
#
#     # Apply adaptive instance normalization
#     result_tensor = adaptive_instance_normalization(target_tensor, source_tensor)
#
#     # Convert tensor back to image
#     to_image = ToPILImage()
#     result_image = to_image(result_tensor.squeeze(0).clamp_(0.0, 1.0))
#
#     return result_image
#
# def wavelet_color_fix(target: Image, source: Image):
#     # Convert images to tensors
#     to_tensor = ToTensor()
#     target_tensor = to_tensor(target).unsqueeze(0)
#     source_tensor = to_tensor(source).unsqueeze(0)
#
#     # Apply wavelet reconstruction
#     result_tensor = wavelet_reconstruction(target_tensor, source_tensor)
#
#     # Convert tensor back to image
#     to_image = ToPILImage()
#     result_image = to_image(result_tensor.squeeze(0).clamp_(0.0, 1.0))
#
#     return result_image

def calc_mean_std(feat: Tensor, eps=1e-5):
    """Calculate mean and std for adaptive_instance_normalization.
    Args:
        feat (Tensor): 4D tensor.
        eps (float): A small value added to the variance to avoid
            divide-by-zero. Default: 1e-5.
    """
    size = feat.size()
    assert len(size) == 4, 'The input feature should be 4D tensor.'
    b, c = size[:2]
    feat_var = feat.view(b, c, -1).var(dim=2) + eps
    feat_std = feat_var.sqrt().view(b, c, 1, 1)
    feat_mean = feat.view(b, c, -1).mean(dim=2).view(b, c, 1, 1)
    return feat_mean, feat_std

def adaptive_instance_normalization(content_feat: Tensor, style_feat: Tensor):
    """Adaptive instance normalization.
    Adjust the reference features to have the similar color and illuminations
    as those in the degradate features.
    Args:
        content_feat (Tensor): The reference feature.
        style_feat (Tensor): The degradate features.
    """
    size = content_feat.size()
    style_mean, style_std = calc_mean_std(style_feat)
    content_mean, content_std = calc_mean_std(content_feat)
    normalized_feat = (content_feat - content_mean.expand(size)) / content_std.expand(size)
    return normalized_feat * style_std.expand(size) + style_mean.expand(size)

def wavelet_blur(image: Tensor, radius: int):
    """
    Apply wavelet blur to the input tensor.
    """
    # input shape: (1, 3, H, W)
    # convolution kernel
    kernel_vals = [
        [0.0625, 0.125, 0.0625],
        [0.125, 0.25, 0.125],
        [0.0625, 0.125, 0.0625],
    ]
    kernel = torch.tensor(kernel_vals, dtype=image.dtype, device=image.device)
    # add channel dimensions to the kernel to make it a 4D tensor
    kernel = kernel[None, None]
    # repeat the kernel across all input channels
    kernel = kernel.repeat(3, 1, 1, 1)
    image = F.pad(image, (radius, radius, radius, radius), mode='replicate')
    # apply convolution
    output = F.conv2d(image, kernel, groups=3, dilation=radius)
    return output

def wavelet_decomposition(image: Tensor, levels=5):
    """
    Apply wavelet decomposition to the input tensor.
    This function only returns the low frequency & the high frequency.
    """
    high_freq = torch.zeros_like(image)
    for i in range(levels):
        radius = 2 ** i
        low_freq = wavelet_blur(image, radius)
        high_freq += (image - low_freq)
        image = low_freq

    return high_freq, low_freq

def wavelet_reconstruction(content_feat: Tensor, style_feat: Tensor):
    """
    Apply wavelet decomposition, so that the content will have the same color as the style.
    """
    # calculate the wavelet decomposition of the content feature
    content_high_freq, content_low_freq = wavelet_decomposition(content_feat)
    del content_low_freq
    # calculate the wavelet decomposition of the style feature
    style_high_freq, style_low_freq = wavelet_decomposition(style_feat)
    del style_high_freq
    # combine the content's high frequency with the style's low frequency
    return content_high_freq + style_low_freq
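
A note on why the wavelet path preserves detail: `wavelet_decomposition` accumulates `image - low_freq` at each level, so the returned high and low parts always sum back to the input. Replacing only the low part with the source's low part therefore changes coarse color while keeping fine detail. Below is a 1D plain-Python analogue as a rough sketch. `blur_1d` and `decompose_1d` are hypothetical names, and the `[0.25, 0.5, 0.25]` taps are the 1D factor of the 3x3 kernel above.

```python
def blur_1d(signal, radius):
    """Dilated [0.25, 0.5, 0.25] blur with replicate (edge-clamped) padding."""
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[max(i - radius, 0)]
        right = signal[min(i + radius, n - 1)]
        out.append(0.25 * left + 0.5 * signal[i] + 0.25 * right)
    return out


def decompose_1d(signal, levels=5):
    """Return (high, low): summed detail layers and the final blurred signal."""
    high = [0.0] * len(signal)
    current = list(signal)
    for level in range(levels):
        low = blur_1d(current, 2 ** level)
        high = [h + (c - l) for h, c, l in zip(high, current, low)]
        current = low
    return high, current


signal = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]
high, low = decompose_1d(signal)
rebuilt = [h + l for h, l in zip(high, low)]
```

Here `rebuilt` matches `signal` up to floating-point error, which is the property `wavelet_reconstruction` relies on when it returns the content's high frequency plus the style's low frequency.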

skywalker0113 commented 6 months ago

Personally tested and it works, though the formatting is a mess.

yw-2020 commented 5 months ago

What do you mean? What is messy about the formatting? Isn't it just a matter of replacing the modified functions?

sipie800 commented 1 month ago

+1