sczhou / CodeFormer

[NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer

Output of Codeformer is a black image #222

Open ChrisDeantana opened 1 year ago

ChrisDeantana commented 1 year ago

Hi everyone, I've encountered a problem when trying the Google Colab version of the CodeFormer demo. It works fine for some images, but I've found that for some images the output is entirely black pixels.

P.S. I didn't change any code in the current Google Colab.

Does anyone have the same issue? Thanks for your help.

This is the input image with its output. The image on the left side doesn't actually have any white boxes; I just added them to point out the problem.

JeremyLinky commented 12 months ago

Same problem. Does anyone have a solution?

bearwithdog commented 1 month ago

Is there "Grayscale input: True" in your process log? If so, the image is being detected as a grayscale image; you can work around it at facelib/utils/face_restoration_helper.py:144 by changing the is_gray threshold, as shown below.
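
A sketch of the change (the line number and surrounding code may differ slightly between versions):

    # facelib/utils/face_restoration_helper.py, around line 144
    # before: low-saturation color photos are flagged as grayscale and sent down the gray path
    self.is_gray = is_gray(img, threshold=10)
    # after: threshold=0 makes the grayscale check far stricter, so such photos keep the color path
    self.is_gray = is_gray(img, threshold=0)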

kennethxu commented 1 month ago

Got the same problem here, and after digging deeper I think I found the reason for the failure and a fix.

The problem comes from line 368 of facelib/utils/face_restoration_helper.py:

    restored_face = adain_npy(restored_face, input_face) # transfer the color

After this line, restored_face has values greater than 256 for an 8-bit image, which causes it to later be treated as a 16-bit image. Because those values are tiny relative to the 16-bit range (most are less than 256/65536, which is effectively [1, 1, 1] in RGB space), the resulting image is just black.
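
To illustrate the effect, here is a minimal numpy sketch with made-up values (not CodeFormer's actual save path): once the maximum slips past 256, the image is normalized by the 16-bit maximum and everything collapses toward zero.

    import numpy as np

    # Hypothetical restored face whose values slipped just past the 8-bit range
    restored_face = np.random.uniform(0, 260, size=(512, 512, 3))
    print(restored_face.max())   # > 256, so downstream code treats the image as 16-bit

    # Normalizing by the 16-bit maximum leaves values near zero
    normalized = restored_face / 65535.0
    saved = (normalized * 255).round().astype(np.uint8)
    print(saved.max())           # about 1 out of 255: the saved image looks black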

I confirmed this by adding a few debug lines around the code in question:

    def add_restored_face(self, restored_face, input_face=None):
        if self.is_gray:
            print("before convert to gray ", np.max(restored_face))
            restored_face = bgr2gray(restored_face) # convert img into grayscale
            print("after convert to gray ", np.max(restored_face))
            if input_face is not None:
                restored_face = adain_npy(restored_face, input_face) # transfer the color
                print("after transfer the color ", np.max(restored_face))
        self.restored_faces.append(restored_face)

And got the output below; you can see that two of the face images have values exceeding 256 after the color transfer.

>python inference_codeformer.py --bg_upsampler realesrgan --face_upsample -w 0.7 --input_path ...
Face detection model: retinaface_resnet50
Background upsampling: True, Face upsampling: True
[1/1] Processing: 20230820_112323-2000-4.jpg
Grayscale input: True
        detect 4 faces
before convert to gray  255
after convert to gray  244.7798
after transfer the color  249.60674644989007
before convert to gray  236
after convert to gray  223.9685
after transfer the color  227.6070183984604
before convert to gray  255
after convert to gray  254.97449999999998
after transfer the color  260.24531282356395
before convert to gray  255
after convert to gray  254.97449999999998
after transfer the color  259.532129300522
        Input is a 16-bit image
        Input is a 16-bit image

All results are saved in results/test_img_0.7

The fix is to just normalize the values in restored_face after transferring the color.

    def add_restored_face(self, restored_face, input_face=None):
        if self.is_gray:
            restored_face = bgr2gray(restored_face) # convert img into grayscale
            if input_face is not None:
                restored_face = adain_npy(restored_face, input_face) # transfer the color
                # Fix too large values for 8-bit image
                max_range = np.max(restored_face)
                if max_range >= 256 and max_range <= 280:
                    restored_face = restored_face * 256.0 / max_range
        self.restored_faces.append(restored_face)

While this fixes the problem, I still don't understand why we need to transfer color to a grayscale image in the first place (what's the purpose?), and why the resulting values exceed the 8-bit maximum.

kennethxu commented 1 month ago

So based on my very limited knowledge of image processing, adain_npy is trying to match the brightness and contrast of the original image. But the code that shifts the image can actually make some pixels too bright (>256 for 8-bit or >65536 for 16-bit) or too dark (<0), which in turn caused the black-image issue for 8-bit images, as detailed in my previous comment.
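
As a quick illustration, here is a minimal numpy sketch of the AdaIN math with made-up single-channel values (not the repo's calc_mean_std, which works per channel on HxWx3 arrays):

    import numpy as np

    # Made-up pixel values in the 8-bit range
    content = np.array([100.0, 200.0, 250.0])   # restored (grayscale) face
    style = np.array([120.0, 220.0, 255.0])     # original input face

    # AdaIN: normalize the content, then re-apply the style's mean and std
    result = (content - content.mean()) / content.std() * style.std() + style.mean()
    print(result)   # the brightest pixel ends up around 259, outside the 8-bit range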

I have added code to reduce the contrast when this happens, so that all pixel values stay within the valid range. Here is the updated adain_npy function in the facelib/utils/misc.py file.

def adain_npy(content_feat, style_feat):
    """Adaptive instance normalization for numpy.

    Args:
        content_feat (numpy): The input feature.
        style_feat (numpy): The reference feature.
    """
    size = content_feat.shape
    style_mean, style_std = calc_mean_std(style_feat)
    content_mean, content_std = calc_mean_std(content_feat)
    normalized_feat = (content_feat - np.broadcast_to(content_mean, size)) / np.broadcast_to(content_std, size)
    result_feat = normalized_feat * np.broadcast_to(style_std, size) + np.broadcast_to(style_mean, size)

    # Ensure values are within the range of image bits
    max_range = 256 if np.max(content_feat) < 256 else 65536 # determine 8 bit or 16 bit.
    min_value, max_value = np.min(result_feat), np.max(result_feat)
    if max_value > max_range or min_value < 0:
        # value is out of the range, requires adjustment.
        mean_value = np.min(style_mean)
        ratio = 0.9999 - max((max_value - max_range) / (max_value - mean_value), -min_value / (mean_value - min_value))
        style_std = style_std * np.broadcast_to([ratio, ratio, ratio], style_std.shape)
        result_feat = normalized_feat * np.broadcast_to(style_std, size) + np.broadcast_to(style_mean, size)

    return result_feat

kennethxu commented 1 month ago

Here is more polished code that guarantees the pixel values are clamped to the bit range.

def adain_npy(content_feat, style_feat):
    """Adaptive instance normalization for numpy.

    Args:
        content_feat (numpy): The input feature.
        style_feat (numpy): The reference feature.
    """
    size = content_feat.shape
    style_mean, style_std = calc_mean_std(style_feat)
    content_mean, content_std = calc_mean_std(content_feat)
    normalized_feat = (content_feat - np.broadcast_to(content_mean, size)) / np.broadcast_to(content_std, size)
    result_feat = normalized_feat * np.broadcast_to(style_std, size) + np.broadcast_to(style_mean, size)

    # Ensure values are within the range of image bits
    bit_range = 256 if np.max(content_feat) < 256 else 65536 # determine 8 bit or 16 bit.
    a_min, a_max = np.min(result_feat, axis=(0,1)), np.max(result_feat, axis=(0,1))
    i_min, i_max = np.argmin(a_min), np.argmax(a_max) # find the color index of min and max
    v_min, v_max = a_min[i_min], a_max[i_max]
    if v_max > bit_range or v_min < 0: # pixel value is out of the bit range
        # reduce the style_std to clamp values in range.
        mean_min, mean_max = style_mean[0][0][i_min], style_mean[0][0][i_max]
        ratio = min(mean_min / (mean_min - v_min), (bit_range - 1e-12 - mean_max) / (v_max - mean_max)) 
        style_std = style_std * np.broadcast_to([ratio, ratio, ratio], style_std.shape)
        result_feat = normalized_feat * np.broadcast_to(style_std, size) + np.broadcast_to(style_mean, size)

    return result_feat