pymatting / foreground-estimation-evaluation

Evaluate the quality of foreground estimation methods
MIT License

Silent quantization affecting evaluation, from saving estimated foreground as bmp #2

Open MarcoForte opened 1 year ago

MarcoForte commented 1 year ago

Hi, thanks again for your great work. I am coming back to it now to develop a further optimised version.

I noticed that the image is quantized when the foreground is saved: https://github.com/pymatting/foreground-estimation-evaluation/blob/d13bb0657df502e32da18235beb984bacaa50591/scripts/util.py#L76 Maybe saving as a TIFF could avoid this.

I point this out so that people can better reproduce and compare to this work.

99991 commented 1 year ago

You are correct that quantization affects evaluation and I agree that it would be better to work with higher precision.

We use quantization anyway because it is common practice in alpha matting to save images in an 8-bit format before evaluation.

With large datasets, a factor of two or four in storage cost can be prohibitive. The bigger issue is that floor rounding is usually used instead of rounding to nearest during quantization. image = np.clip(image * 255 + 0.5, 0, 255).astype(np.uint8) would be more precise, but it is what it is. At least the relative ranking of methods should stay mostly the same, since they are all biased in the same way.
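
The bias can be seen without the dataset. The following is a minimal sketch on synthetic uniform random data; the array shape, the seed, and the helper names quantize_floor and quantize_nearest are assumptions for illustration and are not part of this repository. Floor rounding should give a mean error of about -0.5/255, while rounding to nearest should be centred near zero:

```python
import numpy as np

# Synthetic stand-in for a foreground image with values in [0, 1).
rng = np.random.default_rng(0)
image = rng.random((256, 256, 3))


def quantize_floor(x):
    # astype(np.uint8) truncates, which equals floor for non-negative values.
    return np.clip(x * 255, 0, 255).astype(np.uint8)


def quantize_nearest(x):
    # Adding 0.5 before truncation rounds to the nearest integer.
    return np.clip(x * 255 + 0.5, 0, 255).astype(np.uint8)


# Signed round-trip errors after converting back to [0, 1].
floor_error = quantize_floor(image) / 255.0 - image
nearest_error = quantize_nearest(image) / 255.0 - image

for label, error in [("floor", floor_error), ("nearest", nearest_error)]:
    print(f"{label:8} mean error {error.mean():+.6f}, max abs error {np.abs(error).max():.6f}")
```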

Here is an example comparing no quantization, floor rounding and rounding to nearest:

import sys
sys.path.append("scripts")  # make scripts/util.py from this repository importable
import util
import numpy as np
import pymatting

name = "GT01"
directory = util.find_data_directory()
true_foreground = util.load_image(f"{directory}/converted/foreground/{name}.bmp")
alpha = util.load_image(f"{directory}/gt_training_highres/{name}.png", "gray")
image = util.load_image(f"{directory}/converted/image/{name}.bmp")
is_unknown = np.logical_and(alpha > 0, alpha < 1)

estimated_foreground = pymatting.estimate_foreground_ml(
    image,
    alpha,
    gradient_weight=0.1,
    regularization=5e-3,
    n_small_iterations=10,
    n_big_iterations=2,
)

# Compare the unquantized estimate with two 8-bit round trips.
# astype(np.uint8) truncates, so adding 0.5 first rounds to nearest.
for quantization, foreground in [
    ("no quantization", estimated_foreground),
    ("floor rounding", np.clip(estimated_foreground * 255, 0, 255).astype(np.uint8) / 255.0),
    ("nearest rounding", np.clip(estimated_foreground * 255 + 0.5, 0, 255).astype(np.uint8) / 255.0),
]:
    mse = util.calculate_mse_error(foreground, true_foreground, is_unknown, alpha)
    print(f"MSE {quantization:16}: {mse:.10f}")

rounding          MSE
no quantization   0.0013126050
floor rounding    0.0012949729
nearest rounding  0.0013132484

The differences are small (at most about 2e-5 in MSE for this image) and have not been relevant yet, but if foreground estimation methods become more precise, it might become necessary to fork this repository and create a version of it with higher precision. I still want to keep this repository as it is for reproducibility.
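
For a rough sense of what higher precision would cost, the sketch below compares raw in-memory sizes and worst-case round-trip errors for 8-bit and 16-bit integers against float32. It uses synthetic data, and the quantize helper is hypothetical; compressed file sizes on disk will differ from these raw sizes.

```python
import numpy as np

# Synthetic stand-in for an estimated foreground stored as float32.
rng = np.random.default_rng(0)
foreground = rng.random((128, 128, 3)).astype(np.float32)


def quantize(x, dtype):
    # Round to nearest in float64, then cast to the integer type.
    max_value = np.iinfo(dtype).max
    scaled = np.round(x.astype(np.float64) * max_value)
    return np.clip(scaled, 0, max_value).astype(dtype)


results = {}
for dtype in (np.uint8, np.uint16):
    stored = quantize(foreground, dtype)
    restored = stored / np.iinfo(dtype).max
    max_error = float(np.abs(restored - foreground).max())
    results[np.dtype(dtype).name] = (stored.nbytes, max_error)

# float32 is the reference representation here, so its round-trip error is zero.
results[foreground.dtype.name] = (foreground.nbytes, 0.0)

for name, (nbytes, max_error) in results.items():
    print(f"{name:8} {nbytes:8d} bytes, max abs error {max_error:.2e}")
```

The uint16 and float32 arrays take two and four times the memory of uint8, matching the factor of two or four mentioned above, while the worst-case error of 16-bit storage is bounded by 0.5/65535 instead of 0.5/255.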

Is TIFF the best image format for this application? Pillow does not seem to support high-precision multi-channel images yet: https://github.com/python-pillow/Pillow/issues/1888

It would also be nice if image viewers could display the images.