nagadomi / nunif

Misc; latest version of waifu2x; 2D video to stereo 3D video conversion
MIT License

Foreground Scale for Any V2 #255

Open math-artist opened 2 days ago

math-artist commented 2 days ago

First, thank you for this super useful app. I only started using iw3 yesterday, and I have already converted over a hundred files (small files, many of them samples for testing).

During my testing, I found it annoying that the Any_V2 model seemed truncated when the foreground scale was set to 1, so I plotted the curves to see what happens.

image

I use Any V2 in another project. In its depth maps, values close to 0 are the furthest away and higher values are closer. So what foreground scale actually does is flatten the background in exchange for a very small gain in slope near 1, and that is exactly what I see when I use it. I think it is implemented backward.

I wrote this code, derived from one of your functions, which I think gives the correct transform for Any V2.

import torch

def inv_softplus01_edited(x, bias, scale):
    # Normalization bounds: the transform evaluated at inputs 0 and 1.
    min_v = ((torch.zeros(1, dtype=x.dtype, device=x.device) - bias) * scale).expm1().clamp(min=1e-6).log()
    max_v = ((torch.ones(1, dtype=x.dtype, device=x.device) - bias) * scale).expm1().clamp(min=1e-6).log()
    # Flip the input (1 - x), apply log(expm1(.)), then flip the normalized output back.
    v = ((1 - x - bias) * scale).expm1().clamp(min=1e-6).log()
    return 1 - (v - min_v) / (max_v - min_v)
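
For anyone who wants to check the shape of this curve without PyTorch, here is a minimal scalar sketch of the same expression using only `math`. The helper name `inv_softplus01_scalar`, the `flip` switch, and the sample values `bias=-0.1`, `scale=4.0` are assumptions for illustration and are not taken from iw3. With `flip=False` it evaluates the same expression without the two `1 -` terms, which is only my guess at the form of the function the code above was derived from.

```python
import math

def inv_softplus01_scalar(x, bias, scale, flip=True):
    """Scalar sketch of the transform (hypothetical helper, not the iw3 code).

    flip=True mirrors input and output, as in inv_softplus01_edited.
    flip=False drops both mirrors (assumed shape of the source function).
    """
    def f(t):
        # log(expm1((t - bias) * scale)), with the same 1e-6 clamp as the tensor code
        return math.log(max(math.expm1((t - bias) * scale), 1e-6))

    min_v, max_v = f(0.0), f(1.0)
    if flip:
        return 1.0 - (f(1.0 - x) - min_v) / (max_v - min_v)
    return (f(x) - min_v) / (max_v - min_v)

if __name__ == "__main__":
    # Sample parameters are illustrative only.
    for i in range(0, 11):
        x = i / 10
        print(x, round(inv_softplus01_scalar(x, bias=-0.1, scale=4.0), 4))
```

Both variants map 0 to 0 and 1 to 1 and are increasing. The flipped one is the mirror image of the other, so whichever end of the depth range the unflipped curve stretches, the flipped curve stretches the opposite end.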

image

nagadomi commented 1 day ago

Thanks for the info. I also thought the current conversion curve for Depth-Anything was not good, but since I don't use it myself, I left it alone for a long time. As you say, the current expression is just a smooth version of x > 0.5 ? (x - 0.5) * 2 : 0. I will take this opportunity to sort out what I know about that area.
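
To illustrate what "a smooth version of x > 0.5 ? (x - 0.5) * 2 : 0" can look like, here is a small sketch that compares that hinge with a softplus curve normalized to the [0, 1] range. The function names and the values `bias=0.5` and `scale=20.0` are assumptions chosen for the example; they are not necessarily what iw3 uses.

```python
import math

def hinge(x):
    # The piecewise-linear expression from the comment above.
    return (x - 0.5) * 2 if x > 0.5 else 0.0

def softplus01(x, bias=0.5, scale=20.0):
    # Softplus log(1 + exp((x - bias) * scale)), rescaled so 0 -> 0 and 1 -> 1.
    # Hypothetical helper with illustrative defaults.
    def sp(t):
        return math.log1p(math.exp((t - bias) * scale))

    min_v, max_v = sp(0.0), sp(1.0)
    return (sp(x) - min_v) / (max_v - min_v)

def max_gap(scale, steps=100):
    # Largest difference between the smooth curve and the hinge on a grid.
    return max(
        abs(softplus01(i / steps, scale=scale) - hinge(i / steps))
        for i in range(steps + 1)
    )

if __name__ == "__main__":
    for s in (5.0, 20.0, 100.0):
        print(s, max_gap(s))
```

In this sketch a larger `scale` brings the smooth curve closer to the hinge, which maps every input below 0.5 to 0. For Any V2, where low values are far away, that corresponds to the flattened background described in the first comment.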