Irrational-Encoding-Wizardry / vsutil

A collection of general purpose Vapoursynth functions to be reused in modules and scripts
MIT License
28 stars 7 forks source link

scale_offsets=0 #53

Closed Asd-g closed 4 years ago

Asd-g commented 4 years ago

Hi, That value looks wrong. (16, 8, 16, range_in=range=0) = (16, 8, 16, range_in=range=1) (16, 8, 32, range_in=1) != (16, 8, 32, range_in=0, scale_offsets=0) but the result should be the same. scale_offsets=0 makes float keep tv range, right? Currently scale_offsets=0 is doing 16/219 = 0.0730593607305936. Make the same for 235/219 = 1.07... It should be x/255. That for luma. For chroma (x-128(8-bit))/255.

Btw here parentheses are missing for output_depth - 8.

kgrabs commented 4 years ago

scale_offsets is for whether or not to subtract 16 first, its gonna be dividing by 219 either way if range is set to limited.

(16, 8, 16, range_in=range=0) is 16 (56064 / 219) (16, 8, 16, range_in=range=1) is 16 (65535 / 255) (16, 8, 32, range_in=1) is 16 / 255 (16, 8, 32, range_in=0, scale_offsets=0) is 16 / 219

(16, 8, 16, range_in=range=0) = (16, 8, 16, range_in=range=1)

If that's true then I wrote it wrong

Btw here parentheses are missing for output_depth - 8.

whoops

Asd-g commented 4 years ago

If that's true then I wrote it wrong

I had to write (16, 8, 16, range_in=range=0) ~ (16, 8, 16, range_in=range=1) because the factor for full scale is relatively small 1.00x and for lower bit depth the result will be the same (10-bit for example).

(16, 8, 32, range_in=0, scale_offsets=0) is 16 / 219

What's the practical purpose of that? On the one hand any value > 219 makes float out of range (> 1) / any integer bit depth to be clipped to the max range and on the other hand it gives wrong values for those values that are not clipped/out of range. Like (range_in=0, range=1, scale_offsets=0), (range_in=1, range=0, scale_offsets=0) behaves similarly but for the values in the low end of the range.

kgrabs commented 4 years ago

What's the practical purpose of that?

any number of things, mostly involving std.Expr

On the one hand any value > 219 makes float out of range (> 1) / any integer bit depth to be clipped to the max range and on the other hand it gives wrong values for those values that are not clipped/out of range.

yes, there are many ways to use this function wrong

Asd-g commented 4 years ago

yes, there are many ways to use this function wrong

When the function returns wrong results you can't blame the others by saying usage is wrong.

  1. (scale_value(16, 8, 16, range_in=1) -> 4112 (correct value)
  2. (scale_value(16, 8, 16, range=0, range_in=1) -> 3518 (wrong value)
  3. (scale_value(16, 8, 16, range=0, range_in=1, scale_offsets=1) -> 7614 (wrong value)
  4. (scale_value(16, 8, 16, range=0, range_in=1, scale_offsets=0) -> 3518 (wrong value)

I expected correct result when I explicitly set the ranges (2) but that's a wrong usage of the function?

Another example:

  1. (scale_value(16, 8, 16, range=1) -> 4788 (wrong value)
  2. (scale_value(16, 8, 16, range=1, range_in=0) -> 4788 (wrong value)
  3. (scale_value(16, 8, 16, range=1, range_in=0, scale_offsets=0) -> 4788 (wrong value)
  4. (scale_value(16, 8, 16, range=1, range_in=0, scale_offsets=1) -> 0 (correct value)

I again expected correct result when I explicitly set the ranges (2) but again that's a wrong usage of the function?

In addition the correct values from both examples are result of quite different usage of the function.

Since @kageru closed the ticket it seems the participants in this ticket taking all results returned by the function as valid (I still want to see the expressions containing 0.0730593607305936 and 1.073059).

Lypheo commented 4 years ago

(scale_value(16, 8, 16, range=0, range_in=1, scale_offsets=1) -> 7614 (wrong value)

But 7614 is correct, no? 4112 is correct only if the output range is full too.

>>> 16/255 * ((235 << 8) - (16 << 8)) + (16 << 8)
7613.7411764705885

However, I do think that scale_offsets=False is a nonsensical default value.

Asd-g commented 4 years ago

Yes, you're right. It should be 7614. When there is range_limited<->range_full scale_offsets=1 should be set by default. Also should be set by default for integer(no matter full or limited)<->float. Also integer->float - if the integer range should be kept range_in=1 (no matter what's the actual range of integer), if integer range should be expanded (16->0) range_in=0. For no conversion between range_limited<->range_full scale_offsets doesn't have effect. When range_limited<->range_full scale_offsets=0 always returns not expected (wrong) values. And I will ask again - what's the real usage of scale_offsets=0? Why one function have to return non-expected results?

kgrabs commented 4 years ago

what's the real usage of scale_offsets=0?

Lets take debanding as an example. Say you have a movie that was converted to 8 bit with no dithering added and you wanna use a smoother at a high bitdepth to remove the banding, lets go with float to make it easier. You can use the function to get the maximum possible data loss from rounding to tv range 8-bit, convert that to a float value, and use it as the cutoff for the smoothing.

clip = depth(clip, 32)

thry = scale_value(0.5, 8, 32, range_in=0, scale_offsets=0) # 0.5/219
thrc = scale_value(0.5, 8, 32, range_in=0, scale_offsets=0, chroma=1) # 0.5/224

smooth = core.bilateralgpu.Bilateral(clip)
final = core.std.Expr([clip, smooth], [f'x y - abs {thr} > x y ?' for thr in (thry, thrc)])

...and if you set scale_offset=1, you'd get like -0.07 for the luma threshold and -0.57 for chroma

The same goes for something like waifu2x which only supports RGB, which like floats is always full range. So there for i.e. RGB48 you'd use something like scale_value(0.5, 8, 16, range_in=0, range=1, scale_offsets=0) which is 0.5 / 219 * 65535 or about 149 (note that with range_in=1 this would be about 128, tv range is smaller so each color step is just slightly larger)

Anyway this sort of std.Expr fuckery was the main reason I had for writing this thing originally, but it ended up being fairly easy to have it support pretty much everything via adding the scale offset and chroma parameters. I don't personally care what the defaults are, so go ahead and change them if you all feel strongly about it.

tl;dr std.Expr is good, you should use it

Asd-g commented 4 years ago

yes, there are many ways to use this function wrong

Ironically you wrote this and it's valid for the above example you wrote.

For integers you have a difference (because of the bit shift) of factor 1.002/3x... between bit depth conversions with full (in and out) and bit depth conversion with limited (in and out). For float that difference has factor 1. For conversion to float with full (in and out) and for conversion to float with limited (in and out) the math is same. (tv values ( tv in and tv out) are the same as full values (full in and full out) but clipped between 0,0627450.. and 0,921568627). If x/219 is correct operation you will have valid values for the entire range.

scale_value(0.5, 8, 32, range_in=0, scale_offsets=0) scale_value(0.5, 8, 16, range_in=0, range=1, scale_offsets=0)

That makes no sense. range_in=0 implies value >= 16. You're using it wrongly and that's the reason you need some random value (x/219) to "fit" to your operation.

tl;dr std.Expr is good, you should use it

Using Expr doesn't mean the operations done with it are correct.

Asd-g commented 4 years ago

It's sadly that there are other people seeing no issue with scale_offsets=0 at all not just because it set by default.

kgrabs commented 4 years ago

range_in=0 implies value >= 16. You're using it wrongly

scale_offsets=0 is not for individual pixel values (levels), it's for relative pixel values (diff clips), usually used within Expr or by Binarize after generating a diff clip with Expr or MakeDiff. If you have two tv range clips and subtract them via std.Expr([a, b], 'x y -') you're gonna get values anywhere in the range of 0 - 219 because the offset disappears (235 - 16 = 219). The same thing applies to calculating based on a threshold (source vs filtered with some sort of cutoff), like my example above. Again, the maximum amount of variance two tv range clips can have is 219 (luma) or 224 (chroma)

and that's the reason you need some random value (x/219) to "fit" to your operation

A float conversion would normally be (x - 16) / 219 . All this thing does is mimic zimg and let you disable the addition/subtraction steps by setting scale_value=0, which makes that just x / 219. So yes, setting scale_offsets=0 for a value of like 235 will return a number greater than 1, which is likely never useful, and likewise setting scale_offsets=1 for a value lower than 16 will return a negative number, also probably useless. They're two separate modes per se, for two very different use cases

It's sadly that there are other people seeing no issue with scale_offsets=0 at all not just because it set by default.

Its just a matter of what you're using the function for. Relative pixelwise values for Expr and thresholds for Binarize will probably want scale_offsets=0. Anything involving specific luma/chroma levels will want scale_offsets=1. The only "problem" is that by supporting everything, the user has more opportunities to use the function wrong

kgrabs commented 4 years ago

If you think scale_offsets=0 is a bad default just open another issue requesting a change, cuz honestly I have no idea which setting is being used more as I have no real way of knowing how ppl are using this thing

I'm all for changing it to True by default if only because the people disabling it clearly know what they're doing and have a very specific reason for why, and that's not necessarily true the other way around. I'm a little worried about people using it for Binarize values so maybe make range(_in) default to 1 (full) as well?

Asd-g commented 4 years ago

Using range_in=0 with values < 16 leads to negative values for the output when range=1. It's wrong usage. Period. You have negative values and you're trying to "fix" them with a random value.

If you have two tv range clips and subtract them via std.Expr([a, b], 'x y -') you're gonna get values anywhere in the range of 0 - 219

You have values between 0-219 because that's not the correct way to subtract two clips (integer bit depth). Expr([a, b], 'x y - 128 +') is the correct way to subtract two clips(preserving the entire range 0-255). It's what also makediff is doing.

If you think scale_offsets=0 is a bad default

scale_offsets=0 is bad at all not just for default. It's a random operation you came up to "fix" the wrong usage of functions in the first place.

kgrabs commented 4 years ago

It's not wrong, thats the logic behind how stuff like LimitFilter's autoscaling "thr" works, as well as how you make range masks (like mt_edge("min/max") and dither_build_gf3_range_mask in Avisynth).

You do clip.std.Maximum() minus clip.std.Minimum() for range masks (clearly the first one will always have a higher value or be the same). The only way to scale a range mask's Binarize threshold to float is with scale_offsets=0. There's also mvsfunc's LimitFilter, which uses values relative to 8 bit to limit the change of a filter to "thr" and then attenuate to source from a change of thr to a change of thr * elast. Using scale_value here allows you to keep the strength the same over all combinations of bit depth, range and sample type.

Asd-g commented 4 years ago

It's not wrong...

What? Using values < 16 for range_in=0 or scaling x/219 for float?

Here how mt_edge is scaling the thresholds to float.

kgrabs commented 4 years ago

im talking about the concept of range masking, not pinterf's value scaler

Function Dither_build_gf3_range_mask (clip src, int radius)
{
    src
    ma  = (radius >  1) ? mt_expand_multi (sw=radius, sh=radius, mode="ellipse") : last
    mi  = (radius >  1) ? mt_inpand_multi (sw=radius, sh=radius, mode="ellipse") : last

    (radius >  1) ? mt_lutxy (ma, mi, "x y -")
\                 : mt_edge (mode="min/max", thY1=0, thY2=255)
}

to reiterate, one tv range clip minus another gives a possible value range of 0-219 for luma, and 0-224 for chroma

if a white and black pixel neighbor on another, an 8 bit range mask will have those two pixels return a value of 219, but if a float clip had black and white pixels they'd be 0 and 1. and so 1 - 0 = 1. and when x = 219, x / 219 = 1

kgrabs commented 4 years ago

i feel like you're just overthinking this, its not about whats valid in tv range. its just a setting that lets you scale a value according to range sizes without adjusting for offsets. Limiting a filtered clip relative to the source is a common enough concept to make it useful, and scale_offsets=0 gives you the control you need where that extra control is actually necessary.

Asd-g commented 4 years ago

to reiterate, one tv range clip minus another gives a possible value range of 0-219 for luma, and 0-224 for chroma

Right, the range size for tv range clip is 219 (219-0=235-16=255-36=219). But range_min of tv range clip is not 0 and range_max is not 219. x/219 should not return 1. That's why using range_in=0 with value < 16 makes no sense.

i feel like you're just overthinking this, its not about whats valid in tv range.

Really? That tv range has defined min and max. Where is documented an alternative value of tv_range_min and tv_range_max other than 16-235 so one should expect range 0-219 for tv range clip? If scaling between bit depths depends only on the range size you can afford omitting offsets. That would lead to the same results for 0-219 and 16-235. Just show me one other tool that expect a tv range with min and max 0-219. Someone can expect even 36-255 min and max for tv range and that's his problem but implementing it in function like this one implies behavior according standards. You tell me I'm using this function wrongly because I'm not expecting a tv_range_min/tv_range_max other than 16-235. Come on... @kageru, since you closed that issue, does that mean you're expecting tv_range_min/tv_range_max 0-219 instead 16-235?

Asd-g commented 4 years ago

Currently the function scale_value has description: Scales a given numeric value between bit depths, sample types, and/or ranges.. Also param range_in: Pixel range of the input value. No clamping is performed. See :class:Range.. Class Range for LIMITED=0 - """Studio (TV) legal range, 16-235 in 8 bits.""". So even according to the function description values < 16 for range_in=0 make no sense and scale_offsets=0 doesn't have place there and there is no real usage in the context of what the function is aiming to do.

If you want to implement something like scale_threshold(or whatever name) you should do it as a separate function because diff clip of tv ranges clips has different min/max than the standard tv range min/max. Again - x/219 is wrong. Diff clip of tv range clips has a range 0-219. That 0-219 should be expanded to 0-255 because float in VapourSynth expects full range. Finally to scale that 0-255 to float you should do x/255. If you want to scale that 0-219 to other integer bit depth just do bit shift. Diff clip of two full range clips has range 0-255. To scale this diff clip to float you're doing x/255. If you want to scale to other integer bit depth do the same math as (range=range_in=1, scale_offsets=1). There is nowhere x/219 for scaling to float.

kgrabs commented 4 years ago

Just show me one other tool that expect a tv range with min and max 0-219

Three use cases wasn't enough, now you need four examples to ignore?

Asd-g commented 4 years ago

Yes, "use case" like this one:

thry = scale_value(0.5, 8, 32, range_in=0, scale_offsets=0) # 0.5/219

It doesn't deserve to be commented because for such threshold in float you need to do simply 0.5/255.