Open foosoftsrlold opened 2 years ago
How does this change the numerical result? What is the savings?
Hi Kile,
That epsilon is numerically irrelevant. So irrelevant that I really doubt that any retraining is needed.
g is used only when sigma1_sq is > 2, epsilon is 6.5e-4.... it is a really tiny fraction.
double g = sigma12 / (sigma1_sq + eps);
In normal cases... sigma1_sq is definitely larger (it is the filtered variance of reference frames). On a simple test sequence, VMAF simply was not impacted. I would be surprised if VMAF change were over 1e-6 for non degenerate cases (i.e. all black)
This is the best I came up with, trying to avoid floating point as much as possible
sigma2_sq = MAX(sigma2_sq, 0);
if (sigma1_sq >= sigma_nsq) {
uint32_t log_den_stage1 = (uint32_t)(sigma_nsq + sigma1_sq);
int x;
uint16_t log_den1 = get_best16_from32(log_den_stage1, &x);
/**
* log values are taken from the look-up table generated by
* log_generate() function which is called in integer_combo_threadfunc
* den_val in float is log2(1 + sigma1_sq/2)
* here it is converted to equivalent of log2(2+sigma1_sq) - log2(2) i.e log2(2*65536+sigma1_sq) - 17
* multiplied by 2048 as log_value = log2(i)*2048 i=16384 to 65535 generated using log_value
* x because best 16 bits are taken
*/
accum_x += x + 17;
den_val = log2_table[log_den1];
// this check can go away, things will work anyway
if (sigma12 > 0) {
// num_val = log2f(1.0f + (g * g * sigma1_sq) / (sv_sq + sigma_nsq));
/**
* In floating-point numerator = log2((1.0f + (g * g * sigma1_sq)/(sv_sq + sigma_nsq))
*
* In Fixed-point the above is converted to
* numerator = log2((sv_sq + sigma_nsq)+(g * g * sigma1_sq))- log2(sv_sq + sigma_nsq)
*/
const double eps = 65536 * 1.0e-10;
double g = sigma12 / (sigma1_sq + eps); // this epsilon can go away
int32_t sv_sq = sigma2_sq - g * sigma12;
if (sigma2_sq == 0) { // this was... sigma2_sq < eps
g = 0.0;
}
sv_sq = (uint32_t)(MAX(sv_sq, 0));
g = MIN(g, vif_enhn_gain_limit);
int x1, x2;
uint32_t numer1 = (sv_sq + sigma_nsq);
int64_t numer1_tmp = (int64_t)((g * g * sigma1_sq)) + numer1; //numerator
uint16_t numlog = get_best16_from64((uint64_t)numer1_tmp, &x1);
uint16_t denlog = get_best16_from64((uint64_t)numer1, &x2);
accum_x2 += (x2 - x1);
num_val = log2_table[numlog] - log2_table[denlog];
accum_num_log += num_val;
accum_den_log += den_val;
}
else {
//accum_num_log += 0;
accum_den_log += den_val;
}
}
else {
accum_num_non_log += sigma2_sq;
accum_den_non_log++;
}
Regarding igv proposal, I'm sure that it's numerically ok, and it's possibly more consistent with float vif. But I humbly believe that the refactoring I propose is in the end more effective, especially if SIMD parallelization is desired (for Intel... I think that AVX512 is required)
In this code:
adding eps(ilon) seems not that useful, as g is used only when sigma1_sq > 2
The division by zero is easily avoided moving g and sv_sq computation inside the if (sigma1_sq >= sigma_nsq) branch