yoanlcq / vek

Generic 2D-3D math swiss army knife for game engines, with SIMD support and focus on convenience.
https://docs.rs/vek
Apache License 2.0
282 stars 32 forks source link

Fix `is_normalized` sometimes returning false for normalized vectors #48

Closed Imberflur closed 4 years ago

Imberflur commented 4 years ago

Calling is_normalized on normalized vecs can return false
I suspect this is due to using magnitude_squared and it seems like the most efficient method to solve this is doubling the epsilon. I implemented that and it has worked for me so far.

[src/main.rs:12] vek::Vec3::<f32>::new(-30.988623, -19.13624, 17.503609) = Vec3 {
    x: -30.988623,
    y: -19.13624,
    z: 17.503609,
}
[src/main.rs:13] unnormalized.normalized() = Vec3 {
    x: -0.766879,
    y: -0.47356677,
    z: 0.43316385,
}
[src/main.rs:14] normalized.is_normalized() = false
[src/main.rs:15] normalized.magnitude() = 0.9999999
[src/main.rs:16] normalized.magnitude() + std::f32::EPSILON = 1.0
[src/main.rs:17] normalized.magnitude_squared() = 0.99999976
[src/main.rs:18] normalized.magnitude_squared() + std::f32::EPSILON * 2.0 = 1.0
Imberflur commented 4 years ago

I found a case where my solution does not work

    let unnormalized: vek::Vec3<f32> = (0.0067734555, 0.016385008, 0.0053294576).into();
    dbg!(unnormalized);
    let normalized = unnormalized.normalized();
    dbg!(normalized);
    dbg!(normalized.is_normalized());
    dbg!(normalized.magnitude());
    dbg!(normalized.magnitude() + std::f32::EPSILON);
    dbg!(normalized.magnitude_squared());
    dbg!(normalized.magnitude_squared() + std::f32::EPSILON + std::f32::EPSILON);

outputs

[src/main.rs:22] unnormalized = Vec3 {
    x: 0.0067734555,
    y: 0.016385008,
    z: 0.0053294576,
}
[src/main.rs:24] normalized = Vec3 {
    x: 0.36586484,
    y: 0.88502806,
    z: 0.28786802,
}
[src/main.rs:25] normalized.is_normalized() = false
[src/main.rs:26] normalized.magnitude() = 0.9999998
[src/main.rs:27] normalized.magnitude() + std::f32::EPSILON = 0.99999994
[src/main.rs:28] normalized.magnitude_squared() = 0.9999997
[src/main.rs:29] normalized.magnitude_squared() + std::f32::EPSILON + std::f32::EPSILON = 0.99999994
Imberflur commented 4 years ago

Adding another epsilon (for a total of three) fixes this I did more extensive testing by running the loop below for a few minutes (I also tested with Vec4)

    loop {
        let v = Vec3::<f32>::new(rng.gen(), rng.gen(), rng.gen());
        if let Some(n) = v.try_normalized() {
            if !n.is_normalized() {
                panic!("{} => {}", v, n);
            }
        }
    }

I would be interested to know whether this is an acceptable solution :sweat_smile:

yoanlcq commented 4 years ago

For reference, GLM does it like so

abs(length(v) - 1) <= 2 * epsilon

Edit: Unfortunately I have made at lot of changes in the code today, due to updating to the 2018 edition, hence the newly introduced conflicts; My apologies...

yoanlcq commented 4 years ago

If multiplying the epsilon by 3 solves this, then I would say it's acceptable, and it would probably be better than the current implementation anyway...

My issue is that there is no "theoretical" foundation beneath this finding. It's a hack that works, but a hack still.

If we started from GLM's implementation in GTX_vector_query, but wanted to avoid the square root like in my original implementation, I guess it would go like so (step by step) :

abs(length(v) - 1) <= 2 * epsilon;
length(v) - 1 >= -2 * epsilon && length(v) - 1 <= 2 * epsilon;
length(v) >= 1 - 2 * epsilon && length(v) <= 1 + 2 * epsilon;
length(v)*length(v) >= length(v) * (1 - 2 * epsilon) && length(v)*length(v) <= length(v) * (1 + 2 * epsilon);

Which proves that... Well have to compute length(v) at some point, so we can't avoid the square root. Whoops.

What I think we should do is just copy from GLM's implementation, which in our code could be written as (it's very close to yours, except we don't use magnitude_squared) :

// Either this one using relative_eq as usual
self.magnitude().relative_eq(
                    &T::one(),
                    T::default_epsilon() + T::default_epsilon(),
                    T::default_max_relative() + T::default_max_relative(),
                )
// ... Or this one using abs_diff_eq, the equivalent GLM's implementation
self.magnitude().abs_diff_eq(
                    &T::one(),
                    T::default_epsilon() + T::default_epsilon(),
                )
yoanlcq commented 4 years ago

While we're at it, please change the implementation of is_approx_zero; it should use the exact same test as is_normalized, but testing for equality with 0 instead of 1. Thanks!

Imberflur commented 4 years ago

This could potentially be solved by multiplying by the extreme bounds on length

length(v) >= 1 - 2 * epsilon && length(v) <= 1 + 2 * epsilon;
length(v)*length(v) >= (1 - 2 * epsilon)*(1 - 2 * epsilon) && length(v)*length(v) <= (1 + 2 * epsilon)*(1 + 2 * epsilon)
length(v)*length(v) >= (1 - 4 * epsilon + 4 * epsilon * epsilon) && length(v)*length(v) <= (1 + 4 * epsilon + 4 * epsilon * epsilon)

However, since 3 * epsilon appears to work I suspect that the true epsilon needed in the original is somewhere between epsilon and 2 * epsilon. Alas, I have no idea where to begin trying to actually prove this and I will admit my original solution is very much a hack.

The solution you presented feels more correct than using my trial-and-error value and cleaner than using 4 or 5 epsilons so I'm inclined to take the cost of a square root for now.

yoanlcq commented 4 years ago

It's as you wish, I'm fine with either of these solutions.
(I'm kind of relying on you to test with your original RNG program designed to run for a while, then if the results are good, then it's good.)

For fun I guess you could try using the length(v)*length(v) >= (1 - 4 * epsilon + 4 * epsilon * epsilon) && length(v)*length(v) <= (1 + 4 * epsilon + 4 * epsilon * epsilon) condition. With some variables it would be less ugly, and very probably faster than the square root. In fact I guess it would be possible to precompute (1 + 4 * epsilon + 4 * epsilon * epsilon) manually but that looks bothersome, especially since it's a constant expression the compiler is expected to optimize anyway.
I you can get it to work with your test program, I would actually favor this one. It could be factorized into a is_magnitude_close_to(x) method which would be used by both is_approx_zero and is_normalized.

Imberflur commented 4 years ago

I will try that out!

Imberflur commented 4 years ago

For fun I found all the unique magnitude squared values that are generated after testing 10^9 random f32 Vec3s

    1065353210: 0.99999964,
    1065353211: 0.9999997,
    1065353212: 0.99999976,
    1065353213: 0.9999998,
    1065353214: 0.9999999,
    1065353215: 0.99999994,
    1065353216: 1.0,
    1065353217: 1.0000001,
    1065353218: 1.0000002,
    1065353219: 1.0000004,

Vec4s can produce a couple more

    1065353208: 0.9999995, (had to test 10^10 to find this one)
    1065353209: 0.9999996,
    1065353220: 1.0000005,
Imberflur commented 4 years ago

So I discovered that 3 epsilons is actually not enough with vec4 using f32. I think I forgot to use the --release flag in my test before so it wasn't exploring enough cases to find the issue. So I made sure to test with 4 * epsilon + 4 * epsilon * epsilon more thoroughly.

Using Vec4<f32> I tested it for 10^11 random vectors which took about ~50 min. I realized that 4 * epsilon + 4 * epsilon * epsilon rounds to 4 * epsilon in f32 so I switched to that.

Then I tested with Vec4<f64> for 10^10 random vectors (~10 min). 4 * epsilon + 4 * epsilon * epsilon is actually not the same in f64

0.0000000000000008881784197001252
0.0000000000000008881784197001254

but I didn't find any cases where this fails.

I guess the question is should I round up to 5 epsilons to be safe?

FWIW I am happy with using 4.

yoanlcq commented 4 years ago

It looks like 4epsilon is good... And it appears* good since it's a power of two or something. I would go for that.

If it becomes a critical issue for some users, they can always choose to take the length of the vector and decide for themselves.

A "nice" compromise, IMO, would be to do like lerp and lerp_unclamped, like there could be a is_normalized_fast() which avoids the square root and uses 4*epsilon, and is_normalized which just bites the bullet by copying straight from GLM. Though, that might be overkill...

In any case, I think this is pretty good as it is. Thank you very much for the effort you've put into this seemingly "simple" function. I'll merge this, if you think there is room for improvement feel free to open another PR.

Tomorrow I'll try to do a quick review of the remaining issues before publishing version 0.10.0 which would expose the recent improvements.

Imberflur commented 4 days ago

I just came back to look at this and noticed that it seems like is_approx_zero probably doesn't need as much leniency as is_normalized. At least for the use of is_approx_zero in try_normalized. E.g. something with a length of f32::EPSILON can be normalized. :thinking: