Make UnitVec initializing constructor public

opera-aberglund commented 7 months ago

Hello, it's me again!

I noticed I was having some trouble with my unit vectors not having the exact values I wanted after I had created them with UnitVec::Get. I assume there must be some floating point errors in the calculations in this function?

I thought it would be nice to just make the initializing constructor public

/// @brief Initializing constructor.
constexpr UnitVec(value_type x, value_type y) noexcept : m_elems{x, y}
{
    // Intentionally empty.
}

but perhaps add some assertions that the values are between 0 and 1?

opera-aberglund commented 7 months ago

Now I'm also wondering in general if this function (UnitVec::Get) could be considered "unsafe" to use in deterministic simulations? Would it be theoretically possible for it to generate different unit vectors from the same arguments?

louis-langholtz commented 7 months ago

Hi @opera-aberglund !

The constructor you're referencing is what I've been calling an "initializing" constructor; not the default constructor. What I've been calling the "default" constructor is the constructor that can take no arguments and is currently declared as:

constexpr UnitVec() noexcept = default;

So, I changed the issue's title to match the names I'm using.

Incidentally, cppreference.com states: "A default constructor is a constructor which can be called with no arguments".

louis-langholtz commented 7 months ago

The function interface it seems like you want is one which allows you, the user, to set the X and Y elements of the 2-D unit vector.

Problem is, that only values of X and Y, whose magnitude of the line whose representative point is that X and Y and the origin of 0 and 0 is 1, are valid values. That's the invariant of the class. It's essentially also the invariant of the sin and cos functions when taken together for any angle within the domain of angles they accept.

Now I see that this invariant is not explicitly stated. As a matter of practice/convention, it should be stated as a Doxygen-like comment with the invariant tag/command. Sorry about that.

So allowing the user to directly enter the X and Y values of the UnitVec class would not protect that invariant requirement. The alternative, that is provided to the user, is the Get function which you mentioned.

Is the Get function not resulting in a UnitVec value whose X and Y element values match what you're expecting?

louis-langholtz commented 7 months ago

Now I'm also wondering in general if this function (UnitVec::Get) could be considered "unsafe" to use in deterministic simulations? Would it be theoretically possible for it to generate different unit vectors from the same arguments?

The Get function should always result in the same values for the same arguments. It's fully deterministic. At least for a given hardware platform.

Across different hardware however, it's possible that it might generate slightly different values. That's true of all IEEE-754 floating point math operations - all of them. Even setting a floating point value like 0.3.

Now I'm thinking that this is what you're running into. That's not technically "non-determinism" but is often never-the-less referred to as such.

Setting X and Y directly wouldn't solve this problem. While it might mitigate it, it will do nothing to deal with sin, cos, sqrt, addition, multiplication, division, or really any floating point operation used everywhere else in the library. Even setting a floating point value that can't be exactly represented may result in slightly different values across different platforms unfortunately.

The only absolute solution for this is to require either:

The exactly same hardware.
Fixed point math.

opera-aberglund commented 7 months ago

So what was happening was that I was working with serialization and deserialization again and found that the value of a unit vector wasn't exactly the same as the one it was deserialized from so the equality check failed.

What I was doing was to break down the unit vector into an array of two floats before serializing, and on deserialization I tried to reconstruct the init vector by calling UnitVec::Get on the two floats. This created a unit vector with slightly different values than the values I entered as arguments.

So what I was expecting to happen was if I entered a valid unit vector as arguments to UnitVec::Get, it would behave as the identity function and return me back the same values, but contained in a UnitVec object.

I can try to reproduce it and show some exact values where it happened for me.

EDIT: This also happened on the same machine when I was testing, so it was exactly the same hardware running the calculations.

louis-langholtz commented 7 months ago

What I was doing was to break down the unit vector into an array of two floats before serializing, and on deserialization I tried to reconstruct the init vector by calling UnitVec::Get on the two floats. This created a unit vector with slightly different values than the values I entered as arguments.

Oh, that's a relief to me. That's a more practically solvable problem IMO. So the gist is: Get is not fully reversible.

louis-langholtz commented 7 months ago

As an initial concept/idea, this may be solvable splitting Get into two separate functions:

A faster code path. That's less reversible.
A slower and more accurate code path. That's always reversible.

For serialization/deserialization then, could you use the slower path? I'd guess that'd be okay since I wouldn't imagine that'd be in as tight a loop of your code.

opera-aberglund commented 7 months ago

I think a slower would be fine, there are not that many calls to it, mainly just for a few of the joints that have internal unit vectors that needs to be serialized.

louis-langholtz commented 7 months ago

I've written some unit test code now that checks the reversibility of the UnitVec::Get x y function.

Turns out that for Real of float or double, the existing function is reversible on my hardware within 2 ULPs for all values I tried. And, even using the hypot path of code, I could not get this down below 1 ULP.

When you're comparing for "exactness", I suspect you're using ==. I'm not of the camp that == is never appropriate for floating point values, but I do agree one has to be extra careful using it in that context. For a deep dive on the issue, see Comparing Floating Point Numbers, 2012 Edition.

So for example, the include file Math.hpp includes the AlmostEqual function. In terms of reversibility, not all values I tried compared equally. But, they did all compare within 2 or 1 ULPs as detailed above.

The significance of this is that the error introduced from serialization/deserialization should also be constrained within those ULPs as well. Assuming the serialization/deserialization library you're using doesn't itself introduce imprecision.

Can your code live with 1 or 2 ULPs of imprecision? I don't know the answer to that but I'll help however I can. Speaking of which, I'll try to merge the unit test code addition in so you can see the actual code I'm using to test this.

opera-aberglund commented 7 months ago

That's interesting! I just ran another test and here's an example of what I got:

Initial value (from the u value of a distance joint): (0.774356544, -0.632749438). After calling UnitVec::Get: (0.774356663, -0.632749497).

I guess this is within 2 and 3 ULPs? Interestingly, calling UnitVec::Get again recursively gave the exact original values from u.

The serialization library (Cista) I'm using doesn't itself introduce any imprecision. That's a valid question though about the significance, and it's hard to tell if this can lead to de-sync issues in practice or not, but I'd like to minimize the risk as much as possible of course.

I'm currently using my own fork of PlayRho where I've made the initialization constructor public just so I can use it in deserialization. What is your opinion on this? I totally understand why it's private right now because you want to protect the invariant, but what if my serialization library could be considered a "friend" class that has special access?

I'm hoping I can get it open sourced soon as well! Would be really nice to get some more feedback from you on how I handle the objects, and perhaps we can work more closely together in the future.

louis-langholtz commented 7 months ago

I think there's a space for an interface behavior change that's more apt to take the values without change. I've reopened this issue to try and support that and let you know more when that's available.

louis-langholtz commented 7 months ago

Interestingly, calling UnitVec::Get again recursively gave the exact original values from u.

Ideally IMO, ~~Get~~UnitVec(const underlying_type& v) would always do that.

louis-langholtz commented 7 months ago

@opera-aberglund PR #577 provides a constructor a bit like you were asking for. It expects the value as a Vec2, expects that its magnitude is within 2 ULPs of 1.0f, and will throw an exception if the magnitude is not. This way the invariant of UnitVec is enforced while providing a way to exactly deserialize a UnitVec. This essentially works via the recognition that std::cos and std::sin always provide a resulting vector with a magnitude that is within 2 ULPs of 1.0f. So unitVec.GetX() and unitVec.GetY() in turn do and recreating unitVec via this new constructor is exact.

A difference from what you've mentioned is that the original constructor, that does no checks and has that's been private, now - internally - requires the passage of the object of an internal type (let's call this "tagged dispatch") that only member functions can use.

louis-langholtz commented 7 months ago

I'm hoping I can get it open sourced soon as well!

That sounds exciting!

Would be really nice to get some more feedback from you on how I handle the objects,

Happy to provide feedback. Not certain about what you'd like to get feedback on though.

Do you mean on the discussion (under Discussions) on maintaining the same object IDs? Or something else like on the code you're hoping to open source?

and perhaps we can work more closely together in the future.

I'd be happy to do that also.

I guess this is within 2 and 3 ULPs?

A way I like determining that is by using playrho::AlmostEqual. That just uses std::nextafter under the hood. It applies std::nextafter up to X number of times, where X is given by the ulp parameter, and early out occurs if equality is had. See TEST(UnitVec, GetCosSinIsReversibleWithTwoUlps) for an example use case.

There's other ways of doing it too of course, like type punning methods. I'm not as fond of those however. I suspect for more than an ULP or two, the union-using difference type punning method would have to be faster than iteratively calling std::nextafter. Then again, I haven't looked at the assembly the compiler generates in either of these case and haven't compared these methods using the benchmark framework either.

louis-langholtz commented 7 months ago

Assuming the merged changes suffice. Closing this issue then. Please re-open this if it doesn't help enough or open a new issue.

louis-langholtz / PlayRho

Make UnitVec initializing constructor public #575