BigUglySpider closed 2 years ago
This is most likely an issue with cache locality, as suggested in the initial post.
When the conversion to radians is done outside of the function, so that the function receives radian input, its speed once again matches that of passing the data directly and indicating that it is in degrees.
This shows that the problem is not necessarily with Quaternion, but more likely with the test itself, due to the placement of values in the cache whose copies are being omitted by the release compiler.
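The two input paths compared above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Quat`, `Euler`, `fromEulerRadians`, `fromEulerDegrees`, `kDegToRad` and `timeBatchNs` are hypothetical names rather than the library's actual API, and the roll/pitch/yaw (x, y, z) composition order is assumed, not taken from the project.

```cpp
#include <chrono>
#include <cmath>
#include <vector>

// Hypothetical stand-ins for illustration; not the library's real types.
struct Quat  { float w, x, y, z; };
struct Euler { float x, y, z; };

constexpr float kDegToRad = 3.14159265358979323846f / 180.0f;

// Path A: the caller has already converted the angles to radians.
// Assumed convention: x = roll, y = pitch, z = yaw.
inline Quat fromEulerRadians(Euler e) {
    const float cx = std::cos(e.x * 0.5f), sx = std::sin(e.x * 0.5f);
    const float cy = std::cos(e.y * 0.5f), sy = std::sin(e.y * 0.5f);
    const float cz = std::cos(e.z * 0.5f), sz = std::sin(e.z * 0.5f);
    return { cx * cy * cz + sx * sy * sz,
             sx * cy * cz - cx * sy * sz,
             cx * sy * cz + sx * cy * sz,
             cx * cy * sz - sx * sy * cz };
}

// Path B: the function receives degrees and converts internally.
inline Quat fromEulerDegrees(Euler e) {
    return fromEulerRadians({ e.x * kDegToRad, e.y * kDegToRad, e.z * kDegToRad });
}

// Times one pass over the input, in nanoseconds. The results are folded
// into `sink` so that a release build cannot discard the conversions.
template <typename Fn>
double timeBatchNs(const std::vector<Euler>& in, Fn convert, float& sink) {
    const auto start = std::chrono::steady_clock::now();
    float acc = 0.0f;
    for (const Euler& e : in) {
        const Quat q = convert(e);
        acc += q.w + q.x + q.y + q.z;
    }
    const auto stop = std::chrono::steady_clock::now();
    sink += acc;
    return std::chrono::duration<double, std::nano>(stop - start).count();
}
```

Calling `timeBatchNs` once with `fromEulerDegrees` on degree data and once with `fromEulerRadians` on pre-converted data gives the two timings being compared. A single pass like this is noisy, so any real measurement would want repeated runs and a look at the generated code.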
There's not much that can be done about this for the time being due to other priorities. However, this is not a final answer and should be kept in mind; the disassembly of the release build of the Quaternion from-Euler conversion function should be reviewed at some point.
Nonetheless, this issue is being closed for now.
Notably, this issue does not occur on the new testing hardware (AMD Ryzen 9 5950X). It may be an Intel thing, an i5-specific thing, or an i5-8400-specific thing; not too sure.
It is unlikely to have been caused by the smaller cache of the previous hardware, since the bandwidth of this operation is nowhere near large enough for that to be likely.
In speed tests as of commit f473c9f0854951092364b32640642d09176461b3, an unusual property has been found with the scalar Quaternion performance in release builds:
FastQuaternion (well done, compiler!)
This is especially unusual given two things:
static_casts to local members.