afritz1 / OpenTESArena

Open-source re-implementation of The Elder Scrolls: Arena.
MIT License
915 stars 68 forks source link

direction.isNormalized failure locks game #174

Closed Digital-Monk closed 1 year ago

Digital-Monk commented 4 years ago

0.11.0, built locally with -march=native and -Ofast. Goofing around in lava pit around Jagar Tharn, enjoying the modern interface mouselook system. Normally would have chalked this up to rendering thread race/lock, but I'm running Very Low threading (single, I believe) and resolution scale 0.25 to compensate.

After I've been running around peering in various places for awhile, the screen locks up. The console where I ran it reports:

[Entities/Player.cpp(344)] Error: Assertion failed: "direction.isNormalized()"

This is the third time in two days this has happened. Not sure what's up -- I'm not looking especially far up or down or anything. Slow accumulated error creep in modified quaternion? I know normalization can be expensive, but I also know it's generally a good idea to re-normalize after ever transform, because there's always a slight and unavoidable rounding errror...

afritz1 commented 4 years ago

I've seen this bug happen a couple times when testing on Linux. I think it hits the assertion because NaNs get into the camera rotation values. I can look at it again sometime soon.

Digital-Monk commented 4 years ago

Yep... [-NaN, -NaN, -NaN] Will see if I can trace down, since it seems to be hitting me more frequently ;)

Digital-Monk commented 4 years ago

Heh. Well, I'm not sure exactly how it happens, but I can recreate it at will. In the modern interface, press left and right at the same time.

My guess is that it adds the left vector to accelDirection and the right vector to accelDirection, gets a zero-length vector, tries to normalize that which results in dividing all three elements by 0, and down we go.

Digital-Monk commented 4 years ago

I tried to fix this by fixing Vector2's and Vector3's normalized() function to notice if lenRecip was not finite, and to return Vector_::Zero in that case. Didn't help.

So, I brute forced it: In GameWorldPanel.cpp vaguely around line 1300 there are two blocks that read the WASD scancodes to get directional booleans. After each of those blocks, I added:

        if (forward && backward)
        {
            forward = false;
            backward = false;
        }
        if (left && right)
        {
            left = false;
            right = false;
        }

So that pressing opposing movement keys gets cancelled out immediately instead of relying on complicated or borderline math cases.

afritz1 commented 4 years ago

I think I would try using std::isfinite() or related functions instead, since I do that in other places.

afritz1 commented 4 years ago

It is a small pet peeve of mine, ever since I started PC gaming, that most games favor one axis over the other, like W over S and A over D for no reason other than that's the order the programmer wrote the if-else's in. I'm trying to solve that problem here by allowing both buttons to be pressed at the same time and handling whatever happens in that case. Instead of saying "neither buttons are pressed", I want to let the math happen and then sanitize the math if there are issues with it.

afritz1 commented 4 years ago

It's a little weird this crash is happening because it means that, with GCC's fast math, this expression is true:

std::isfinite(accelDirection.length())

But this one is false:

direction.isNormalized()

Maybe the conditions at line 1315 and 1382 should be changed from std::isfinite() to accelDirection.isNormalized().

Edit: the main reason I don't want to set the input booleans to false is because NaNs/etc. might still somehow get into accelDirection even when only pressing one input. So we might as well do the math check -- either std::isfinite() or isNormalized(). In this case, it looks like it should be isNormalized().

Digital-Monk commented 4 years ago
std::isfinite(accelDirection.length())

will be true for a 0-length vector, but naively normalizing it will divide-by-zero and spit out NaNs.

I agree that fixing it system-wide is preferable, so that any source of broken math gets captured and handled sanely. Unfortunately...

I tried to fix this by fixing Vector2's and Vector3's normalized() function to notice if lenRecip was not finite, and to return Vector_::Zero in that case. Didn't help.

OK, I should have clarified that. Inside normalized(), after lenRecip is calculated, I checked std::isfinite(lenRecip) so that if it had gone crazy (say, because length was ~0 and we got an infinity, or anything else led to a NaN) then normalized() would return Zero. That approach didn't fix the problem, and I'm not sure why.

That's why I just fixed it at the input stage. There is no preference for one axis over another or one half-axis over another -- if exactly opposing inputs are given, both are cancelled. It is conceptually equivalent to letting the math cancel out.

If you wanted to handle this mathematically instead of with logic, you could do something like this (avoiding conditionals requires reliance on bool->int casting hackery, but if you find a C++ compiler today that doesn't use 0 for false and 1 for true, I'd be astounded):

int groundAccel = static_cast<int>(inputManager.keyIsDown(SDL_SCANCODE_W))
                - static_cast<int>(inputManager.keyIsDown(SDL_SCANCODE_S));
int rightAccel = static_cast<int>(inputManager.keyIsDown(SDL_SCANCODE_D))
               - static_cast<int>(inputManager.keyIsDown(SDL_SCANCODE_A));

groundAccel will now be -1, 0, or 1 (backwards, nothing, forwards) and similarly for rightAccel. So:

accelDirection = (groundAccel * groundDirection3D)
               + (rightAccel * rightDirection);

That keeps the cancellation math in integers where it works, instead of in floating point where it is notorious for not-quite-working (especially when using fast math).

On the whole, though, gathering user input occurs so infrequently that it is silly of me to be looking at performance optimization of user input, and clear logic statements of intent would be better -- hence why I just put the direct opposing action cancellation logic in place.

afritz1 commented 1 year ago

Don't think this is a problem anymore.