RandyGaul / qu3e

Lightweight and Simple 3D Open Source Physics Engine in C++
zlib License

Crash With Tree Balance. #20

Closed PhilCK closed 8 years ago

PhilCK commented 8 years ago

After a few hours of automated play I get a bad access crash in q3DynamicAABBTree::Balance:

i32 iB = A->left; // Has value 339247232
i32 iC = A->right; // Has value 292
Node *B = m_nodes + iB;
Node *C = m_nodes + iC;

i32 balance = C->height - B->height;

This is on OSX, but I don't think that makes a difference here. The game and physics are quite solid, although I do sometimes get tunneling; that is likely due to the debug build's framerate being lower than it should be, rather than anything else.

Update: This also happens on Windows. I'm trying to confirm right now that it's not me doing something dumb.

PhilCK commented 8 years ago

I don't know if it's important, but the AABB looks quite small.

[Screenshot: screen shot 2016-08-27 at 08 51 30]

RandyGaul commented 8 years ago

Hmm, this is pretty interesting. It could be a logic error, or some kind of edge-case bug involving uninitialized memory or heap corruption. What exactly does your simulation do when running for a few hours? Balance happens upon node movement (removal or insertion), and these movements can be triggered by creating, deleting, or moving rigid bodies around in the world.

It does look very bad that the AABBs are so tiny. I'm wondering how this happened... Are you creating very tiny shapes? Also, did you redefine r32 to be double precision? I'm wondering why there are so many digits of precision in your debug view (I'm assuming that's the Xcode view). Also, are you compiling in 32-bit or 64-bit mode?

RandyGaul commented 8 years ago

Oh and would you be able to post a screenshot of the call stack upon crash, just for completeness?

PhilCK commented 8 years ago

There's a lot happening in the simulation. I originally had some custom code doing a simple AABB sweep and prune, but since the core code is shared I'm using the same code for a very simple FPS. The net result is that this might not quite be the right solution: I'm pretty much using qu3e for the collision part and dealing with the response manually. Here's a video of what's going on.

Box Def: [image: box_def]

For physical entities the game world ranges from +40 to -90 on the z-axis and from -7 to +7 on the x/y-axes. Most objects are unit cubes; I verified using qu3e's debug renderer that none are smaller than that.

Xcode is 64-bit, but MSVS is 32-bit. I can get the same call stack, but I also get another one; both come from Step() (which sounds very much like corruption). I don't think the corruption is my fault, but I'm trying to validate that. I allocate all my memory up front and work within those bounds; the only heap allocations done after that are to do with physics and fonts.

CallStack: [image: callstack]

PhilCK commented 8 years ago

I'll see if I can repurpose one of the demo apps and try to repro, once I'm certain I'm not doing something wrong with allocations on my part.

PhilCK commented 8 years ago

I'm not redefining r32 to be double; I'm unsure why the debugger shows it like that.

RandyGaul commented 8 years ago

Okay, well if you can ever get a repro please let me know. For example, if you can get a qu3e dump (scene->Dump( fp )) then I could likely solve this problem. The idea is to capture a dump just before the crash, such that I can load up the dump and see the crash occur moments later. Either this, or repro steps so I can create my own dump file would be the next step.

PhilCK commented 8 years ago

Actively working on this, but I'm in the middle of moving house. Since it takes me at least a few hours to repro in my own app, I'm trying to repro and then run it through the night, so progress is slow. I'm littering the code with debug checks right now (as conditional breakpoints seem to be very expensive); hopefully that'll pry some information out.
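A debug check of the kind described here could look roughly like the sketch below. It is only an illustration: the `Node` struct, `IsValidIndex`, and `ChildrenInRange` are hypothetical names, not qu3e's actual types or API, and the check assumes child links are stored as indices into one contiguous node array, as the `m_nodes + iB` lines in the crash snippet suggest.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a dynamic AABB tree node; NOT qu3e's real layout.
struct Node {
    std::int32_t left;    // index of the left child in the node array
    std::int32_t right;   // index of the right child in the node array
    std::int32_t height;  // height of the subtree rooted at this node
};

// True when idx can safely be used to address an element of an array
// holding `capacity` nodes.
inline bool IsValidIndex(std::int32_t idx, std::int32_t capacity) {
    return idx >= 0 && idx < capacity;
}

// True when node iA and both of its child links are in range.
// Intended to be called on an internal node just before its children
// are dereferenced, e.g. at the top of a balance routine.
inline bool ChildrenInRange(const std::vector<Node>& nodes, std::int32_t iA) {
    const std::int32_t capacity = static_cast<std::int32_t>(nodes.size());
    if (!IsValidIndex(iA, capacity)) {
        return false;
    }
    const Node& a = nodes[static_cast<std::size_t>(iA)];
    return IsValidIndex(a.left, capacity) && IsValidIndex(a.right, capacity);
}
```

Wrapping a call such as `ChildrenInRange(nodes, iA)` in an `assert` stops at the first out-of-range link (like the left value 339247232 reported above) without the cost of a conditional breakpoint. It reports where the bad index is first read, not where it was written, so it narrows the search rather than identifying the source of the corruption.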

PhilCK commented 8 years ago

I've not forgotten about this issue. I've been unable to make it crash in your test app. I recreated the test app cases in my game engine, and they run for hours and hours with no issue, so it's something about this particular use case.

RandyGaul commented 8 years ago

Sounds to me like heap corruption. You may be overwriting a pointer somewhere. Are you using the userdata in qu3e at all? Perhaps an incorrect typecast somewhere? Just some ideas. If you make any more progress on this feel free to open this issue back up.

PhilCK commented 8 years ago

Thanks @RandyGaul, it does sound like heap corruption, but I haven't been able to trace it yet. Yes, I use the userdata, but I use it to store instance IDs rather than objects. Will re-open if I can repro this in the test app.

RandyGaul commented 8 years ago

Are you building in 32-bit and typecasting the void* to int64? This could overwrite 4 bytes. Just guessing!
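To make the guess concrete: on a 32-bit build a `void*` slot is 4 bytes wide, so writing an 8-byte integer through the slot's address would spill 4 bytes into whatever sits next to it. Below is a minimal sketch of one way to avoid that when storing an instance ID in a userdata pointer, by converting by value through `std::uintptr_t`. The names `InstanceId`, `IdToUserData`, and `UserDataToId` are hypothetical and not part of qu3e, and the sketch assumes the ID is an unsigned 32-bit integer.

```cpp
#include <cstdint>

// Hypothetical ID type; the thread does not say what type the IDs really are.
using InstanceId = std::uint32_t;

// The ID must fit in a pointer on every target, including 32-bit builds.
static_assert(sizeof(InstanceId) <= sizeof(void*),
              "InstanceId must fit inside a void*");

// Convert by value through uintptr_t, which is pointer-sized by definition,
// so no bytes outside the pointer slot are ever written.
inline void* IdToUserData(InstanceId id) {
    return reinterpret_cast<void*>(static_cast<std::uintptr_t>(id));
}

// Recover the ID from a userdata pointer produced by IdToUserData.
inline InstanceId UserDataToId(void* userData) {
    return static_cast<InstanceId>(reinterpret_cast<std::uintptr_t>(userData));
}
```

The integer-to-pointer round trip is implementation-defined in C++, but it preserves the value on mainstream 32-bit and 64-bit platforms. The pattern to avoid is something like `*reinterpret_cast<std::int64_t*>(&slot) = id` where `slot` is a `void*`, since on a 32-bit target that writes past the end of the slot.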