godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License
91.1k stars 21.18k forks source link

Game freezes and causes memory leak when changing spatial node's scale #38142

Closed NesterGlover closed 4 years ago

NesterGlover commented 4 years ago

Godot version: Godot 3.2.1

OS/device including version: Windows 10 Home 64 bit

Issue description: I wanted to make a simple animation through code for spike obstacle in my game, where spikes's scale would change using custom easing function. How it looks in GDScript:

func _physics_process(delta):
    # ... some logic code ...
    var s = _ease_out_back(t)
    _mesh.scale = Vector3.ONE * s
    # the value of t is in range from 0 to 1, _mesh is a reference to MeshInstance node

func _ease_out_back(t):
    t = clamp(t, 0, 1)
    t -= 1
    return 8.7*t*t*t + 7.7*t*t + 1

However, the moment this code is executed, the game suddenly freezes and quickly starts consuming a lot of RAM to the point my OS becomes very unresponsive.

Steps to reproduce: Open minimal reproduction project and run. The game will freeze at boot splash screen and will start consuming RAM, so please be careful and close the scene from the editor after several seconds. The scene named Sphere contains code from example above, and I noticed that the game runs fine if Sphere's translation is all zeros and the problem only occurs when translation has certain non-zero value (in the project it is (0, -1.592, 0)).

Minimal reproduction project: minimum_reprod_project.zip

qarmin commented 4 years ago

There is very deep recurrence in Octree::_insert_element Zrzut ekranu z 2020-04-23 18-17-36

NesterGlover commented 4 years ago

So, looks like zero scale of the mesh and its bounding box is causing continuous insertion into octree? Interesting, well, the possible workaround would be then to avoid pure zero values by clamping minimum return value of the easing function like so:

func _ease_out_back(t):
    t = clamp(t, 0.001, 1)
    t -= 1
    return 8.7*t*t*t + 7.7*t*t + 1

Just tested this solution and it seems to work

briansemrau commented 4 years ago

I am experiencing this in my own project, though I haven't been able to find the exact cause within my scene (lots of changing ImmediateGeometry nodes, procedural terrain, imported meshes). When dealing with lots of dynamic geometry, I don't want to rely on workarounds to avoid major memory leaks.

Running my project with the VS debugger (latest 3.2 build as of writing) shows me the exact issue qarmin shows. It's trying to insert into an octree an element with an AABB with a very small size (largest length < ~1e-11) and ~never halting recursion (see code). Comparing 3.2 and master, the offending code appears to be unchanged, unless 4.0 happens to now have checks preventing small AABBs.~ I'll see if I can reproduce the issue there.

I'm willing to implement a bugfix for both 3.2 and master, but I would need advice on how to properly fix this. Should we use AABB.grow_by(...) before inserting it into the octree? Should the octree have a maximum depth?

Edit: It appears the recursion isn't endless and stops no further than ~50 deep (as expected for the octree to reach ~1e-15 octant size). However, I'm noticing that p_element.octant_owners is aquiring millions of entries. Hopefully I can find what's really causing the problem soon.

briansemrau commented 4 years ago

Reproduced in master after recreating the minimal reproduction project

Calinou commented 4 years ago

cc @reduz, as he implemented the octree.

briansemrau commented 4 years ago

This appears to be a floating point precision issue. In my testing, setting the scale to 0.0 doesn't explode the octree, but the ease function in the minimal reproduction actually produces a value of 9e-16. The only way I can see that one element is being assigned to millions of octants upon insertion (as I saw in my testing) is that tiny octant AABBs under a certain size are rounding to equivalent values, especially because reproduction conditions require a nonzero geometry translation.

Solution proposal: In my testing this is fixed by ending recursion when octant size < 1e-9 (arbitrarily), but for a fix I suggest we have that number as high as reasonably possible to avoid errors far from the origin, say 1e-4. Because I'm unfamiliar with octree frustum culling (I assume?), I'd like input whether that number is appropriate, and if it could be larger. However, this won't fix Basis::invert yelling at you for having a zero determinant. But that's already a separate issue.

Unnecessary calculations (for fun): I estimate that the octree can safely handle a maximum recursion depth of ~24 assuming a maximum translation of 1. To reach the geometry AABB size of 9e-16, the octree recurses 49 deep. This means that in the minimal reproduction, the octree insertion will only be complete when at least 8^25 bytes, or 10YB of RAM is consumed.

lawnjelly commented 4 years ago

The correct solution to bugs with zero / near scale scale is to test and fix all bugs, not to avoid these scale values.

Presumably items that are small enough should not be in the octree, but failing that a minimum size would probably work (versus a constant expansion). The minimum size should probably actually be quite large in order to prevent too deep an octree (the octree is itself presumably an optimization, but there will be diminishing returns).

lawnjelly commented 4 years ago

Just tested and as expected, this is fixed by #41123 (at least with balance values above 0.0, I dare not try 0.0 as it crashed my OS when I tried it before the fix! :grin: ).

Noteworthy though that this minimum reproduction project only tests the visual server octree. The same bug could occur in other octrees, at present the visibility notifier octree and the godot physics. These are not currently fixed, but now we know what is causing it we can either swap these to use the new octree or add a hard limit to octree depth.

akien-mga commented 4 years ago

Fixed by #41123.