dimforge / nphysics

2 and 3-dimensional rigid body physics engine for Rust.
https://nphysics.org
Apache License 2.0
1.62k stars 121 forks source link

[Critical] Memory leak in 0.9.1 #136

Closed n0uk closed 6 years ago

n0uk commented 6 years ago

Hi!

Sample code: https://github.com/n0uk/nphysics-0_9_1-memleak

It looks like adding and removing bodies produce a memory leak.

n0uk commented 6 years ago

After some investigation i found ncollide::utils::IdAllocator::free() method is never called in narrow phase, so IdAllocator.generator grows infinitely.

sebcrozet commented 6 years ago

Thank you for reporting this and identifying the source of the issue! This will be fixed on ncollide. Unfortunately, fixing this will require a small breaking change (adding a method to the ContactManifoldGenerator trait).

n0uk commented 6 years ago

I'm pull on memleak branch of ncollide, but it's still leaking.

I think issue here: default_narrow_phase.rs:

                    // Proximity stopped.
                    if let Some(mut detector) = self.contact_generators.remove(&key) {
                        // Register a collision lost event if there was a contact.
                        if detector.num_contacts() != 0 {
                            contact_events.push(ContactEvent::Stopped(co1.handle(), co2.handle()));
                            detector.clear(&mut self.id_alloc);
                        }
                    }

I'm not going deeper, but sometimes num_contacts() is zero, but here are cached contacts and if i move call off detector.clear out of condition, all works fine.

Sorry for my English :)

sebcrozet commented 6 years ago

You are very right! Thank you, I've applied the change you suggest.

Since this is a breaking change, I'd like to postpone the release of the patched version of ncollide (to wait for other breaking change I intend to make in the near future). Let me know if waiting is too inconvenient to you @n0uk .

n0uk commented 6 years ago

@sebcrozet No, I'm already using the patch on production, so can wait forever :). I'm working on .io games and server written on rust + nphysics, that's why I pay so many attention to leaks. My previous game server can work stably for months (using nphysics 0.6), in the new game I'm use updated nphysics and meet oom killer every day. On the "memleak" branch of ncollide it works for two days before killed, so there is still leaks but not sure it's ncollide/nphysics issue (try to investigate on the next week).

It would be great to have some debug counters for memory together with performance counters, just len() of all slabs, and count of all objects.

sebcrozet commented 6 years ago

Sorry for the regression! Those memleaks are indeed problematic and quite critical. I will dedicate time trying to track them down at the end of next week. Thank you for helping investigating this!

sebcrozet commented 6 years ago

@n0uk I've spent the morning searching for more meamleaks but unfortunately did not found any good lead. I tried to observe the memory evolution under adding and removing bodies at a high frequency, for a long time, but without success (the scenarios are basically variations of: https://pastebin.com/t08bDgfu). Here is what could help finding those leaks:

n0uk commented 6 years ago

@sebcrozet Sorry, looks like this time it's not nphysics/ncollide issue, because i can't reproduce it with sample project. Also, i can't reproduce it on local server, only when there many online players doing actions (creating/removing rigidbodies, moving). I'm working on integration memory counters, to get snapshots from the servers, so i'll know for sure soon. I'm use only 2d dynamic/static rigidbodies/sensors, moving/rotating them and querying (raycasts/aabb).