NVIDIA / warp

A Python framework for high performance GPU simulation and graphics
https://nvidia.github.io/warp/

[QUESTION] Deallocate/Free Memory from warp.sim #290

Closed knauth closed 3 months ago

knauth commented 3 months ago

After a sim is completed in warp.sim, how can I free the memory it allocates? Deleting the Python object doesn't do anything, nor does letting it fall out of scope. I can't find anything related to freeing memory in the docs.

shi-eric commented 3 months ago

Hi @knauth, can you please provide a small example that illustrates the issue?

knauth commented 3 months ago

Hey @shi-eric - sure. I have a "MeshSim" object which builds and simulates a model:

class MeshSim:
    def __init__(self, mesh_path: str):
        self.bd = wp.sim.ModelBuilder()
        self.mesh_points, self.mesh_indices = wp.sim.load_mesh(mesh_path)
        self.mesh_points = [[y, z, x] for x, y, z in self.mesh_points]
        mesh = wp.sim.Mesh(self.mesh_points, self.mesh_indices)

        mi = self.bd.add_body(origin=wp.transform([0, 20, 0], wp.quat_identity()))
        ms = self.bd.add_shape_mesh(mi, mesh=mesh, scale=[0.1, 0.1, 0.1])

        self.model = self.bd.finalize()

        # etc...

I run that sim using the usual methods, here's how it's called in my main.py:

ms = MeshSim(meshPath)
transform = ms.run(500)

I then want to run another sim constructed the same way. The problem is that this sim uses ~4 GB of VRAM (I have manually set the memory allocated as part of a workaround for the quadratic memory requirements of ModelBuilder.finalize()). I'd like to release that memory so it can be used by the second sim. I've tried placing the class instantiation in a function so that the instance goes out of scope, and manually deleting the object using del. Nothing I've tried gets Warp to release that GPU memory so it can be reallocated, short of killing the Python process itself, which is not something I want to do in production.

Is there a way to manually instruct warp to free/deallocate the memory the Model allocates?

knauth commented 3 months ago

For clarification: my workaround only involves manually setting num_rigid_contacts_per_env and then a small tweak to the broadphase kernel to avoid keeping the shape-shape Cartesian products in memory. There are no changes to how the actual memory is allocated/deallocated.

shi-eric commented 3 months ago

Hey @knauth, as an experiment can you add a wp.synchronize() after the del ms? I wonder if you're seeing the effects of the CUDA stream-ordered memory allocator not releasing memory until a stream-synchronization event occurs.

For debugging purposes you can also print out the value of wp.get_device().free_memory / (1024*1024*1024) at various points to get the free device memory in GiB.
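Putting the two tips together, the pattern looks roughly like the sketch below. `vram_snapshot` is a hypothetical helper wrapping the debugging tip above; only `wp.synchronize()` and `wp.get_device().free_memory` are actual Warp APIs mentioned in this thread.

```python
GIB = 1024 ** 3  # bytes per gibibyte


def vram_snapshot(device, label: str = "") -> float:
    """Print and return free device memory in GiB.

    `device` only needs a `free_memory` attribute reporting bytes,
    which is what Warp's wp.get_device() provides.
    """
    gib = device.free_memory / GIB
    print(f"{label} free: {gib:.2f} GiB")
    return gib
```

In the original setting the sequence would then be: construct and run the sim (`ms = MeshSim(mesh_path)`; `ms.run(500)`), take a snapshot with `vram_snapshot(wp.get_device(), "before del")`, then `del ms` followed by `wp.synchronize()` so the stream-ordered allocator actually returns the freed blocks to the device, and finally snapshot again to confirm the memory came back.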

knauth commented 3 months ago

@shi-eric, excellent! This resolves the issue and the memory is properly freed after the Python object is deleted. Thanks for the help.

shi-eric commented 3 months ago

Awesome! Glad to know it helped!