AdvancedPhotonSource / tike

Repository for ptychography software
http://tike.readthedocs.io
Other
29 stars 15 forks source link

Fix memory leak due to CuPy automatic FFT Cache #192

Closed carterbox closed 2 years ago

carterbox commented 2 years ago

Purpose

Fixes #187. Because we use a modified ThreadPool.map() to ensure that CuPy arrays are always operated on in the correct device context, the automatic FFT plan cache from CuPy does not work correctly. This is because the cache is not (cannot be?) cleared automatically when the thread pool is destroyed because the threads don't know when they will be destroyed, so it exists beyond the life of the threads in the ThreadPool. This leads to a memory leak because if a new Comm object is created, then new threads create new FFT caches that are never destroyed. There is no way to disable the automatic cache globally (that I know of).

Approach

We solve this problem by reverting to our custom cache which only caches per device and is directly tied to the life of the Operator using the FFTs. This way, the cache is always destroyed when the Operator goes out of scope regardless of whether the threads are alive/dead.

Pre-Merge Checklists

Submitter

Reviewer

pep8speaks commented 2 years ago

Hello @carterbox! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! :beers:

Comment last updated at 2022-02-16 21:42:08 UTC
stevehenke commented 2 years ago

This bug fix introduces a custom FFT cache to fix a memory leak. The code modifications are clean and tests are passing on "fish". Thank you for your contribution!