AdvancedPhotonSource / tike

Repository for ptychography software
http://tike.readthedocs.io

NEW: Use CuPy fuse to merge some reduction kernels #290

Closed carterbox closed 1 year ago

carterbox commented 1 year ago

Purpose

Use CuPy fuse to merge more function calls into single kernels. This reduces both the number of intermediate memory arrays and the number of kernel launches.
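As a rough illustration of the idea (not the code changed in this PR), the sketch below compares an unfused reduction with one decorated by `cupy.fuse()`. The function names `squared_error_unfused` and `squared_error_fused` are hypothetical, and the NumPy fallback is an assumption added only so the snippet also runs on a machine without a GPU; in that case no fusion happens.

```python
import numpy as np

try:
    import cupy as xp
    # Assumption: treat a missing driver or zero devices as "no GPU".
    if xp.cuda.runtime.getDeviceCount() < 1:
        raise RuntimeError("no CUDA device available")
    fuse = xp.fuse
except Exception:
    # Fallback for illustration only: plain NumPy, no kernel fusion.
    xp = np

    def fuse(*args, **kwargs):
        def decorator(func):
            return func
        return decorator


def squared_error_unfused(a, b):
    """Each step launches its own kernel and allocates a temporary array."""
    diff = a - b           # temporary array 1
    squared = diff * diff  # temporary array 2
    return xp.sum(squared, axis=-1)


@fuse()
def squared_error_fused(a, b):
    """With CuPy, the elementwise steps and the trailing sum are compiled
    into one kernel, so the full-size temporaries are not materialized."""
    diff = a - b
    return xp.sum(diff * diff, axis=-1)


def to_numpy(array):
    """Copy a CuPy result back to the host; NumPy arrays pass through."""
    return array.get() if hasattr(array, "get") else np.asarray(array)


a = xp.arange(6, dtype=xp.float64).reshape(2, 3)
b = xp.ones((2, 3), dtype=xp.float64)
unfused_result = to_numpy(squared_error_unfused(a, b))
fused_result = to_numpy(squared_error_fused(a, b))
```

Both functions should return the same values; the difference is only in how many kernels are launched and how much intermediate memory is allocated on the GPU.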

Approach

On a benchmarking dataset, the memory profile starts with 15.29 GB of GPU memory acquired. After this PR, only 13.88 GB is acquired, a reduction of 1.41 GB (about 9%).

Pre-Merge Checklists

Submitter

Reviewer