Closed lastephey closed 1 year ago
Based on current profiling, memory transfer looks like a small fraction of the runtime.
During the hackathon it was happening after each patch. Now it happens at the end of each bundle, which is a different situation.
If memory transfer becomes a bottleneck, consider pinned memory.
Add a comment in the code to remind future us that pinned memory is an option.
Once the CPU version is fully ready, use pinned (page-locked) host memory via CuPy for the GPU transfers at the end of each patch and bundle. We already learned to do this in the March 2020 hackathon, so we can apply the same strategy.
This should substantially speed up memory movement at the end of each patch/bundle since we'll be using preallocated pinned memory instead of pageable memory.
Note that this requires Issue #4 and Issue #5 to be complete, since we cannot preallocate the pinned memory until we know the exact array sizes we need. It also requires the array size to remain constant in every patch.
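
For reference, here is a minimal sketch of what preallocating a pinned host buffer with CuPy might look like. The helper names `pinned_empty` and `copy_to_host` are hypothetical and not part of this codebase. The sketch assumes the output array shape is known up front and stays fixed, and it falls back to ordinary pageable NumPy memory when CuPy or a CUDA device is not available.

```python
import numpy as np

try:
    import cupy as cp
except Exception:  # CuPy missing or unusable: fall back to pageable NumPy memory
    cp = None


def pinned_empty(shape, dtype=np.float64):
    """Hypothetical helper: host array that is page-locked when CuPy and a GPU are usable."""
    dtype = np.dtype(dtype)
    count = int(np.prod(shape))
    if cp is not None:
        try:
            # Allocate page-locked host memory and view it as a NumPy array.
            mem = cp.cuda.alloc_pinned_memory(count * dtype.itemsize)
            return np.frombuffer(mem, dtype, count).reshape(shape)
        except Exception:
            pass  # no usable CUDA device; fall through to pageable memory
    return np.empty(shape, dtype=dtype)


def copy_to_host(device_array, host_buffer):
    """Hypothetical helper: copy an array into a preallocated host buffer."""
    if cp is not None and isinstance(device_array, cp.ndarray):
        # Device-to-host copy directly into the preallocated buffer.
        device_array.get(out=host_buffer)
    else:
        # CPU path: plain element-wise copy into the buffer.
        host_buffer[...] = device_array
    return host_buffer
```

The buffer would be allocated once, before the patch loop, and reused for every transfer, which is why the array size has to stay constant.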