Open shoaibkamil opened 4 years ago
Have we ever refactored the cache code into something common, or are we reimplementing it in each GPU backend?
5000 :fireworks:
Looking at how the CUDA allocation cache works, I think we could refactor it into a generic helper that other backends can use. That way something is readily available to everybody, and a backend that wants to be fancy (sub-allocation management and the like) still has the option to specialize it.
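To make the idea concrete, here is a rough sketch of what such a generic helper could look like. It is not the existing CUDA cache code: `AllocationCache`, `acquire`, `release`, and `clear` are hypothetical names, and the backend is assumed to supply its own allocate and free callbacks. It only reuses blocks of exactly the same size and is not thread-safe, so a real version would need locking and probably a size limit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <map>
#include <utility>

// Hypothetical backend-agnostic allocation cache. Handle is whatever the
// backend uses to name a device allocation (a pointer, an integer id, ...).
template <typename Handle>
class AllocationCache {
public:
    using AllocFn = std::function<Handle(std::size_t)>;
    using FreeFn = std::function<void(Handle)>;

    // The backend passes in its real allocate/free entry points.
    AllocationCache(AllocFn alloc, FreeFn free_fn)
        : alloc_(std::move(alloc)), free_(std::move(free_fn)) {}

    ~AllocationCache() { clear(); }

    AllocationCache(const AllocationCache &) = delete;
    AllocationCache &operator=(const AllocationCache &) = delete;

    // Return a cached block of exactly `size` bytes if one is free,
    // otherwise fall through to the backend allocator.
    Handle acquire(std::size_t size) {
        auto it = free_blocks_.find(size);
        if (it != free_blocks_.end()) {
            Handle h = it->second;
            free_blocks_.erase(it);
            return h;
        }
        return alloc_(size);
    }

    // Keep the block for later reuse instead of freeing it immediately.
    void release(Handle h, std::size_t size) {
        free_blocks_.emplace(size, h);
    }

    // Hand every cached block back to the backend.
    void clear() {
        for (auto &entry : free_blocks_) free_(entry.second);
        free_blocks_.clear();
    }

    std::size_t cached_blocks() const { return free_blocks_.size(); }

private:
    AllocFn alloc_;
    FreeFn free_;
    std::multimap<std::size_t, Handle> free_blocks_;  // size -> idle block
};

// Usage sketch with malloc/free standing in for a GPU API.
inline bool demo_reuses_block() {
    int backend_allocs = 0;
    AllocationCache<void *> cache(
        [&](std::size_t n) { ++backend_allocs; return std::malloc(n); },
        [](void *p) { std::free(p); });
    void *a = cache.acquire(256);
    assert(a != nullptr);
    cache.release(a, 256);
    void *b = cache.acquire(256);  // served from the cache, no new malloc
    bool reused = (a == b) && backend_allocs == 1;
    cache.release(b, 256);         // freed when `cache` is destroyed
    return reused;
}
```

A backend that wants sub-allocation or size bucketing could wrap or replace this helper rather than use it directly; the callbacks are the only backend-specific part.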
We should implement an allocation cache for the other GPU APIs, similar to the one in the CUDA backend.