iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.
http://iree.dev/
Apache License 2.0

Remove `pooling_allocator` from `iree_hal_buffer_t`. #19159

Open benvanik opened 1 week ago

benvanik commented 1 week ago

The caching allocator currently requires an unsavory poke into `iree_hal_buffer_t` (the `pooling_allocator` field) that lets it point the buffer recycle procedure back to itself. This is the only usage of the field in the codebase, and there are probably better ways to accomplish the notification.

The current hack was added so that buffers allocated from underlying device allocators by the caching allocator can always be routed directly back to it, regardless of the granularity the underlying device allocator returned. Since the underlying allocator may return subspans (or any other kind of non-allocated buffer), and all HAL buffers besides allocated buffers are intended to be disposable, adding release callbacks and other mechanisms would be insufficient. A caching allocator that wrapped the returned device buffers in a new `iree_hal_my_pooled_buffer_t` and returned that would get disconnected on the first subspan of the buffer, because the subspan reaches down to the allocated buffer and bypasses the wrapper.

A solution is to have the `iree_hal_my_pooled_buffer_t` be the allocated buffer and hide the actual device allocation. Without dynamic casts, this would currently break every HAL implementation that needs to recover its internal buffer implementation from the buffers passed in. With a dynamic cast implementation, the pooled buffers could act as allocated buffers while still routing cast queries to the device-provided buffers.
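To make the idea concrete, here is a minimal, self-contained sketch of cast routing. Everything prefixed `sketch_` is hypothetical and invented for illustration; none of it is the IREE HAL API, and a real implementation would also need reference counting and a sturdier type-ID scheme. The pooled buffer poses as the allocated buffer, so subspans resolve to it, and its cast hook forwards any query it cannot answer to the hidden device buffer.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical type tags used for cast queries (not part of IREE).
typedef enum {
  SKETCH_TYPE_DEVICE_BUFFER,
  SKETCH_TYPE_POOLED_BUFFER,
} sketch_type_t;

typedef struct sketch_buffer_t sketch_buffer_t;

typedef struct {
  // Returns the requested implementation, or NULL if unsupported.
  void* (*dyn_cast)(sketch_buffer_t* buffer, sketch_type_t type);
  // Invoked when the buffer should be returned to its owner.
  void (*recycle)(sketch_buffer_t* buffer);
} sketch_buffer_vtable_t;

struct sketch_buffer_t {
  const sketch_buffer_vtable_t* vtable;
  // Points at itself for allocated buffers and at the root for subspans.
  sketch_buffer_t* allocated_buffer;
};

// Stand-in for a device-provided buffer implementation.
typedef struct {
  sketch_buffer_t base;  // Must be first so base pointers can be converted.
  int device_handle;
} sketch_device_buffer_t;

typedef struct {
  int recycled_count;
} sketch_pool_t;

// Stand-in for the pooled wrapper that hides the device allocation.
typedef struct {
  sketch_buffer_t base;
  sketch_buffer_t* device_buffer;
  sketch_pool_t* pool;
} sketch_pooled_buffer_t;

static void* sketch_device_dyn_cast(sketch_buffer_t* buffer,
                                    sketch_type_t type) {
  if (type == SKETCH_TYPE_DEVICE_BUFFER) return buffer;
  return NULL;
}
static void sketch_device_recycle(sketch_buffer_t* buffer) { (void)buffer; }
static const sketch_buffer_vtable_t sketch_device_vtable = {
    sketch_device_dyn_cast, sketch_device_recycle};

static void* sketch_pooled_dyn_cast(sketch_buffer_t* buffer,
                                    sketch_type_t type) {
  sketch_pooled_buffer_t* pooled = (sketch_pooled_buffer_t*)buffer;
  if (type == SKETCH_TYPE_POOLED_BUFFER) return pooled;
  // Route every other query to the wrapped device buffer.
  return pooled->device_buffer->vtable->dyn_cast(pooled->device_buffer, type);
}
static void sketch_pooled_recycle(sketch_buffer_t* buffer) {
  // The pool is notified directly; no back-pointer in the base struct.
  ((sketch_pooled_buffer_t*)buffer)->pool->recycled_count++;
}
static const sketch_buffer_vtable_t sketch_pooled_vtable = {
    sketch_pooled_dyn_cast, sketch_pooled_recycle};

void sketch_device_buffer_init(sketch_device_buffer_t* buffer, int handle) {
  buffer->base.vtable = &sketch_device_vtable;
  buffer->base.allocated_buffer = &buffer->base;
  buffer->device_handle = handle;
}

void sketch_pooled_buffer_init(sketch_pooled_buffer_t* buffer,
                               sketch_buffer_t* device_buffer,
                               sketch_pool_t* pool) {
  buffer->base.vtable = &sketch_pooled_vtable;
  buffer->base.allocated_buffer = &buffer->base;
  buffer->device_buffer = device_buffer;
  buffer->pool = pool;
}

// Subspans are disposable views; in this sketch they only record their root.
void sketch_subspan_init(sketch_buffer_t* subspan, sketch_buffer_t* parent) {
  subspan->vtable = NULL;
  subspan->allocated_buffer = parent->allocated_buffer;
}

// Cast queries always resolve through the allocated (root) buffer.
void* sketch_buffer_dyn_cast(sketch_buffer_t* buffer, sketch_type_t type) {
  assert(buffer != NULL && buffer->allocated_buffer != NULL);
  sketch_buffer_t* root = buffer->allocated_buffer;
  return root->vtable->dyn_cast(root, type);
}

// Recycling also resolves through the root, so it lands on the pool.
void sketch_buffer_recycle(sketch_buffer_t* buffer) {
  sketch_buffer_t* root = buffer->allocated_buffer;
  root->vtable->recycle(root);
}
```

In this sketch, a subspan taken from a pooled buffer still answers `SKETCH_TYPE_DEVICE_BUFFER` queries with the underlying device buffer, and recycling the subspan's root reaches the pool without any pool-specific field in the base buffer struct.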

We've needed dynamic casts for a while and had them briefly until we hit the annoying dynamic linking issues; it is time to add them back.