Currently, each worker allocates its own memory and contributes it to a pool shared by its worker group. There are a few issues with this:
Doesn't work well for non-constant input sizes
A lot of unused memory may be kept around
Can't easily use hardware-pinned memory
One solution is a central memory manager.
Goals:
Enable dynamic batching (#131) by letting the batcher return unused memory
Support both hardware-backed and CPU-based memory
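To make the goals above concrete, here is a minimal sketch of what a central memory manager could look like. It is an illustration only, not the planned implementation: `MemoryManager`, `Buffer`, `register_backend`, `get`, `put`, and `free_count` are hypothetical names, and the "device" backend is a plain `bytearray` standing in for hardware-backed memory.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Buffer:
    """A block of memory handed out by the manager (hypothetical type)."""
    backend: str
    data: bytearray


class MemoryManager:
    """Hypothetical central allocator shared by all workers and batchers."""

    def __init__(self) -> None:
        # Backend name -> function that allocates `size` bytes.
        self._allocators: Dict[str, Callable[[int], bytearray]] = {}
        # Backend name -> returned buffers that are available for reuse.
        self._free: Dict[str, List[Buffer]] = defaultdict(list)

    def register_backend(self, name: str,
                         allocator: Callable[[int], bytearray]) -> None:
        """Register an allocation function for a kind of memory."""
        self._allocators[name] = allocator

    def get(self, backend: str, size: int) -> Buffer:
        """Return a buffer of at least `size` bytes, reusing one if possible."""
        if backend not in self._allocators:
            raise KeyError(f"unknown backend: {backend}")
        free = self._free[backend]
        for i, buf in enumerate(free):
            if len(buf.data) >= size:
                return free.pop(i)
        # Nothing suitable was returned earlier, so allocate on demand.
        return Buffer(backend, self._allocators[backend](size))

    def put(self, buf: Buffer) -> None:
        """Give a buffer back so other callers can reuse it."""
        self._free[buf.backend].append(buf)

    def free_count(self, backend: str) -> int:
        """Number of returned buffers currently waiting for reuse."""
        return len(self._free[backend])


manager = MemoryManager()
manager.register_backend("cpu", bytearray)
# Assumption: a real device backend would wrap a pinned-memory allocator.
manager.register_backend("device", bytearray)

# A batcher reserves room for four requests but only three arrive,
# so it returns the unused buffer instead of holding on to it.
reserved = [manager.get("cpu", 1024) for _ in range(4)]
used, unused = reserved[:3], reserved[3:]
for buf in unused:
    manager.put(buf)
```

Because buffers are requested by size at call time, non-constant input sizes no longer require each worker to pre-allocate for the worst case, and returned buffers are reused rather than kept idle per worker group.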