This PR addresses the performance regressions from #297.
I explored various solutions and landed on using reference-counted buffers to manage resources internally. I also considered exposing a new API to customize memory management, but starting with an internal-only solution seemed lower risk.
Overall, some benchmarks are a bit slower and some are a bit faster. Here is a comparison to the `main` branch: