Open dotnwat opened 12 months ago
Good catch. Almost 10 years ago we had the same bug in the posix_memalign() implementation in the OSv project, and fixed it in https://github.com/cloudius-systems/osv/commit/c8845bb544b37438622bf2e7b8bcd7c7874dda93. Our workaround there did something simple - it just increased the requested size to match the alignment. Maybe we should do the same here, unless you can think of a better way.
> it just increased the requested size to match the alignment.
I did the same thing, but had the luxury of noticing it and controlling the call sites, so adding that strategy to Seastar could work if the wasted space wasn't a concern. I don't know enough about how the allocator works to suggest something more precise.
Yes, I meant the solution in that 10-year-old patch was indeed to increase the requested size inside posix_memalign(), not in all its callers.
I know it looks like it wastes space, but I'm not sure we have a better solution - imagine that you ask for an allocation of size 16 aligned at 32 bytes. If we take a 32-byte-aligned chunk and pretend to have allocated only the first 16 bytes of it, that would be fine, but what will we do with the other 16-byte half? How will we remember we haven't allocated this yet? I guess it's possible to build a linked list of such half-allocated chunks, but I wonder if it's worth the effort (in my experience this use case is very rare - is it common in your application? If it's rare, it must be done correctly but utmost efficiency is less important). Maybe @avikivity will have a better idea.
Although I can imagine approaches that don't waste the space, the waste approach seems perfect here because it's only happening in the unusual case that someone actually asks for alignment > size, which we believe must be quite rare since no-one has complained of it not working in almost a decade.
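For illustration, the "waste" approach discussed above could look roughly like the sketch below. This is not the OSv patch or Seastar's code: `padded_size`, `padded_posix_memalign` and `is_aligned` are hypothetical names, and the system `posix_memalign` stands in for an underlying allocator that only guarantees alignment when size >= alignment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>

// Hypothetical helper: bump the request so it is never smaller than the alignment.
// A 16-byte request at 32-byte alignment becomes a 32-byte request.
inline std::size_t padded_size(std::size_t alignment, std::size_t size) {
    return std::max(size, alignment);
}

// Hypothetical wrapper: the padding happens inside the allocation function,
// so callers do not need to change. The extra bytes are simply wasted.
inline int padded_posix_memalign(void** out, std::size_t alignment, std::size_t size) {
    return ::posix_memalign(out, alignment, padded_size(alignment, size));
}

// True if p is a multiple of alignment (alignment must be non-zero).
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

Requests with size >= alignment pass through unchanged, so the only cost is in the rare alignment > size case.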
System posix_memalign allows allocation sizes that are less than the requested alignment. However, with Seastar's implementation such an allocation succeeds without error, but the resulting allocation may not be aligned. For users that are using posix_memalign to allocate memory for DMA this inconsistency is unlikely to ever be encountered, but as more and more non-Seastar code (e.g. JIT) is allowed to run on the reactor with the Seastar allocator implementation, it becomes more likely that code assuming full compliance will hit it.
Here is a reproducer for the issue. Works in debug builds with system allocator, fails in release builds with Seastar allocator.
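A minimal sketch of the kind of check involved (not the original reproducer; the function name, the 16-byte size, the 32-byte alignment and the number of rounds are all assumptions chosen for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>
#include <vector>

// Hypothetical check: request `rounds` blocks with size < alignment and report
// whether every returned pointer is aligned as requested. Blocks are kept alive
// until the end so the allocator cannot hand back the same address each time.
inline bool small_allocs_are_aligned(std::size_t alignment, std::size_t size, int rounds) {
    std::vector<void*> blocks;
    bool ok = true;
    for (int i = 0; i < rounds; ++i) {
        void* p = nullptr;
        if (::posix_memalign(&p, alignment, size) != 0) {
            ok = false;  // allocation itself failed
            break;
        }
        blocks.push_back(p);
        if (reinterpret_cast<std::uintptr_t>(p) % alignment != 0) {
            ok = false;  // succeeded, but the pointer is misaligned
        }
    }
    for (void* p : blocks) {
        ::free(p);
    }
    return ok;
}
```

Calling something like `small_allocs_are_aligned(32, 16, 64)` should return true with a compliant allocator; the issue described here is that it may not when the Seastar allocator is in use.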