Closed dmik closed 4 years ago
Tests of the alignas
problem lead us to SharedBufferDispatcher::ValidateCreateOptions
in chromium/mojo/core/shared_buffer_dispatcher.cc
. It gets an address of a alignas(8) MojoCreateSharedBufferOptions
variable allocated on the stack in Core::CreateSharedBuffer
(chromium/mojo/core/core.cc
). Sometimes, the address of this variable doesn't end with 0x0 or 0x8 as it should be because of alignas(8)
but is something like 0x43ee034
or 0x43ede0c
(i.e. unaligned). I created a separate GCC ticket for that issue as it might be a GCC problem.
Further tests demonstrated that using alignas(16)
makes it work (i.e. GCC doesn't seem to break alignment in this case) but unfortunately this changes the size of some structs. GCC seems to round the structure size up to the closest aligned value from the sizeof
POV and Mojo (which packs data when transferring it between processes) is very picky about the exact struct sizes by using static assertions (which prevent compilation).
Currently I see no solution to that other than disable this alignment check completely in Mojo (the only alternative is to fix GCC but this might be a long road). I don't see any direct consequences of disabling this align check ATM (like using the variable address in some math or so) but I may be unaware of something, I don't know this code by heart. Anyway, no other choice for now, as already mentioned.
Regarding the first problem, I added some extra logging, caught it again and now I have a proof that DosAllocMem(OBJ_ANY | PAG_READ)
returns an address which is NOT on a 64K boundary (e.g. 0x22409000). I've never seen that before. And to my understanding this contradicts with what docs say:
Note: DosAllocMem and DosAllocSharedMem both allocate a block of memory of the size requested rounded up to the nearest page. On OS/2 Warp, the system allocates a 64K object without attributes on every allocation.
The docs don't explicitly mention that 64K objects start on a 64K boundary but that's kinda assumed... Sigh. I'm really confused here. If I won't find any detail on that, I will just set page alloc granularity to 4K on OS/2 (i.e. equal to the page size).
I've read the 64K object size comment too. However, based on your findings I have to suspect the docs are not entirely correct. It appears that the 64K object size may apply only to tiled memory.
Looking at the kernel code, I can see where the input size is rounded up to 4KB. There is no explicit rounding to 64KB, so it is most likely triggered by one of the allocation flags.
One thing I have learned over the years is that the IBM docs are pretty good, but at the same time they have not been updated on almost 30 years so there is bound to be errata. When behavior does not seem to match the docs, the only option is a testcase or two.
Steven, thanks for the feedback. Note that there are no special flags besides OBJ_ANY (the code does never use OBJ_TILE). So it's always OBJ_ANY | PAG_READ [ | PAG_WRITE etc ] and in a series of previous allocations (around 50) these calls return addresses ending with 0000. Anyway, I will just change the granularity to 4K, given what you say.
And yes, I generally used to not trusting docs but doing some test cases :)
The behavior has nothing specific to do with the OBJ_ flags you are using. For the testcase, I am using PAG_WRITE | PAG_COMMIT | OBJ_ANY.
The testcase indicates that the alignment is 4KB. Since the runtime heaps, typically allocate 64KB chunks. the first DosAllocMem is aligned on a 64KB boundary, at least in the testcase. As long as subsequent allocations are 64KB multiples the blocks are 64KB aligned. After an allocation for less than 62KB+1 occurs, the alignment will be 4KB and will continue this way even when 64KB blocks are allocated.
At some point, I may test how OBJ_TILE affects the alignment just so I more fully understand the behavior.
I got a thought that the behavior differs for high memory and for low memory (and that for low memory it matches what the docs say). So I had to make my own small test case to check that. And yes, this assumption is right.
Without OBJ_ANY, memory objects are allocated from low memory, allocation granularity is 64 KB, the returned addresses are aligned to 64 KB (end with 0000) regardless of the object size and the remaining pages within the 64 KB block are simply not committed (i.e. wasted, unless the application commits them manually). Exactly the same behavior is shown when using OBJ_TILE (i.e. it looks like it's simply ignored).
With OBJ_ANY, memory objects are allocated from high memory, allocation granularity is 4 KB (i.e. equal to page size), the returned addresses are aligned to 4 KB (end with 000).
When allocating shared memory (DosAllocSharedMem
), the picture is just the same with the exception that sequential allocations go down in terms of address values (which is expected given that shared memory starts on top and grows down).
Another interesting observation is that if you call DosQueryMem
on a free (unallocated) low memory address, it returns error 487 (ERROR_INVALID_ADDRESS). However, for free high memory it succeeds and returns 0x4000 in flags (PAG_FREE). Might save someone some hair.
Ok, case closed then. We have to use 4 KB as allocation granularity and expect returned addresses have a 4 KB alignment. Note that with this applied to the source code, I have Chromium run much further, it could even start playing some YouTube video (but crashed due to some missing audio interface assertions, will create a ticket for that later).
Chromium complains about memory alignment in some cases and crashes the application in both debug and release builds (see https://github.com/bitwiseworks/qtwebengine-os2/issues/6#issuecomment-680286782). Here are the relative console messages:
The first assertion is triggered because
base::FreePages
is supplied a memory address which is not on a 64K (memory object) boundary, e.g. something like0x2131d000
. I wonder how this is possible because the page allocator API inbase
usesDosAllocMem
and that should always returns objects aligned on a 64K boundary. Deeper debugging is needed.As for the second one, it's a bit more tricky. The code requires some C++ struct to be allocated (on a C heap I suppose) with its alignment in mind (as returned by GCC's
__alignof__
). I have no idea why it's not the case. Also, more debugging is needed.