Closed dustanddreams closed 2 weeks ago
Sure, done.
The CI reports a MSCV 32bit failure, the first invocation of ocamlc
in the build fails. I would be surprised if this was related to the PR, but who knows.
The msvc32 CI is still failing. @dustanddreams, can you figure out why?
I'm afraid I can't help much here - the logs are unhelpful and I don't have the ability to tinker with MSVC - which means I can't rule out a regression introduced by this PR.
I confirm that I get the build failure with this PR on msvc32 locally too, and that trunk builds fine. This is a regression. Inspecting the failing invocation doesn't reveal anything.
make -C stdlib OCAMLRUN='$(ROOTDIR)/boot/ocamlrun.exe' USE_BOOT_OCAMLC=true all
make[1]: Entering directory '/cygdrive/c/Users/misterda/Tarides/ocaml/trunk/stdlib'
../boot/ocamlrun.exe ../boot/ocamlc -verbose -strict-sequence -absname -w +a-4-9-41-42-44-45-48 -g -warn-error +A -bin-annot -nostdlib -principal -nopervasives -no-alias-deps -w -49 -pp "gawk -f ./expand_module_aliases.awk" -c stdlib.mli
+ gawk -f ./expand_module_aliases.awk "stdlib.mli" > C:\cygwin64\tmp\ocamlppfb5f84
make[1]: *** [Makefile:121: stdlib.cmi] Error 127
make[1]: Leaving directory '/cygdrive/c/Users/misterda/Tarides/ocaml/trunk/stdlib'
make: *** [Makefile:673: coldstart] Error 2
I think I found the reason why it doesn't work with MSVC.
We are allocating roundup(needed space, 16) + sizeof handler
. And then we use roundup(stack base + needed space, 16)
. If the needed space was a multiple of 16, but stack base is not (e.g. when not using mmap
), this will overflow.
I'll fix this ASAP.
My current reviewer state is the following:
In particular, I wonder if we change the behavior compared to the trunk code, and whether this is a good change. (My guess: we do pessimize the behavior slightly, but we are talking about very small amount of wasted space, and most configurations use mmap anyway where there is no change.)
I tried to read the explanations of @damiendoligez in #11779, but honestly this reads like gibberish to me. @dustanddreams, can you confirm that you do understand (2), and comment on the change of behavior between trunk and your PR?
The trunk code always asks for an extra 8 bytes in alloc_for_stack
.
When using mmap
, the granularity of the allocation is MMU pages, so it is extremely unlikely to hit a case where the computation of the 16-byte alignment overflows the actual allocation.
When not using mmap
(i.e. under Windows), it is very likely that the returned address was aligned to 8 bytes (since we are on a 64-bit platform). By asking for an extra 8 bytes in the allocation, we would cover the worst case when rounding up to a 16-byte boundary.
I'm convinced, but this has gone from "simple refactoring" to "why are there so many snakes?", so I think this needs a Changes entry.
There can't be snakes, Saint Patrick drove them out!
This PR intends to fix #11779, by making stack rounding computations correct and explicit.
The first concern of #11779 had been addressed in #12434; this PR cleans up a bit and addresses the second concern.