ocaml / ocaml

The core OCaml system: compilers, runtime system, base libraries
https://ocaml.org
Other
5.19k stars 1.06k forks source link

[runtime] make rounding computations more explicit #13117

Closed dustanddreams closed 2 weeks ago

dustanddreams commented 3 weeks ago

This PR intends to fix #11779, by making stack rounding computations correct and explicit.

The first concern of #11779 had been addressed in #12434; this PR cleans up a bit and addresses the second concern.

dustanddreams commented 3 weeks ago

Sure, done.

gasche commented 3 weeks ago

The CI reports a MSCV 32bit failure, the first invocation of ocamlc in the build fails. I would be surprised if this was related to the PR, but who knows.

gasche commented 3 weeks ago

The msvc32 CI is still failing. @dustanddreams, can you figure out why?

dustanddreams commented 3 weeks ago

I'm afraid I can't help much here - the logs are unhelpful and I don't have the ability to tinker with MSVC - which means I can't rule out a regression introduced by this PR.

MisterDA commented 3 weeks ago

I confirm that I get the build failure with this PR on msvc32 locally too, and that trunk builds fine. This is a regression. Inspecting the failing invocation doesn't reveal anything.

make -C stdlib OCAMLRUN='$(ROOTDIR)/boot/ocamlrun.exe' USE_BOOT_OCAMLC=true all
make[1]: Entering directory '/cygdrive/c/Users/misterda/Tarides/ocaml/trunk/stdlib'
../boot/ocamlrun.exe ../boot/ocamlc -verbose -strict-sequence -absname -w +a-4-9-41-42-44-45-48 -g -warn-error +A -bin-annot -nostdlib -principal  -nopervasives -no-alias-deps -w -49  -pp "gawk -f ./expand_module_aliases.awk" -c stdlib.mli
+ gawk -f ./expand_module_aliases.awk "stdlib.mli" > C:\cygwin64\tmp\ocamlppfb5f84
make[1]: *** [Makefile:121: stdlib.cmi] Error 127
make[1]: Leaving directory '/cygdrive/c/Users/misterda/Tarides/ocaml/trunk/stdlib'
make: *** [Makefile:673: coldstart] Error 2
dustanddreams commented 3 weeks ago

I think I found the reason why it doesn't work with MSVC.

We are allocating roundup(needed space, 16) + sizeof handler. And then we use roundup(stack base + needed space, 16). If the needed space was a multiple of 16, but stack base is not (e.g. when not using mmap), this will overflow.

I'll fix this ASAP.

gasche commented 2 weeks ago

My current reviewer state is the following:

  1. I understand the new code and I believe that it is correct
  2. but I don't understand why the trunk code worked in situations where the previous iteration of this PR fails, so I wonder if I am missing something

In particular, I wonder if we change the behavior compared to the trunk code, and whether this is a good change. (My guess: we do pessimize the behavior slightly, but we are talking about very small amount of wasted space, and most configurations use mmap anyway where there is no change.)

I tried to read the explanations of @damiendoligez in #11779, but honestly this reads like gibberish to me. @dustanddreams, can you confirm that you do understand (2), and comment on the change of behavior between trunk and your PR?

dustanddreams commented 2 weeks ago

The trunk code always asks for an extra 8 bytes in alloc_for_stack.

When using mmap, the granularity of the allocation is MMU pages, so it is extremely unlikely to hit a case where the computation of the 16-byte alignment overflows the actual allocation.

When not using mmap (i.e. under Windows), it is very likely that the returned address was aligned to 8 bytes (since we are on a 64-bit platform). By asking for an extra 8 bytes in the allocation, we would cover the worst case when rounding up to a 16-byte boundary.

gasche commented 2 weeks ago

I'm convinced, but this has gone from "simple refactoring" to "why are there so many snakes?", so I think this needs a Changes entry.

dustanddreams commented 2 weeks ago

There can't be snakes, Saint Patrick drove them out!