Overview
This converts the stack machine in the interpreter from two completely independent stacks (unboxed and boxed) into a single logical stack with unboxed and boxed slots.
The existing unboxed stack was only really used for tags and branches, not for arithmetic. This refactor should also let us make better use of unboxed data.
~Note: This change in itself is not expected to result in a performance improvement, it is a refactor necessary to enable future perf improvements.~ Turns out we got a 1.1x -> 1.25x perf improvement for free!
Implementation notes
Unify stacks into a single structure, removing the need for the Memory-specific typeclass instances.
Any given stack slot is either unboxed or boxed. Most ops know which side to look at. If you don't know, check the boxed side: a BlackHole Closure there means the value is on the unboxed side instead.
Before this change the stacks were separate; now unboxed and boxed arguments can be interleaved, so it took a lot of care to ensure unboxed tags were placed and handled correctly.
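To make the slot layout concrete, here is a minimal Python model of the idea described above. It is an illustrative sketch, not the runtime's actual code: the names `Stack`, `push_unboxed`, `push_boxed`, and `peek_off` are hypothetical, and the real interpreter presumably uses mutable arrays rather than Python lists.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple


class BlackHole:
    """Placeholder closure stored on the boxed side of an unboxed slot."""

    def __repr__(self) -> str:
        return "BlackHole"


BLACK_HOLE = BlackHole()


@dataclass
class Stack:
    # Two parallel arrays that always have the same length:
    # slot i is described by unboxed[i] and boxed[i] together.
    unboxed: List[int] = field(default_factory=list)
    boxed: List[Any] = field(default_factory=list)

    def push_unboxed(self, n: int) -> None:
        # The boxed side gets the sentinel so readers can tell the slot is unboxed.
        self.unboxed.append(n)
        self.boxed.append(BLACK_HOLE)

    def push_boxed(self, closure: Any) -> None:
        # The unboxed side of a boxed slot is unused; 0 is an arbitrary filler.
        self.unboxed.append(0)
        self.boxed.append(closure)

    def peek_off(self, i: int) -> Tuple[str, Any]:
        """Read slot i (0 = top) when the slot kind is not known statically."""
        idx = len(self.boxed) - 1 - i
        closure = self.boxed[idx]
        if closure is BLACK_HOLE:
            return ("unboxed", self.unboxed[idx])
        return ("boxed", closure)


# Interleaved arguments: an unboxed tag, a boxed closure, an unboxed value.
s = Stack()
s.push_unboxed(1)
s.push_boxed("some closure")
s.push_unboxed(42)
print(s.peek_off(0), s.peek_off(1), s.peek_off(2))
```

Ops that already know the slot kind would index the matching array directly and skip the sentinel check; the check is only the fallback for the "don't know" case.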
Interesting/controversial decisions
Nah not really.
Test coverage
I'd like to add some round-trip property tests for our MCode and ANF serialization layers. They would've saved me a few hours of staring; this error was barely caught by tests, and I was really lucky. This PR doesn't need to wait for those, though.
[x] Tested by existing transcripts.
[x] Run the nimbus tests
[x] Run a test build that ensures all stack pokes were pre-bumped
[x] Run a test build that ensures that all stack peeks were definitely from the correct stack value
[x] Run `test` and `test.io.all` in base
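For readers wondering what the two checking builds might look like, here is a rough Python sketch. It rests on assumptions: that "pre-bumped" means a slot is reserved with a bump before it is written with a poke, and that each typed peek should land on a slot of the matching kind. `CheckedStack` and its methods are hypothetical names, not the interpreter's API.

```python
class CheckedStack:
    """Debug-only stack model that asserts the two invariants from the checklist."""

    _FRESH = object()  # marks a slot reserved by bump() but not yet written

    def __init__(self):
        self.slots = []  # each entry is _FRESH or a (kind, value) pair

    def bump(self):
        # Reserve a new top slot; it must be poked before it is read.
        self.slots.append(self._FRESH)

    def poke(self, kind, value):
        # Invariant 1 (assumed meaning): every poke targets a freshly bumped slot.
        assert self.slots and self.slots[-1] is self._FRESH, "poke without a preceding bump"
        assert kind in ("unboxed", "boxed"), "unknown slot kind"
        self.slots[-1] = (kind, value)

    def peek(self, kind, i=0):
        # Invariant 2 (assumed meaning): a typed peek reads a slot of that kind.
        assert 0 <= i < len(self.slots), "peek out of range"
        slot = self.slots[len(self.slots) - 1 - i]
        assert slot is not self._FRESH, "peek of an unwritten slot"
        actual, value = slot
        assert actual == kind, f"expected {kind} slot, found {actual}"
        return value


cs = CheckedStack()
cs.bump()
cs.poke("unboxed", 7)
cs.bump()
cs.poke("boxed", "closure")
print(cs.peek("boxed"), cs.peek("unboxed", 1))
```

In a setup like this the assertions would be compiled in only for the test build, so the normal build pays nothing for them.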
Loose ends
There are several perf improvements we can make once this is in.