To keep improving Plume's runtime performance, we might switch the bytecode assembler to emit register-allocated bytecode. This representation maps more closely onto how the underlying CPU and memory actually work, and it opens up more optimizations because each instruction does more work, so the same program needs fewer instructions.
For instance, take a simple add expression with a stack-based bytecode:
LOAD_GLOBAL 0
LOAD_CONSTANT 1
ADD
This is the bytecode generated for x + 1.
With a register-based bytecode, it would become:
ADD ra #1 -- ra represents the global 0 stored in the ra register
As you can see, the stack-based version needs three instructions to compute the result, whereas the register-based version needs only one. The runtime logic doesn't change much either: instead of pushing and popping operands that are looked up by index, each instruction carries its operands directly, so when we decode an instruction we already know what to expect from its arguments. Decoding is then just a little bit shifting.
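To make the "bit shifting" point concrete, here is a minimal sketch of how a register-based instruction such as ADD ra #1 could be packed into a single word and decoded again. The 32-bit layout, the opcode number, and the names `encode`, `decode`, `run` and `OP_ADD_IMM` are all assumptions made for illustration, not Plume's actual encoding. The original listing also does not say where the result is stored, so this sketch assumes it is written back into the same register.

```python
# Hypothetical 32-bit instruction layout (an assumption, not Plume's real format):
#   bits 31..24  opcode
#   bits 23..16  register index
#   bits 15..0   unsigned immediate value
OP_ADD_IMM = 0x01  # hypothetical opcode number for "ADD reg #imm"


def encode(opcode, reg, imm):
    """Pack an opcode, a register index and an immediate into one 32-bit word."""
    if not (0 <= opcode < 256 and 0 <= reg < 256 and 0 <= imm < 65536):
        raise ValueError("field out of range for the assumed layout")
    return (opcode << 24) | (reg << 16) | imm


def decode(word):
    """Unpack a 32-bit word into (opcode, reg, imm) using shifts and masks."""
    return (word >> 24) & 0xFF, (word >> 16) & 0xFF, word & 0xFFFF


def run(program, registers):
    """Execute every instruction word and return how many were dispatched."""
    dispatched = 0
    for word in program:
        opcode, reg, imm = decode(word)
        if opcode == OP_ADD_IMM:
            # Assumption: the sum is written back into the source register.
            registers[reg] = registers[reg] + imm
        else:
            raise ValueError(f"unknown opcode {opcode:#x}")
        dispatched += 1
    return dispatched


# Register 0 plays the role of `ra`, already holding the value of x.
registers = [41, 0, 0, 0]
program = [encode(OP_ADD_IMM, 0, 1)]  # ADD ra #1
count = run(program, registers)
print(registers[0], count)
```

With x set to 41, the single instruction word leaves 42 in register 0 after one dispatch, where the stack-based listing would go through the dispatch loop three times.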