Closed gcolvin closed 3 years ago
After EIP615 died, here is another chance to improve this archaic-ish VM architecture. I myself wholeheartedly wish this can be accepted.
I work on compilers and virtual machines. Here is my personal evaluation:
I also wish SUB
could take a parameter -- making it a 3 byte instruction. I don't understand people's fear of instructions longer than one byte.
I also wish SUB could take a parameter -- making it a 3 byte instruction.
This is one of the reasons EIP-615 run into issues.
I don't understand people's fear of instructions longer than one byte.
I'm really in favor of having multi-byte instructions, but that requires EVM versioning. An example why doing it without versioning is risky is demonstrated here: https://ethereum-magicians.org/t/eip-663-unlimited-swap-and-dup-instructions/3346/11
@axic That specific problem exists already because we have variable opcodes (PUSH
) from the beginning. Also, if the analysis is smart enough, it should be able to detect that we are jumping into the middle of an instruction.
And what is the best way to avoid that? Make dynamic jumps obsolete -- that was in the EIP615 proposal!
I also think EVM versioning will happen sooner or later, unless we want to forever keep EVM as is.
That specific problem exists already because we have variable opcodes (PUSH) from the beginning.
As you say that opcode is there since the beginning, which means every VM and tool is prepared for it.
Also, the PUSH opcodes specify how much immediate data to push, whereas BEGINSUB needs a variable amount of immediate data because different subroutines can take different numbers of arguments.
Edited for longer names, and to give RETURNSUB
the same semantics as #615 -- restoring SP
to that of the caller. This makes it much easier to use and increases forwards compatibility.
Nope, can't use RETURNSUB semantics without BEGINSUB. Sigh. Upside is you can write functions that take and return a variable number of arguments.
BEGINSUB needs a variable amount of immediate data because different subroutines can take different numbers of arguments.
I think only the 2-byte jump target address as immediate operand is good enough, or we can have an additional "number of arguments pushed on stack" byte in BEGINSUB
.
Having static subroutine jumps is big deal, the good thing is that if we do it now, it will not break the existing convention.
So another opcode for tagging the target address of subroutines. This will deliberately disallow previous internal calling convention (using JUMP
/JUMPI
) using the subroutine.
We know that we definitely don't want this kinds of backward compatibilities, but I feel that it will create a little bit of inconsistency.
@gcolvin can you clarify a few things?
The current wording mentions how SP
is initially set but then never mentions it again. Is SP
still relevant to this proposal?
Am I correct in understanding that when encountering a GOSUB
, the entire data stack is pushed to the return stack and then cleared?
And that upon a RETSUB
, the current data stack is disregarded and the entire return stack is pushed to the data stack and cleared? Or only the part of it that came from the last GOSUB
? If so, how are we keeping track of this?
@lialan Thanks. There is now a SUBDEST opcode to tag valid GOSUB targets. There are no restrictions here on how these instructions are used. E.g. multiple entry points, jumps into and out of subroutines, and other abominations are allowed. It's up to language and tool implementers to establish best practices. To the extent that they can be statically checked some things could be validated before runtime per @shemnon' s https://github.com/shemnon/EIPs/pull/1.
@MrChico Thanks. SP is unused, so I've removed it. It's unused because GOSUB and RETSUB have no effect on the data stack--it's up to the caller. This often happens by default on a stack machine: the caller pushes arguments which are consumed by the callee, leaving any results on the stack.
The PR mentioned two comments above is now #2348
@MrChico Thanks. SP is unused, so I've removed it. It's unused because GOSUB and RETSUB have no effect on the data stack--it's up to the caller. This often happens by default on a stack machine: the caller pushes arguments which are consumed by the callee, leaving any results on the stack.
I'm confused. The issue still refers to SP
:
Execution of EVM bytecode begins with SP set to 0,
and your claim that GOSUB
have no effect on the data stack seems to contradict the semantics section which claims:
[GOSUB] sets PC to the address on top of the data stack and pops the data stack.
If this was a PR instead of an issue it would be easier to review and track changes to this proposal
You are right on all counts @MrChico. I thought I had removed the SP reference, but it was still there--it's gone now. My comment on GOSUB was wrong, it's only RETSUB that leaves SP unchanged--I think the semantics section is right. And a PR is needed, but earning a living is taking priority, and editing an issue is easier than dealing with Git. I appreciate your patience.
@MrChico @shemnon @lialan @axic I've pared this down, cleaned it up, and submitted it as a draft: https://github.com/ethereum/EIPs/pull/2484
@gcolvin
There is some inconsistency between the test case and the spec in EIP-2315. Particularly,
The spec says that BEGINSUB is not supposed to be executed and its execution will cause error. JUMPSUB will land on the next instruction after BEGINSUB.
However, the test case (and the current open-ethereum implementation) assumes BEGINSUB can be executed as a noop. JUMPSUB will land on the BEGINSUB instead of the next.
Note that I believe the spec makes more sense from the security perspective. It prevents unintended control flow behavior in EVM crossing routine boundaries.
Can you please comment this on https://ethereum-magicians.org/t/eip-2315-simple-subroutines-for-the-evm/3941 which is the discussion url?
(This issue here should be closed.)
Reopened as working draft. Will push to EIP repo at stable points. I'm coming to see that more can be achieved by way of safety and ease of static analysis with this proposal than I even realized, and am working it out here.
A good summary of the history of subroutines, from Lovelace and Turing onward
and the subroutine facilities provided by many CPUs
including the following:
eip: 2315 title: Simple Subroutines for the EVM description: Introduces two opcodes to support simple subroutines status: Draft type: Standards Track category: Core author: Greg Colvin (@gcolvin), Greg Colvin greg@colvin.org, Martin Holst Swende (@holiman), Brooklyn Zelenka (@expede) discussions-to: https://ethereum-magicians.org/t/eip-2315-simple-subroutines-for-the-evm/3941 created: 2019-10-17 requires: 3540, 3670, 3779, 4200
Abstract
This proposal introduces two opcodes to support simple subroutines:
RJUMPSUB
andRETURNSUB
.Taken together with other recent propoposals it provides an efficient, static, and safe control-flow facility.
Motivation
The EVM does not provide subroutines as primitives.
Instead, calls can be synthesized by fetching and pushing the return address and subroutine address on the data stack and executing a
JUMP
to the subroutine; returns can be synthesized by getting the return address to the top of the stack and jumping back to it. These conventions cost gas and use stack slots unnecessarily. And they create unnecessary complexity that is borne by the humans and programs writing, reading, and analyzing EVM code.Subroutines
As an alternative, we propose to provide for subroutines with two simple operations:
RJUMPSUB
to call a subroutine andRETURNSUB
to return from it.Taken together with other required propoposals they provide an efficient, static, complete, and safe control-flow facility.
Efficient. Substantial reductions in the complexity and the gas costs of calling and optimizing simple subroutines -- as much as 56% savings in gas in the analysis below. Substantial performance advantages for static jumps are reported elsewhere.
Static. All possible jumps are known at contract creation time.
Complete. Together with EIP-4200 it provides a minimal set of control structures for implementing programs -- jumps, conditional jumps, and subroutines.
Safe. Valid programs will not halt with an exception unless they run out of gas or recursively overflow stack.
Prior Art
Facilities to directly support subroutines are provided by all but one of the real and virtual machines we have programmed. This includes physical machines like the Burroughs 5000, CDC 7600, IBM 360, DEC PDP 11 and VAX, Motorola 68000, Sun SPARC, ARM, and a few generations of Intel x86; as well as virtual machines for Lisp, Pascal, Java, Wasm, and the sole exception -- the EVM. The details and complexity vary, but in whatever form these facilities provide for
Over the years, for the machines we have programmed, native subroutine operations have proven their value for efficient implementation and simple coding of subroutines.
Original Design
The first specification of subroutines for a real machine goes back to Turing's Automatic Computing Engine of 1946:
Turings's machine held data in delay lines made of mercury-filled crystals. We have better hardware now, but the concept is simple and by now familiar. In less archaic language, Turing's full specification is that:
This design still sees use by virtual machines for Lisp, Forth, Java, Wasm and others.
Specification
We propose to follow Turing's original specification, as applied to the EVM architecture. We support calls and returns with a
return stack
of return addresses. The EVMreturn stack
, like the EVMdata stack
, is limited to1024
items. Thisreturn stack
is accessed via two new subroutine instructions.Instructions
RJUMPSUB (0x5e) jmpdest
The cost is low.
RETURNSUB (0x5f)
Notes:
PC
to be executed is beyond the last instruction then the opcode is implicitly aSTOP
, which is not an error.return stack
do not need to be validated, since they are alterable only byRJUMPSUB
andRETURNSUB
.return stack
. But the actual state of thereturn stack
is not observable by EVM code or consensus-critical to the protocol. (For example, a node implementer may codeRJUMPSUB
to unobservably pushPC
on thereturn stack
rather thanPC + 1
, which is allowed so long asRETURNSUB
observably returns control to thePC + 1
location.)return stack
is the functional equivalent of Turing's "delay line".The low cost of
RJUMPSUB
versusJUMP
is justified by needing only to add the immediate two byte destination to thePC
and push the return address on thereturn stack
, all using native arithmetric, versus using the data stack with emulated 256-bit instructions.The verylow cost of
RETURNSUB
is justified by needing only to pop thereturn stack
into thePC
. Benchmarking will be needed to tell if the costs are well-balanced.Safety
We define safety here as avoiding exceptional halting states, as defined in the Yellow Paper. A validator can always detect three of these states at validation time:
A validator can detect stack overflow only for non-recursive programs, so two states will still require tests at runtime:
EIP-3779: Safer Control Flow for the EVM specifies initialization-time validation to detect invalid contracts.
Rationale
Below we explore the advantages of have a subroutines as primitives, design alternatives, and the reasons for our choice of specification.
Efficiency Analysis
We show here how subroutine instructions can be used to reduce the complexity and gas costs of both ordinary and optimized subroutine calls compared to using
JUMP
.Simple Subroutine Call
Consider this example of calling a fairly minimal subroutine.
Subroutine call, using
JUMP
:Total: 50 gas
Subroutine call, using
RJUMPSUB
:Total 22 gas.
Using
RJUMPSUB
versusJUMP
saves 50 - 22 = 28 gas -- a 56% improvement.Tail Call Optimization
Of course in cases like this one we can optimize the tail call, so that the return from
SQUARE
actually returns fromTEST_SQUARE
.Tail call optimization, using
JUMP
:Total: 33 gas
Tail call optimization, using
RJUMP
andRETURNSUB
:Total: 17 gas
Using
RJUMPSUB
versusJUMP
saves 33 - 17 = 16 gas -- a 48% improvement.Call Using Data Stack
We can also consider an alternative call mechanism -- call it
DATASUB
-- that pushes its return address on thedata_stack
:Total 26 gas.
Using
DATASUB
versusJUMP
saves 50 - 26 = 24 gas -- a 48% improvement.By comparison, the
RJUMPSUB
version saves 28 gas -- a 56% improvement._Note: The gas cost for
DATASUB
is 6 here versus 5 forRJUMPSUB
because using the wide-integerdata_stack
is less efficient than using a stack of native integers._Conlusion
We can see that these instructions provide a simpler and more gas-efficient subroutine mechanism than using
JUMP
.Clearly, the benefits of these efficiencies are greater for programs that have been factored into smaller subroutines, but a routine could use 200 more gas than our first example and
RJUMPSUB
would still use better than 10% less gas thanJUMP
._Note: A stack rotation operator to move items on the stack and implicitly shift the intervening items could simplify code using
JUMP
. It would be a potentionally expensive operation with a dynamic gas cost._Design Alternatives
There are at least two designs for a subroutine facility.
Turing's design keeps return addresses on a dedicated stack.
As specified above, the instruction to call a subroutine will
And the instruction to return from a subroutine will
We have chosen Turing's design, as have Forth, Java, Wasm, and others. On a stack machine almost all computation is done on the stack, and on these machines the return information is not conveniently
For these machines, an advantage of keeping a separate return stack is that it leaves to the implementation the representation of the data stack, which can remain opaque to the user. All that matters is that calls and returns work.
An alternative design is to keep return addresses on the
data stack
.The instruction to call a subroutine will:
data stack
.The instruction to return from a subroutine will:
data stack
.This design became popular on silicon in the 1970's and remains so. Examples include the PDP 11, VAX, M68000, SPARC, and x86. These are all register machines, not stack machines. The registers are used for computation, and the stack is used by subroutine calls for return addresses and call frames.
For all of these machines instruction addresses fit efficiently into machine words on the
data stack
.We give an example of this alternative design above.
We prefer Turing's design for a few reasons.
data stack
slot for the return address,Keeping code simple is good. And keeping control flow opaque has clear safety advantages -- the state of the VM -- the stack, frame, and instruction pointers -- is not made visible or mutable as data.
Finally, given that the EVM is a stack machine with very wide words, the performance advantages and the decades of successful use of Turing's design by similar machines weighed heavily in our decision.
Backwards Compatibility
These changes affect the semantics of existing EVM code: bytes that would have been interpreted as valid jump destinations may now be interpreted as immediate data. Since this proposal depends on the Ethereum Object Format to signal the change this is not a practical issue.
Test Cases
Simple routine
This should jump into a subroutine, back out and stop.
Bytecode:
0x60045e005b5d
(PUSH1 0x04, JUMPSUB, STOP, JUMPDEST, RETURNSUB
)Output: 0x Consumed gas:
10
Two levels of subroutines
This should execute fine, going into one two depths of subroutines
Bytecode:
0x 00000000000000c5e005b60115e5d5b5d
(PUSH9 0x00000000000000000c, JUMPSUB, STOP, JUMPDEST, PUSH1 0x11, JUMPSUB, RETURNSUB, JUMPDEST, RETURNSUB
)Consumed gas:
20
Failure 1: invalid jump
This should fail, since the given location is outside of the code-range. The code is the same as previous example, except that the pushed location is
0x01000000000000000c
instead of0x0c
.Bytecode: (
PUSH9 0x01000000000000000c, JUMPSUB,
0x6801000000000000000c5e005b60115e5d5b5d, STOP, JUMPDEST, PUSH1 0x11, JUMPSUB, RETURNSUB, JUMPDEST, RETURNSUB
)Failure 2: shallow
return stack
This should fail at first opcode, due to shallow
return_stack
Bytecode:
0x5d5858
(RETURNSUB, PC, PC
)Subroutine at end of code
In this example. the JUMPSUB is on the last byte of code. When the subroutine returns, it should hit the 'virtual stop' after the bytecode, and not exit with error
Bytecode:
0x6005565b5d5b60035e
(PUSH1 0x05, JUMP, JUMPDEST, RETURNSUB, JUMPDEST, PUSH1 0x03, JUMPSUB
)Consumed gas:
30
Security Considerations
These changes introduce new flow control instructions. They do not introduce any new security considerations. In concert with EIP-3779: Safer Control Flow for the EVM they will increase security by providing for validated control flow.
References
A.M. Turing, "Proposals for the development in the Mathematics Division of an Automatic Computing Engine (ACE)." Report E882, Executive Committee, NPL February 1946. Available: http://www.alanturing.net/turing_archive/archive/p/p01/P01-001.htm
B.E. Carpenter, R.W. Doran, "The other Turing machine." The Computer Journal, Volume 20, Issue 3, January 1977. Available: https://doi.org/10.1093/comjnl/20.3.269
Copyright
Copyright and related rights waived via CC0.