EIP 101 (Serenity): Currency and Crypto Abstraction

vbuterin commented 8 years ago

Title

  EIP: 101
  Title: Serenity Currency and Crypto Abstraction
  Author: Vitalik Buterin <v@buterin.com>
  Status: Active
  Type: Serenity feature
  Created: 2015-11-15

Specification

Accounts now have only two fields in their RLP encoding: code and storage.
Ether is no longer stored in account objects directly; instead, at address 0, we premine a contract which contains all ether holdings. The eth.getBalance command in web3 is remapped appropriately.
msg.value no longer exists as an opcode, and neither does tx.gasprice.
A transaction now only has four fields: to, startgas, data and code.
Aside from an RLP validity check, and checking that the to field is twenty bytes long, the startgas is an integer, and code is either empty or hashes to the to address, there are no other validity constraints; anything goes. However, the block gas limit remains, so miners are disincentivized from including junk.
Gas is charged for bytes in code at the same rate as data.
When a transaction is sent, if the receiving account does not yet exist, the account is created, and its code is set to the code provided in the transaction; otherwise the code is ignored.
A tx.gas opcode is added alongside the existing msg.gas at index 0x5c; this new opcode allows the transaction to access the original amount of gas allotted for the transaction.
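
The validity rules and the account-creation rule above can be sketched in Python. This is only an illustrative model, not a client implementation: the `Transaction` class, `address_from_code`, and `apply_code_rule` are hypothetical names, `hashlib.sha3_256` is a stand-in that is not Ethereum's Keccak-256, taking the last 20 bytes of the hash is an assumption (the spec only says the code "hashes to the to address"), and rejecting a negative startgas is likewise an assumption.

```python
import hashlib
from dataclasses import dataclass

ADDRESS_LENGTH = 20  # the spec requires a twenty-byte `to` field


def address_from_code(code: bytes) -> bytes:
    """Hypothetical derivation: last 20 bytes of a hash of the code.

    hashlib.sha3_256 is a stand-in; it is NOT Ethereum's Keccak-256.
    """
    return hashlib.sha3_256(code).digest()[-ADDRESS_LENGTH:]


@dataclass
class Transaction:
    to: bytes
    startgas: int
    data: bytes
    code: bytes


def is_valid(tx: Transaction) -> bool:
    """Apply only the validity constraints listed in the specification."""
    if not isinstance(tx.to, bytes) or len(tx.to) != ADDRESS_LENGTH:
        return False
    if isinstance(tx.startgas, bool) or not isinstance(tx.startgas, int):
        return False
    if tx.startgas < 0:  # assumption: a negative gas allowance is meaningless
        return False
    # code must be empty or hash to the `to` address
    return tx.code == b"" or address_from_code(tx.code) == tx.to


def apply_code_rule(accounts: dict, tx: Transaction) -> None:
    """Create the receiving account if it is absent; otherwise ignore tx.code."""
    if tx.to not in accounts:
        accounts[tx.to] = {"code": tx.code, "storage": {}}
```

Note that nothing here checks signatures, nonces, or balances; under this proposal those checks live in account code rather than in the protocol.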

Note that ECRECOVER, sequence number/nonce incrementing and ether are now nowhere in the bottom-level spec (NOTE: ether is going to continue to have a privileged role in Casper PoS). To replicate existing functionality under the new model, we do the following.

Simple user accounts can have the following default standardized code:

# We assume that data takes the following schema:
# bytes 0-31: v (ECDSA sig)
# bytes 32-63: r (ECDSA sig)
# bytes 64-95: s (ECDSA sig)
# bytes 96-127: sequence number (formerly called "nonce")
# bytes 128-159: gasprice
# bytes 172-191: to
# bytes 192+: data

# Get the hash for transaction signing
~mstore(0, msg.gas)
~calldatacopy(32, 96, ~calldatasize() - 96)
# Hash the gas word plus the copied calldata, i.e. memory bytes 0 to calldatasize - 64
h = sha3(0, ~calldatasize() - 64)
# Call ECRECOVER contract to get the sender
~call(5000, 3, [h, ~calldataload(0), ~calldataload(32), ~calldataload(64)], 128, ref(addr), 32)
# Check sender correctness
assert addr == 0x82a978b3f5962a5b0957d9ee9eef472ee55b42f1
# Check sequence number correctness
assert ~calldataload(96) == self.storage[-1]
# Increment sequence number
self.storage[-1] += 1
# Make the sub-call and discard output
~call(msg.gas - 50000, ~calldataload(160), 192, ~calldatasize() - 192, 0, 0)
# Pay for gas
~call(40000, 0, [SEND, block.coinbase, ~calldataload(128) * (tx.gas - msg.gas + 50000)], 96, 0, 0)

This essentially implements signature and nonce checking, and if both checks pass then it uses all remaining gas minus 50000 to send the actual desired call, and then finally pays for gas.
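
To make the calldata layout and the gas payment arithmetic concrete, here is a small Python sketch. It only mirrors the byte offsets in the schema comment and the expression in the final `~call`; the function names `decode_account_calldata` and `gas_payment` are hypothetical, and no signature checking is modelled.

```python
WORD = 32
GAS_RESERVE = 50000  # gas held back for the payment step in the code above


def decode_account_calldata(calldata: bytes) -> dict:
    """Split calldata according to the byte layout in the schema comment."""
    if len(calldata) < 192:
        raise ValueError("calldata shorter than the 192-byte header")

    def word(i: int) -> int:
        # Read the i-th 32-byte word as a big-endian integer.
        return int.from_bytes(calldata[i * WORD:(i + 1) * WORD], "big")

    return {
        "v": word(0),
        "r": word(1),
        "s": word(2),
        "sequence_number": word(3),
        "gasprice": word(4),
        "to": calldata[172:192],  # 20-byte address, right-aligned in word 5
        "data": calldata[192:],
    }


def gas_payment(gasprice: int, tx_gas: int, msg_gas: int) -> int:
    """Fee sent to block.coinbase: gasprice * (tx.gas - msg.gas + 50000)."""
    return gasprice * (tx_gas - msg_gas + GAS_RESERVE)
```

For example, with a gasprice of 3, tx.gas of 200000 and 120000 gas remaining when the payment runs, the fee is 3 * (200000 - 120000 + 50000) = 390000.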

Miners can follow the following algorithm upon receiving transactions:

Run the code for a maximum of 50000 gas, stopping if they see an operation or call that threatens to go over this limit
Upon seeing that operation, make sure that it leaves at least 50000 gas to spare (either by checking that the static gas consumption is small enough or by checking that it is a call with msg.gas - 50000 as its gas limit parameter)
Pattern-match to make sure that gas payment code at the end is exactly the same as in the code above.
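
One way to read the three steps above is the following Python sketch. It is a heavily simplified model under stated assumptions: execution is represented as a list of `(kind, arg)` steps where an integer `arg` is a known upper bound on gas and a string `arg` is a symbolic gas-limit expression, and the payment check is a plain string comparison. The names `miner_accepts`, `VERIFY_LIMIT` and `PAYMENT_TEMPLATE` are hypothetical; real miner software would have to analyse EVM bytecode.

```python
VERIFY_LIMIT = 50000  # gas a miner is willing to spend before being sure of payment
RESERVE = 50000       # gas that must remain available for the payment code
RESERVED_CALL = "msg.gas - 50000"
PAYMENT_TEMPLATE = (
    "~call(40000, 0, [SEND, block.coinbase, "
    "~calldataload(128) * (tx.gas - msg.gas + 50000)], 96, 0, 0)"
)


def miner_accepts(ops, startgas, payment_code):
    """Decide whether a transaction looks worth including.

    ops: list of ("static", gas_cost), ("call", gas_limit_int) or
         ("call", gas_limit_expression_string).
    """
    used = 0
    for kind, arg in ops:
        bounded = isinstance(arg, int)
        if bounded and used + arg <= VERIFY_LIMIT:
            used += arg  # cheap step: keep executing
            continue
        # First operation that threatens to exceed the verification budget.
        if bounded:
            leaves_reserve = startgas - used - arg >= RESERVE
        else:
            leaves_reserve = kind == "call" and arg == RESERVED_CALL
        # The gas payment code must match the standard template exactly.
        return leaves_reserve and payment_code == PAYMENT_TEMPLATE
    # No expensive operation was reached: only the payment check remains.
    return payment_code == PAYMENT_TEMPLATE
```

With this model, a trace such as signature check, sequence check, then a sub-call limited to `msg.gas - 50000` is accepted, while the same trace ending in an unrestricted call is rejected before the miner spends more than 50000 gas on it.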

This process ensures that miners waste at most 50000 gas before knowing whether or not it will be worth their while to include the transaction, and is also highly general so users can experiment with new cryptography (eg. ed25519, Lamport), ring signatures, quasi-native multisig, etc. Theoretically, one can even create an account for which the valid signature type is a valid Merkle branch of a receipt, creating a quasi-native alarm clock.

If someone wants to send a transaction with nonzero value, instead of the current msg.sender approach, we compile into a three step process:

In the outer scope just before calling, call the ether contract to create a cheque for the desired amount
In the inner scope, if a contract uses the msg.value opcode anywhere in the function that is being called, then we have the contract cash out the cheque at the start of the function call and store the amount cashed out in a standardized address in memory
In the outer scope just after calling, send a message to the ether contract to disable the cheque if it has not yet been cashed
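
The three-step cheque flow can be modelled roughly as follows. This is a toy sketch, not the interface of the actual ether contract: the class `EtherContract`, its method names, and the choice to escrow the amount when the cheque is created are all assumptions made for illustration.

```python
import itertools


class EtherContract:
    """Toy model of the ether contract premined at address 0."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.cheques = {}  # cheque id -> (payer, payee, amount)
        self._ids = itertools.count(1)

    def create_cheque(self, payer, payee, amount):
        if amount < 0 or self.balances.get(payer, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[payer] -= amount  # assumption: funds are escrowed
        cheque_id = next(self._ids)
        self.cheques[cheque_id] = (payer, payee, amount)
        return cheque_id

    def cash_cheque(self, cheque_id, caller):
        payer, payee, amount = self.cheques[cheque_id]
        if caller != payee:
            raise PermissionError("only the payee can cash the cheque")
        del self.cheques[cheque_id]
        self.balances[payee] = self.balances.get(payee, 0) + amount
        return amount  # plays the role of the old msg.value

    def disable_cheque(self, cheque_id, caller):
        if cheque_id not in self.cheques:
            return  # already cashed: nothing to do
        payer, _, amount = self.cheques[cheque_id]
        if caller != payer:
            raise PermissionError("only the payer can disable the cheque")
        del self.cheques[cheque_id]
        self.balances[payer] += amount  # refund the escrowed amount


def call_with_value(ether, sender, target, amount, callee_uses_value):
    cheque_id = ether.create_cheque(sender, target, amount)  # step 1, outer scope
    value_seen = 0
    if callee_uses_value:  # step 2, inner scope
        value_seen = ether.cash_cheque(cheque_id, target)
    ether.disable_cheque(cheque_id, sender)  # step 3, outer scope
    return value_seen
```

If the callee never reads msg.value, the cheque is disabled in step 3 and the sender's balance is unchanged.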

Rationale

This allows for a large increase in generality, particularly in a few areas:

Cryptographic algorithms used to secure accounts (we could reasonably say that Ethereum is quantum-safe, as one is perfectly free to secure one's account with Lamport signatures). The nonce-incrementing approach is now also open to revision on the part of account holders, allowing for experimentation in k-parallelizable nonce techniques, UTXO schemes, etc.
Moving ether up a level of abstraction, with the particular benefit of allowing ether and sub-tokens to be treated similarly by contracts
Reducing the level of indirection required for custom-policy accounts such as multisigs
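
As an illustration of the first point, a Lamport one-time signature needs nothing beyond a hash function, which is why account code could verify it without any protocol support. The following is a minimal textbook sketch in Python using SHA-256; it is not taken from the EIP, the function names are hypothetical, and each key pair must be used to sign only one message.

```python
import hashlib
import secrets

BITS = 256  # one secret pair per bit of the message digest


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def keygen():
    """Private key: 256 pairs of random values; public key: their hashes."""
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(BITS)]
    pk = [(_h(a), _h(b)) for a, b in sk]
    return sk, pk


def _bits(message: bytes):
    digest = int.from_bytes(_h(message), "big")
    return [(digest >> i) & 1 for i in range(BITS)]


def sign(sk, message: bytes):
    """Reveal one secret from each pair, chosen by the digest bit."""
    return [sk[i][bit] for i, bit in enumerate(_bits(message))]


def verify(pk, message: bytes, signature) -> bool:
    """Check that each revealed secret hashes to the matching public value."""
    return len(signature) == BITS and all(
        _h(signature[i]) == pk[i][bit] for i, bit in enumerate(_bits(message))
    )
```

An account secured this way would also need to rotate to a fresh public key after every transaction, since signing reveals half of the private key.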

It also substantially simplifies and purifies the underlying Ethereum protocol, reducing the minimal consensus implementation complexity.

Implementation

Coming soon.

alexvandesande commented 8 years ago

What happens to contracts that use msg.value? Do they get automatically translated into the new abstraction?

vbuterin commented 8 years ago

Yep, every feature that gets removed should be auto-translateable.

However, note that this does require some care on the part of developers: particularly, anyone developing ethereum contracts now should use static jumps ONLY, not dynamic jumps (eg. PUSH <val> JUMP and PUSH <val> JUMPI are okay, PUSH 32 MLOAD JUMP is not).

Smithgift commented 8 years ago

I take it this could allow for "calling collect" with sophisticated enough miners? (i.e. you just ping some random contract and it will pay for its own execution at no cost to you.) That would be awesome, considering the amount of shenanigans it takes to do the equivalent currently. See also: paying gas/mana with non-ether currencies.

Also, +1 for putting ether and subcoins on the same footing. I'm working on a simple contract (mostly for fun) to bridge that gap.

romanman commented 8 years ago

what is the target time to include it?

vbuterin commented 8 years ago

I take it this could allow for "calling collect" with sophisticated enough miners? (i.e. you just ping some random contract and it will pay for its own execution at no cost to you.)

Exactly. The goal with the above recommended miner software implementation is that if the miner sees a proof that they will get paid within 50000 steps, then they just go ahead and do it, so you should not even need to pre-arrange much of anything.

vbuterin commented 8 years ago

what is the target time to include it?

Serenity, ie. same time as Casper.

gavofyork commented 8 years ago

i'm in favour of bringing it forward to homestead-era, in preparation for serenity.

PeterBorah commented 8 years ago

Exactly. The goal with the above recommended miner software implementation is that if the miner sees a proof that they will get paid within 50000 steps, then they just go ahead and do it, so you should not even need to pre-arrange much of anything.

This is a huge benefit that makes everything worth it, in my opinion.

gavofyork commented 8 years ago

indeed.

Smithgift commented 8 years ago

I have a concern about contract creation under this model. Currently, two contracts may have identical code but extremely different data, but in this case they would have to be the same contract. Think of a modern contract with the "owner" modifier, or the standard metacoin that gives the creator a zillion gizmos. Once I create an owned contract with a given code, no one else can make an identical contract with them as the owner. I don't think hardcoding the owner's signature is a good alternative, as then how do you change owners?

wanderer commented 8 years ago

since we are down to code and storage in an account we could just put code in storage. For example store code at location 0. Then we would have a single merkle tree. To load code we would load program_address.concat(0) from the tree. And to load from storage index "test" we load program_address +"test" and so on.

vbuterin commented 8 years ago

@wanderer that is a good idea in principle, but it depends on the ability to store a single code chunk of arbitrary size in storage, which would be a separate EIP (that I would support as a serenity change).

wanderer commented 8 years ago

@vbuterin but we can store arbitrary sizes in the merkle tree. I'm saying that we just need one merkle tree and not a separate root for the storage. Now from within the EVM you don't have access to more than one word, which is not that nice. But the execution environment has to load the code and give it to the EVM as it stands now anyway, so being able to access the code from within the EVM is not a big concern yet.

vbuterin commented 8 years ago

Right, but it seems ugly to store code sequentially. Also, there are space efficiency reasons to have code be in one big chunk; that was the original reason to do it that way as I recall.

wanderer commented 8 years ago

i'm going to go back on the idea of storing code at zero. We don't need that. Just store the codehash at the address. Then for storage just append the storage key to the address.

<address> = codeHash
<address> + 'test' = Storage key 'test'
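
A rough Python model of that key scheme, with a plain dict standing in for the single merkle tree. The class and method names are hypothetical, and the fixed 20-byte address length and the ban on empty storage keys are assumptions added so that a storage key can never collide with the code hash slot.

```python
ADDRESS_LENGTH = 20  # assumption: fixed-length addresses keep keys unambiguous


class FlatState:
    """A dict standing in for the single merkle tree."""

    def __init__(self):
        self.tree = {}

    @staticmethod
    def _check(address: bytes) -> None:
        if len(address) != ADDRESS_LENGTH:
            raise ValueError("address must be 20 bytes")

    def set_code_hash(self, address: bytes, code_hash: bytes) -> None:
        self._check(address)
        self.tree[address] = code_hash  # <address> = codeHash

    def store(self, address: bytes, key: bytes, value: bytes) -> None:
        self._check(address)
        if not key:
            raise ValueError("empty key would collide with the code hash slot")
        self.tree[address + key] = value  # <address> + key = storage slot

    def load(self, address: bytes, key: bytes) -> bytes:
        self._check(address)
        return self.tree.get(address + key, b"")
```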

chriseth commented 8 years ago

Code should not be stored in storage, it has to be immutable.

wanderer commented 8 years ago

It doesn't have to be immutable, but there are not many use cases for mutable code yet. Interpreters that JIT often need mutable code. But we don't have interpreters running on ethereum yet :P. And it's easier to just store the code directly at the address.

chriseth commented 8 years ago

@wanderer I hope that correctness and verifiability have a higher priority than speed here. Interpreters that compile just in time also do not need mutable code. It is fine for them to call newly created code, and that works perfectly with CREATE and CALLCODE.

vbuterin commented 8 years ago

My instinct at this point is to retain the "immutable code + mutable storage" dichotomy that we currently have.

aakilfernandes commented 8 years ago

:+1: putting ether on equal footing as other tokens

aakilfernandes commented 8 years ago

I take it this could allow for "calling collect" with sophisticated enough miners?

I don't understand this. Doesn't block.coinbase.send(x) already provide everything we need for contracts that pay their own gas?

Smithgift commented 8 years ago

@aakilfernandes

Good point. But, as Vitalik pointed out, this would allow such a system to work easier. (Just set up your contract with the standard payment code and you're good.)

Concern: What if a contract appears to be able to pay for its own gas, but at the last moment shoves the ether it has to another contract? The miner can immediately refuse to continue the transaction, but that doesn't refund the miner or cost the contract.

I think a reasonable solution is for the miner to ensure that the actual desired, and gas limited, call is right before the payment code, no matter what.

coder5876 commented 8 years ago

Note that ECRECOVER, sequence number/nonce incrementing and ether are now nowhere in the bottom-level spec

So does this mean that ECRECOVER (for different curves) would need to be implemented in Solidity in a special library contract? Seems like that would be very expensive to call, no?

vbuterin commented 8 years ago

Ah, sorry. It would exist as a precompile at address 3.

BTW for everyone's curiosity, ECRECOVER has been implemented in Serpent already. It costs ~700k gas.

subtly commented 8 years ago

What happens to gasprice?

coder5876 commented 8 years ago

@vbuterin ah, so ECRECOVER for secp256k1 would be at address 3. Any plans on precompiling it for other curves (secp256r1, NIST P256, ed25519 etc)?

coder5876 commented 8 years ago

Hehe, NIST P-256 and secp256r1 are the same curve, doh! :)

vbuterin commented 8 years ago

I think that a precompile for ed25519 is reasonable; all the altcoins seem to be converging on it as an optimal curve so we should consider it. I implemented it in python here https://github.com/vbuterin/ed25519 but I haven't made any effort in making sure that it's standards-compliant yet, though at least in python it seems like its speed advantages over secp256k1 exist but are quite a bit smaller than advertised.

alexvandesande commented 8 years ago

Small observation: while I think this is a good idea, I'd recommend postponing it as long as we can, to allow token standards discussions to have real world usage and maturity.

coder5876 commented 8 years ago

@vbuterin Yeah Curve25519/Ed25519 are getting more and more popular. You should check out the NaCl library which is the canonical one for working with Curve25519 and Ed25519. I believe libnacl provides python bindings for this library.

The secp256r1 curve is interesting because it is a NIST standard, so signatures with this curve are supported on many off-the-shelf smartcards/USB keys like Yubikey. Also in iOS9 there is support for generating private keys and computing elliptic curve signatures using secp256r1 in the secure element of the iPhone, providing a very secure environment for mobile wallets.

subtly commented 8 years ago

@christianlundkvist I think NIST curves aren't very popular due to the limited evidence that their parameters are safe. See https://eprint.iacr.org/2014/571.pdf

coder5876 commented 8 years ago

@subtly I've never been able to figure out if there is some merit to the theory that the secp256r1 curve might be backdoored. It seems clear that Dual_EC_DRBG was indeed backdoored, but this RNG was immediately seen as suspicious and most people were reluctant to use it from the start. There are many inconclusive discussions such as this one

https://bitcointalk.org/index.php?topic=289795.200

which is mostly concerned with secp256k1. I guess that if you have a choice of other curves and there is a risk that it might be backdoored, you want to pick the other curve.

void4 commented 8 years ago

To extend on the code=data=state argument of @wanderer: Wouldn't it be possible to make the EVM a tree-addressed system instead of a pure stack machine? Is anyone familiar with Urbit's Nock? I admit, it is a bit esoteric, but it would make certain processes easier (e.g. formal verification). It would be an extreme modification of the original spec, but I figured these changes are easier to do now than later.

wanderer commented 8 years ago

@void4 yeah I have looked into Nock. Feel free to message me on gitter if you are interested in VMs

AFDudley commented 8 years ago

This is a great idea, sooner rather than later, please!

janx commented 8 years ago

I have a concern about contract creation under this model. Currently, two contracts may have identical code but extremely different data, but in this case they would have to be the same contract. Think of a modern contract with the "owner" modifier, or the standard metacoin that gives the creator a zillion gizmos. Once I create an owned contract with a given code, no one else can make an identical contract with them as the owner. I don't think hardcoding the owner's signature is a good alternative, as then how do you change owners?

I got the same concern as @Smithgift , any comments on this? Does it mean I have to 'tweak' the contract code so it has a different hash before creating a contract with it?

Smithgift commented 8 years ago

@janx: In the latest iteration of this idea (see here on the main Ethereum blog), a contract address is the hash of code and the sender's address. There's still an issue if you want to have multiple contracts of the same code with different constructor arguments, but that's a smaller issue.

Smithgift commented 8 years ago

I'm concerned about EVM errors in this system. Suppose an attacker creates a valid transaction which, several function calls down the line, makes an invalid jump and so undoes the whole transaction. The miner has spent resources to compute the transaction, but since the transaction never happened, he doesn't get paid.

One "fix" would be to put a true try-catch mechanism in the EVM, and have the outermost contract catch all from inside, so it always pays. But the additional complexity of partial transaction reversion sounds unpleasant, to say the least.

janx commented 8 years ago

@Smithgift thanks, I missed that. Hash with sender address is good enough to me, since I can always include my own 'nonce' in contract to generate different address.
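
A quick Python sketch of why this works. It is only an assumption-laden illustration: `hashlib.sha3_256` stands in for Ethereum's Keccak-256, and the concatenation order, the 20-byte truncation, and appending the 'nonce' as trailing bytes of the code are all choices made here, not something the blog post or this EIP specifies.

```python
import hashlib


def contract_address(sender: bytes, code: bytes) -> bytes:
    """Hypothetical derivation: hash of sender address and code, last 20 bytes.

    hashlib.sha3_256 is a stand-in; it is NOT Ethereum's Keccak-256.
    """
    return hashlib.sha3_256(sender + code).digest()[-20:]


def with_salt(code: bytes, salt: int) -> bytes:
    """Append a per-deployment 'nonce' to the code so that its hash changes."""
    return code + salt.to_bytes(32, "big")
```

Two senders deploying the same code get different addresses, and one sender can deploy the same logic several times by varying the salt.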

chriseth commented 8 years ago

@Smithgift the try-catch mechanism is already in place for the EVM. It does not have too much of an overhead because you can just switch back to a previously existing state root hash. Note that errors during execution do not revert the whole transaction but only the current call (Solidity has a mechanism for automatically causing an error in the outer stack frame in this case, but that is just a feature). You have to take care to clear deleted state trie nodes only at the end of the transaction and not while it is being executed. The code above:

# Make the sub-call and discard output
~call(msg.gas - 50000, ~calldataload(160), 192, ~calldatasize() - 192, 0, 0)

calls the actual code, but reserves 50000 gas for paying the miner. If the call runs out of gas, it returns (and puts an error code on the stack, which is ignored in this example) and we still have 50000 gas left to pay the miner.
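
The same behaviour can be modelled in a few lines of Python. This is a toy sketch rather than EVM semantics: state is a plain dict, reverting is done by restoring a deep copy instead of switching state roots, and `sub_call`, `run_transaction` and the `"user"`/`"miner"` balance keys are hypothetical names.

```python
import copy

GAS_RESERVE = 50000  # gas kept back so the miner can always be paid


class OutOfGas(Exception):
    pass


def sub_call(state: dict, gas_limit: int, body):
    """Run body(state, gas_limit); on failure revert only this call's changes.

    Returns (success_flag, gas_used), like the value CALL leaves on the stack.
    """
    snapshot = copy.deepcopy(state)
    try:
        gas_used = body(state, gas_limit)
        if gas_used > gas_limit:
            raise OutOfGas()
        return 1, gas_used
    except OutOfGas:
        state.clear()
        state.update(snapshot)  # roll back to the state before the call
        return 0, gas_limit     # a failed call consumes all the gas it was given


def run_transaction(state: dict, tx_gas: int, gasprice: int, body) -> int:
    msg_gas = tx_gas
    ok, used = sub_call(state, msg_gas - GAS_RESERVE, body)
    msg_gas -= used
    # The outer frame survives the failure and still pays the miner.
    fee = gasprice * (tx_gas - msg_gas + GAS_RESERVE)
    balances = state["balances"]
    balances["user"] -= fee
    balances["miner"] = balances.get("miner", 0) + fee
    return ok
```

Even when the inner call fails, its storage writes are discarded but the fee transfer in the outer frame still happens.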

Smithgift commented 8 years ago

@chriseth: Thanks. Learn something new every day.

EtherTyper commented 8 years ago

[BitNoCoin Proposal] Beat me to it, @vbuterin! Thanks for all of your work! I really appreciate all of your work on Ethereum, and my premine purchase is definitely worth it for your team's projects as well as the Ether!

coder5876 commented 8 years ago

Not sure if this belongs here, but is there interest in adding opcodes/precompiles for basic elliptic curve operations (EC addition, scalar multiplication etc)? I think you could do some fun on-chain crypto schemes like quasi-homomorphic encryption etc using this. I spoke to Denis Lukianov at DevCon and he mentioned that this may be on the roadmap?

wanderer commented 8 years ago

@christianlundkvist i'm more interested in making the VM fast enough that we don't need precompiles

chfast commented 8 years ago

+1 for static jumps only!

chfast commented 8 years ago

Is it possible to forbid dynamic jumps in the next hardfork?

Arachnid commented 8 years ago

What's the inspiration for the assembly pseudocode in the EIP? I've been fiddling with building a disassembler that generates easy to read code, and that seems like a good target!

Smithgift commented 8 years ago

@Arachnid: I believe that's actually Serpent code.

Arachnid commented 8 years ago

@Smithgift It doesn't look much like Serpent: https://github.com/ethereum/wiki/wiki/Serpent

Smithgift commented 8 years ago

@Arachnid: Check out here.
