ethereum / EIPs

The Ethereum Improvement Proposal repository
https://eips.ethereum.org/
Creative Commons Zero v1.0 Universal

BLAKE2b `F` Compression Function Precompile #152

Closed tjade273 closed 2 years ago

tjade273 commented 8 years ago

Title

Title: Add RFC 7693 compression function `F` contract
Author: Tjaden Hess <tah83@cornell.edu>
Status: Draft
Type: Standard Track
Layer: Consensus (hard-fork)
Created: 2016-10-04

Abstract

This EIP introduces a new precompiled contract which implements the compression function F used in the BLAKE2b cryptographic hashing algorithm, for the purpose of allowing interoperability between the Zcash blockchain and the EVM, and introducing more flexible cryptographic hash primitives to the EVM.

Parameters

Adds a precompile at address 0x0000....0d which accepts ABI-encoded arguments corresponding to the function signature

F(bytes32[2] h, bytes32[4] m, uint t, bool f, uint rounds) returns (bytes32[2] h_new);

where h, m, t and f are the current state, the new message, the byte counter and a finalization flag, as defined in RFC 7693, and rounds is the number of rounds of mixing to perform (BLAKE2b uses 12, BLAKE2s uses 10). h_new is the updated state of the hash.
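For reference, the compression function F is small enough to sketch in full. The following is an illustrative Python transcription of RFC 7693 (not any client's implementation); a precompile of this kind would perform exactly this computation on its decoded arguments:

```python
import struct

# BLAKE2b initialization vector (the SHA-512 IV), per RFC 7693.
IV = [
    0x6a09e667f3bcc908, 0xbb67ae8584caa73b,
    0x3c6ef372fe94f82b, 0xa54ff53a5f1d36f1,
    0x510e527fade682d1, 0x9b05688c2b3e6c1f,
    0x1f83d9abfb41bd6b, 0x5be0cd19137e2179,
]

# Message word schedule; rounds beyond 10 reuse rows modulo 10.
SIGMA = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
    [14, 10, 4, 8, 9, 15, 13, 6, 1, 12, 0, 2, 11, 7, 5, 3],
    [11, 8, 12, 0, 5, 2, 15, 13, 10, 14, 3, 6, 7, 1, 9, 4],
    [7, 9, 3, 1, 13, 12, 11, 14, 2, 6, 5, 10, 4, 0, 15, 8],
    [9, 0, 5, 7, 2, 4, 10, 15, 14, 1, 11, 12, 6, 8, 3, 13],
    [2, 12, 6, 10, 0, 11, 8, 3, 4, 13, 7, 5, 15, 14, 1, 9],
    [12, 5, 1, 15, 14, 13, 4, 10, 0, 7, 6, 3, 9, 2, 8, 11],
    [13, 11, 7, 14, 12, 1, 3, 9, 5, 0, 15, 4, 8, 6, 2, 10],
    [6, 15, 14, 9, 11, 3, 0, 8, 12, 2, 13, 7, 1, 4, 10, 5],
    [10, 2, 8, 4, 7, 6, 1, 5, 15, 11, 9, 14, 3, 12, 13, 0],
]

MASK = (1 << 64) - 1  # all arithmetic is mod 2**64

def rotr(x, n):
    return ((x >> n) | (x << (64 - n))) & MASK

def g(v, a, b, c, d, x, y):
    """The G mixing function, with BLAKE2b's 32/24/16/63-bit rotations."""
    v[a] = (v[a] + v[b] + x) & MASK
    v[d] = rotr(v[d] ^ v[a], 32)
    v[c] = (v[c] + v[d]) & MASK
    v[b] = rotr(v[b] ^ v[c], 24)
    v[a] = (v[a] + v[b] + y) & MASK
    v[d] = rotr(v[d] ^ v[a], 16)
    v[c] = (v[c] + v[d]) & MASK
    v[b] = rotr(v[b] ^ v[c], 63)

def F(h, m, t, f, rounds=12):
    """Compression: h is 8 u64 state words, m is 16 u64 message words,
    t is the 128-bit byte counter, f the finalization flag."""
    v = h[:] + IV[:]
    v[12] ^= t & MASK           # low word of the offset counter
    v[13] ^= (t >> 64) & MASK   # high word
    if f:
        v[14] ^= MASK           # invert v[14] on the final block
    for r in range(rounds):
        s = SIGMA[r % 10]
        g(v, 0, 4, 8, 12, m[s[0]], m[s[1]])
        g(v, 1, 5, 9, 13, m[s[2]], m[s[3]])
        g(v, 2, 6, 10, 14, m[s[4]], m[s[5]])
        g(v, 3, 7, 11, 15, m[s[6]], m[s[7]])
        g(v, 0, 5, 10, 15, m[s[8]], m[s[9]])
        g(v, 1, 6, 11, 12, m[s[10]], m[s[11]])
        g(v, 2, 7, 8, 13, m[s[12]], m[s[13]])
        g(v, 3, 4, 9, 14, m[s[14]], m[s[15]])
    return [h[i] ^ v[i] ^ v[i + 8] for i in range(8)]
```

As a self-check, seeding h with the IV XORed with the standard parameter block (64-byte digest, no key, fanout 1, depth 1) and compressing a single zero block with t = 0 and f set reproduces `hashlib.blake2b(b"").digest()`.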

Each operation will cost GFROUND * rounds.

Motivation

Besides being a useful cryptographic hash function and SHA3 finalist, BLAKE2b allows for efficient verification of the Equihash PoW used in Zcash, making a BTC Relay-style SPV client possible on Ethereum. A single Equihash PoW verification requires 512 iterations of the hash function, making verification of Zcash block headers prohibitively expensive if a Solidity implementation of BLAKE2b is used.

The BLAKE2b algorithm is highly optimized for 64-bit CPUs, and is faster than MD5 on modern processors.

Interoperability with Zcash could enable contracts like trustless atomic swaps between the chains, which could provide a much needed aspect of privacy to the very public Ethereum blockchain.

Rationale

The most frequent concern with EIPs of this type is that the addition of specific functions at the protocol level infringes on Ethereum's "featureless" design. It is true that a more elegant solution is simply to improve the scalability characteristics of the network so as to make calculating functions requiring millions of gas practical for everyday use. In the meantime, however, I believe that certain operations are worth subsidising via precompiled contracts, and there is significant precedent for this, most notably the inclusion of the SHA256 precompiled contract, which was included largely to allow interoperation with the Bitcoin blockchain.

Additionally, BLAKE2b is an excellent candidate for precompilation because of the extremely asymmetric efficiency it exhibits. BLAKE2b is heavily optimized for modern 64-bit CPUs, specifically utilizing 24 and 63-bit rotations to allow parallelism through SIMD instructions and little-endian arithmetic. These characteristics provide exceptional speed on native CPUs: 3.08 cycles per byte, or 1 gibibyte per second on an Intel i5.

In contrast, the big-endian 32 byte semantics of the EVM are not conducive to efficient implementation of BLAKE2, and thus the gas cost associated with computing the hash on the EVM is disproportionate to the true cost of computing the function natively.

Implementation of only the core F compression function allows substantial flexibility and extensibility while keeping changes at the protocol level to a minimum. This will allow functions like tree hashing, incremental hashing, and keyed, salted, and personalized hashing as well as variable length digests, none of which are currently available on the EVM.

There is very little risk of breaking backwards compatibility with this EIP; the sole issue would be if someone had built a contract relying on the address at 0x000....0000d being empty. The likelihood of this is low, and should specific instances arise, the address could be chosen to be any arbitrary value, with negligible risk of collision.

zookozcash commented 8 years ago

Great! I think this is a good feature to add. It will allow "Project Alchemy" — BTCRelay-style interoperation between the Zcash and Ethereum blockchains, and it may also be useful as a general purpose secure hash function which is more efficient than SHA3.

chriseth commented 8 years ago

Please add some more details about the gas costs. I guess they should at least depend on rounds.

zookozcash commented 8 years ago

In the earlier EIP about a precompiled full BLAKE2b, the following conversation happened about gas costs:

Okay, so for this new proposal — the precompiled BLAKE2b Compression Function F, the computational cost is not a base plus a cost per byte. Instead, every invocation of this precompile has the same size input (192 bytes, not counting t, f, or rounds).

I suggest the gas cost for invoking BLAKE2b Compression Function F should be 440 per invocation. That's from estimating that it will take about 440 microseconds to compute it. It is within 4X of what it would cost in gas to hash a 192-byte input with SHA3.

(I still don't understand what Zaki's rationale is for making the gas cost of BLAKE2b precompile be the same as gas cost of SHA3 precompile.)

zookozcash commented 8 years ago

Oh, good point about the gas cost of rounds. I propose that BLAKE2b Compression Function F gas cost be 37 gas per round. (12 rounds to be compatible with BLAKE2b.)
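At those proposed numbers, the arithmetic is straightforward (37 gas per round is only this thread's proposal, not a settled constant):

```python
GFROUND = 37  # per-round price proposed in this thread; not a final value

def f_gas_cost(rounds: int) -> int:
    """Gas charged for one invocation of the F precompile: GFROUND * rounds."""
    return GFROUND * rounds

# BLAKE2b compatibility means 12 rounds, BLAKE2s means 10.
print(f_gas_cost(12), f_gas_cost(10))
```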

tjade273 commented 8 years ago

I think picking hard numbers right now would be somewhat pointless, until we can actually measure compute time in different clients. You're right that the gas is now going to be proportional to the number of rounds, as opposed to the number of bytes.

gcolvin commented 8 years ago

I'd rather implement 64-bit operations in the EVM.

matthewdgreen commented 7 years ago

I think this would be a useful project, and I'd like to see it get restarted!

zookozcash commented 7 years ago

Okay, to make progress on this, I propose that we temporarily table @gcolvin's alternative suggestion of 64-bit arithmetic (interesting idea, but let's move it to a separate EIP), and satisfice on the gas costs (I still want the BLAKE2b Compression Function F to cost 37 gas per round, but if agreeing to any other number will get this project moving again, then I'll agree to that number!).

@tjade273, @vbuterin: What can we do to help make progress on this. Would it help if one of the Zcash team provided an implementation in some programming language?

gcolvin commented 7 years ago

Agreed.

tjade273 commented 7 years ago

Sorry if the project has slowed down a bit, I've been quite busy with school; I have a break coming up and I hope to get a PoC with a few of the implementations done.

Right now, I think the priority should be to get a precompile implemented in the Go and Rust implementations, since those are the most widely used. To that end, if anyone has experience with FFIs in Go or Rust, that'd be really helpful.

As for gas costs, I think it will become clear what a reasonable price is based on some simple benchmarking once we have an implementation in major clients.

zookozcash commented 7 years ago

Great! @gcolvin agrees to move the 64-bit-arithmetic idea to another thread (and please let me know so I can subscribe to that one), and we all agree to defer decision about precise gas costs. I'll see if I know someone who wants to take a stab at implementing it in Rust.

tjade273 commented 7 years ago

The Go precompile is almost complete, and the benchmarks look good: the BLAKE2 precompile is almost exactly twice as fast as SHA3 on my laptop. That's using a pretty un-optimized, pure-Go implementation.
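A rough analogue of that comparison can be reproduced with Python's standard library. This is illustrative only: wrapper overhead dominates at this input size, relative timings vary by platform and build, and it says nothing about the Go precompile benchmarked above.

```python
import hashlib
import timeit

data = b"\x00" * 192  # one F-sized input (h plus m, per the signature above)

t_blake2b = timeit.timeit(lambda: hashlib.blake2b(data).digest(), number=10_000)
t_sha3 = timeit.timeit(lambda: hashlib.sha3_256(data).digest(), number=10_000)

print(f"blake2b: {t_blake2b:.4f}s  sha3-256: {t_sha3:.4f}s")
```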

zmanian commented 7 years ago

@tjade273 So I was looking at this and I asked Veorq about it, and it looks to me like you're passing 128 bits of counter data into the compression function. I believe that is supposed to be max 64 bits. It may not matter, but it requires more invasively changing an existing BLAKE implementation to conform to your API, like I'm doing with the Rust impl.

tjade273 commented 7 years ago

@zmanian The specification says that the compression function takes a

2w-bit offset counter "t"

i.e. 128 bits. This is consistent with the reference implementation.

The API is not set in stone, by the way. I'm totally open to alternative ones and I'm strongly considering getting rid of ABI encoding and just serializing the data in order.

zmanian commented 7 years ago

Yeah, the spec says 2 words. There are some incidental complexities to storing the value in two state variables in the Rust version. The single variable allows using checked arithmetic and a typecast in the token state update. I'll try and take a closer look at it.

tjade273 commented 7 years ago

@zmanian You may be safe just ignoring the higher-order word. You would have to be hashing ~10^10 GB of data for it to make any difference. It's hacky, but not the end of the world if it saves you a lot of trouble. We just need to make sure all of the implementations behave consistently.
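The ~10^10 GB figure checks out: the high-order word of t only becomes nonzero once the low 64 bits of the byte counter overflow.

```python
# t is a 128-bit byte counter; its high word matters only after the low
# word wraps, i.e. after 2**64 bytes have been hashed.
overflow_bytes = 2**64
overflow_gb = overflow_bytes / 10**9
print(overflow_gb)  # roughly 1.8e10 GB, matching the estimate above
```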

zmanian commented 7 years ago

It really does save a lot of trouble. I mean, nightly does have 128-bit checked arithmetic now...

I don't even think @vbuterin has enough eth to do 10^10 GB on the blockchain :-)

chriseth commented 7 years ago

@tjade273 What are your numbers on the gas costs of https://github.com/ConsenSys/Project-Alchemy/blob/master/contracts/BLAKE2b/BLAKE2b.sol ? It seems to me that this contract can be heavily optimized. For example the constant could be moved from storage to code / memory (50 gas some time ago, now 200 gas for sload compared to 3 gas for mload / codecopy). This could make it viable to do all that without a precompile.

tjade273 commented 7 years ago

@chriseth I haven't worked on the BLAKE contract since before the gas cost readjustments, so I'm not sure exactly what the costs are. When I last ran it, each run was something like 200k gas. I'm sure I could optimize it more, but the Equihash PoW verification requires 2^9 BLAKE compressions, and I thought it unlikely that I would be able to get the gas costs below ~6,000.

I can take another crack at the optimizations, though, and see if I can get the costs significantly lower.

zookozcash commented 7 years ago

Is there anything I can do to move this EIP along? I'd really like to reduce barriers to cross-chain interoperation between Ethereum and Zcash, and this would help. (There are other approaches to cross-chain interop, but it isn't yet clear to me that any of the alternatives would be strictly better than this one.)

Additionally, making an efficient implementation of BLAKE2 available to Ethereum contracts could help with completely unrelated uses, such as if an Ethereum contract wanted to somehow interoperate with other systems like https://github.com/bramcohen/MerkleSet, or just because BLAKE2 is a highly efficient and secure hash function.

I see that the "editor-needs-to-review" label was added. Who is the editor? Thanks!

MicahZoltu commented 7 years ago

@zookozcash I believe your best bet is probably to create a PR of this EIP. Issues are intended for discussion, which it seems there isn't much of at this point. Once there is no longer any "debate" then a PR that adds the EIP using the template is the next step in the process.

I believe the contributing guide was recently updated: https://github.com/ethereum/EIPs#contributing

Of course, I could be totally wrong about all this as I'm just a random guy on the internet with no authority, but if I were you the above is what I would do. :)

zookozcash commented 7 years ago

By the way, there are two different motivations for why this EIP should go into Ethereum. The first (dear to my heart) is to facilitate Zcash↔Ethereum interoperation using the "BTCRelay" technique. Basically allowing Ethereum contracts to "see into" the Zcash blockchain! This is already possible, but it is very slow and expensive without this precompile.

But I want to emphasize that there is another independent reason for this EIP, which is that BLAKE2 is just a more efficient and more widely used hash function than SHA-3, and this EIP could allow Ethereum contracts to efficiently interoperate with data generated by the growing number of BLAKE2-using data generators out there.

I was reminded of this latter fact when reading this blog post by the estimable Adam Langley “Maybe Skip SHA-3” and the ensuing discussions on hackernews and reddit.

There appears to be a growing consensus among cryptographers that SHA-3 is not a great fit for most purposes and is not likely to become widely used. Even the authors of SHA-3 seem to be advocating not for adoption of SHA-3, but for adoption of newer, more efficient variants of SHA-3.

But everyone agrees that BLAKE2 is efficient, secure, and widely used.

By the way, you might enjoy this nice history of hash functions:

https://z.cash/technology/history-of-hash-function-attacks.html

vbuterin commented 7 years ago

Thanks a lot for the great feedback. This is definitely making me consider BLAKE2 support much more strongly.

whyrusleeping commented 7 years ago

The IPFS team is working towards switching our default hash function to blake2b-256, on the grounds of making future integrations much easier. I'm definitely in support of this.

axic commented 7 years ago

@tjade273 if it is already ABI encoded, why not have two functions: hash, hash_final? That would remove the need for the bool f (F(bytes32[2] h, bytes32[4] m, uint t , bool f, uint rounds) returns (bytes32[2] h_new);).

It may also make sense to consider making the rounds part of the name too.

I assume the code using this would either go for Blake2b or Blake2s and wouldn't really switch between them; baking rounds in would mean cheaper execution, as the number of parameters for ABI encoding is lower, and it uses less memory as well. The downside is of course if the rounds parameter is actually free to be anything based on the use case.

holiman commented 7 years ago

I'm tentatively in favour of this, but I'm not so sure about having a precompile accept ABI-encoded data, because ABI encoding is IMO not fool-proof. It's possible to e.g. point two dynamic arrays at the same data, or have the data part wrap around back to the start of the argument field.

TLDR; I'd rather not have ABI-parsing as part of consensus.

axic commented 7 years ago

@holiman I think it is certainly possible to define a restricted subset of ABI encoding supported by this precompile, the same way the data format is described for every other precompile.

holiman commented 7 years ago

restricted set sounds good

tjade273 commented 7 years ago

I'm actually not a huge fan of ABI encoding in this context either. It makes interaction with the precompile easier, but I'm not sure it is necessary and it does add overhead.

t, f, and rounds can all fit into a single 32 byte value as well, so using three separate uints is excessive.
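As a sketch of how tight a non-ABI packing could be (the field order and widths here are hypothetical, chosen only for illustration, not a settled spec):

```python
import struct

def pack_call_data(h: bytes, m: bytes, t: int, f: bool, rounds: int) -> bytes:
    """Serialize F's arguments with no ABI overhead.

    Hypothetical layout: rounds (4 bytes, big-endian) || h (64 bytes) ||
    m (128 bytes) || t (16 bytes, little-endian) || f (1 byte) = 213 bytes.
    """
    assert len(h) == 64 and len(m) == 128 and 0 <= t < 2**128
    return (struct.pack(">I", rounds)                        # round count
            + h                                              # state words
            + m                                              # message block
            + struct.pack("<QQ", t & (2**64 - 1), t >> 64)   # 128-bit counter
            + (b"\x01" if f else b"\x00"))                   # finalization flag
```

Compare 213 bytes with the ABI encoding of the signature above, which pads each of its nine static words out to a 32-byte slot (288 bytes).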

zookozcash commented 7 years ago

I don't have a strong opinion about the encoding issues but I'm persuaded by https://github.com/ethereum/EIPs/issues/152#issuecomment-331853965 and https://github.com/ethereum/EIPs/issues/152#issuecomment-331967132 that we should just use a specified packing.

The next step is to write an EIP! (https://github.com/ethereum/EIPs/issues/152#issuecomment-295087326)

jasondavies commented 6 years ago

I used wasm-metering to measure the gas cost of a blake2b compress invocation when using blake2b-wasm, and it only took ~31.2 gas per compress invocation (much lower than I had expected based on the discussions above).

My branch of blake2b-wasm is here; it simply logs to the console the cost of every compress invocation, which should be the same every time.

axic commented 6 years ago

@jasondavies note, the ewasm gas costs we have aren't final yet

axic commented 6 years ago

@jasondavies also thank you for reporting this, will need to look into it. It would be nice if you could join https://gitter.im/ewasm/Lobby

zookozcash commented 6 years ago

Here's another implementation of BLAKE2 for wasm: https://github.com/nazar-pc/blake2.wasm

sjalq commented 6 years ago

We at Octobase.co would like to see this implemented.

axic commented 6 years ago

We have a work in progress implementation for the ewasm (testnet) at https://github.com/ewasm/ewasm-precompiles/pull/15

It doesn't fully model this EIP yet, but contributions toward that are welcome.

It will be available and fully usable in the ewasm testnet, to be launched at devcon 😉

Souptacular commented 5 years ago

Please see discussion on https://github.com/ethereum/EIPs/pull/131 where I will direct people to a FEM post.

daira commented 5 years ago

While support for the F function does technically allow personalization, output length restriction, etc., I worry about the usability and security assurance consequences of this approach. Contract implementors will have to implement a significant part of the BLAKE2b spec themselves (with potential gotchas including byte order of counters, correctly handling the finalization flag, etc.), on a VM that is not ideally designed for byte operations, and will have to convince themselves that they have a secure, interoperable implementation.

This is not what Ethereum does for SHA-256 or Keccak-256. I would prefer to see a proposal that additionally supports a simplified API that only includes personalization and output length parameters (which are required for interoperability with Zcash, and for many other applications). Tree hashing, keyed operation, etc., are much less commonly used features in practice.
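For concreteness, here is what the personalization and output-length parameters look like in practice, using Python's `hashlib`. The Zcash Equihash personalization string is "ZcashPoW" followed by the little-endian parameters n and k; the 50-byte digest size matches common Equihash(200, 9) implementations, though treat that figure as an assumption here.

```python
import hashlib

# Zcash's Equihash(n=200, k=9) personalizes BLAKE2b like this. A precompile
# exposing only F would leave contract authors to wire this up themselves.
person = b"ZcashPoW" + (200).to_bytes(4, "little") + (9).to_bytes(4, "little")

h = hashlib.blake2b(digest_size=50, person=person)
h.update(b"example block header bytes")
digest = h.digest()
```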

zookozcash commented 5 years ago

I agree with you that there is a lot of risk in implementing those parts, and that we should provide a full implementation to coders so that they are not implementing any of that. However, why not have the implementation that we provide to coders be written in Solidity, using the precompiled compression functions for efficiency, instead of being built into the EVM?

MicahZoltu commented 5 years ago

Any function that is called with sufficient frequency on Ethereum (across all users) benefits from being a built-in function. The more generalized the function (e.g., industry standard hashing functions), the more likely it is to meet that threshold and be worth implementing natively.

That being said, if possible it is generally better to start with Solidity and then, if the function is used frequently enough, lobby for a move to native. This strategy has the advantage of having data to back up your claim that the function is "frequently used".

The solidity first strategy doesn't work if the function is so expensive that useful contracts can't be built without a native function being added to support it, which I believe was the case for Blake2 based proofs historically.

PlainsWraith commented 5 years ago

I figure rather than implementing new BLAKE protocols in Go, we just use the functions that the golang folks made, and if we want more efficient implementations, we can help everybody by contributing to their package. They've responded to change requests from Ethereum developers before.

After Daira's detailed input I tried wrapping golang blake2b and blake2s as precompiled functions here. I don't know yet how to calculate gas for these functions, and tests have to be written in contracts_test.go. What do you guys think about this approach?

mhluongo commented 5 years ago

we just use the functions that the golang folks

That's been our approach as well, though AFAIK that will only help for wrapping the hash functions.

@daira @zookozcash it'd be helpful to know whether you have other use cases in mind, or whether just shipping blake2b/s without direct access to the underlying compression function would suffice.

PlainsWraith commented 5 years ago

I think the ultimate end-state objective is to have something like BTC-relay but for zcash so folks can pay for dApp services with Zcash.

zookozcash commented 5 years ago

Comment re-posted below with fixed formatting: https://github.com/ethereum/EIPs/issues/152#issuecomment-499236824

PlainsWraith commented 5 years ago

for reference: tjade's F precompile only implementation

mhluongo commented 5 years ago

I think the ultimate end-state objective is to have something like BTC-relay but for zcash so folks can pay for dApp services with Zcash.

That's why I asked for "other uses"- I presume the bridge is the top priority.

We're in "perfect is the enemy of good" territory. The path to one-shot hash precompiles is very straightforward, the path to an updateable precompile less so, and the idea of a Solidity implementation around an F precompile is rough- the precompile implementation is simple but it's a footgun.

If we can clearly delineate what's necessary for a bridge versus a nice-to-have it'd be helpful. Less is more here IMO.

zookozcash commented 5 years ago

This is a comment that I originally made above, but since I made it through email, GitHub destroyed the formatting, rendering it unreadable. It turns out that it isn't possible to fix the formatting in that comment, since GitHub remembers that it is tainted by email and will not allow formatting. Therefore I'm reposting it here. Lesson learned: never use email for anything.

@daira @zookozcash it'd be helpful to know whether you have other use cases in mind, or whether just shipping blake2b/s without direct access to the compression function would suffice.

It would suffice for BTCRelay-style Zcash integration. I don't know if it would suffice for the other goals that Virgil mentioned: IPFS/Filecoin integration and Handshake integration.

There are other uses (current and potential future uses) of BLAKE2 for which it would not suffice, as I mentioned:

  1. the BLAKE2bp/BLAKE2sp variants: parallel hashing for greater throughput — already in use in some projects.

  2. the BLAKE2X variants: extended output — already defined, but I don't know if they are in use.

  3. reduced or increased rounds: not currently in use but it is quite possible that in the coming years reduced-round variants may come into use, because that might be necessary to meet performance constraints that the full hash function (and other full hash functions like SHA2 and SHA3) can't meet.

To recap the pros and cons of "full hash in precompile" vs "compression function F in precompile" vs "both":

a. If EVM supplies full hash, this increases the work for EVM implementers and increases the attack surface in EVM.

b. If EVM doesn't supply compression function F, this prevents smart contracts from being able to compute the three variants described above.

c. If EVM doesn't supply the full hash, this means users may implement the full hash themselves using the compression function F precompile, increasing the work for them and the attack surface in their code. (This was Daira's point above.) Perhaps this can be mitigated by providing a reusable, audited Solidity implementation of the full hash (which uses the EVM-supplied compression function F precompile) and putting prominent documentation in the compression function F precompile warning users not to implement their own full hash but to use the standard Solidity implementation of the full hash.

Also note that some users are going to want/need an incremental-update API and other users are going to want a one-shot API. Both APIs are going to have to be provided either in the EVM, in the standard solidity implementation, or by the users themselves.

IMHO a good approach is to put only the compression function F precompiles (for BLAKE2b and BLAKE2s) into the EVM and then put the full hashes — including the incremental-update vs one-shot APIs and any variants that turn out to be needed — in a Solidity package.

I'd like to know what Daira thinks of this proposal! She has excellent knowledge and taste about this sort of architecture.

Regards,

Zooko

zookozcash commented 5 years ago

We're in "perfect is the enemy of good" territory. The path to one-shot hash precompiles is very straightforward, the path to an updateable precompile less so, and the idea of a Solidity implementation around an F precompile is rough- the implementation is simple but it's a footgun.

For what it is worth, I disagree. I think implementing the Compression Function F precompile in EVM is the simplest, easiest, and safest thing to do before the Istanbul deadline, and that there is minimal potential downside to doing so.

Note that a lot of details, including the API and gas costs for this approach have already been worked out up-thread, e.g. https://github.com/ethereum/EIPs/issues/152#issuecomment-342646324

If we can clearly delineate what's necessary for a bridge versus a nice-to-have it'd be helpful. Less is more here IMO.

The minimal thing that is necessary for a working ZEC-ETH relay is an implementation of BLAKE2b Compression F in a precompile.

A BLAKE2b Compression Function F precompile would also suffice for the Filecoin and Handshake interop goals.

A full BLAKE2b precompile would suffice for a ZEC-ETH relay, provided that the implementation provided the parts of the BLAKE2 API that we need (personalization, maybe something else—I'm not sure).

I'm not 100% certain if a full BLAKE2b precompile would also suffice for the Filecoin and Handshake goals. It almost certainly could, provided that it supports all the API that they need.

BLAKE2s — whether the Compression Function F or the full hash — is only a nice-to-have for the purposes of a ZEC-ETH relay.

I hope that answers your question, Matt! Please ask if there's anything else I can answer.

MadeofTin commented 5 years ago

The minimal thing that is necessary for a working ZEC-ETH bridge is an implementation of BLAKE2b Compression F in a precompile.

A full BLAKE2b precompile would suffice for a ZEC-ETH gateway, provided that the implementation provided the parts of the BLAKE2 API that we need (personalization, maybe something else—I'm not sure).

What is the difference between a bridge and a gateway in this context?

Therefore I'm reposting it here. Lesson learned: never use email for anything.

Good to know. Hahaha :)

zookozcash commented 5 years ago

What is the difference between a bridge and a gateway in this context?

I didn't intend to indicate two different things—I just accidentally used two different words. I just went back and edited my comment to use the word "relay" consistently instead, since the envisioned use is a smart contract that can inspect and verify Zcash transactions in the style of BTCRelay.