ethereum / EIPs

The Ethereum Improvement Proposal repository
https://eips.ethereum.org/
Creative Commons Zero v1.0 Universal
12.83k stars 5.24k forks source link

Draft EIP: BLS12-381 Key Generation #2337

Closed CarlBeek closed 2 years ago

CarlBeek commented 4 years ago

Discussion for Draft EIP-XXXX: BLS12-381 Key Generation #2333 .

This is the proposed standard for BLS12-381 Key Generation for use within Eth2 as well as by the larger blockchain industry. Please be considerate of the implications of this standard in both the Eth2 and wider contexts.

MrChico commented 4 years ago

Is there a rationale for having such wide trees, with 2^256 possible children? The BIP32 tree has a mere width of 2^32.

CarlBeek commented 4 years ago

Thanks for the review.

The primary reason for the tree width is my brain seems to think in terms of Eth2 uints now. :/ I guess it is more of a why not scenario. salt in HKDF accepts 32 bytes due to the underlying SHA256 hash function and therefore using less just seems to be a waste (it will be padded with zeros up to 32 bytes anyway).

The main reason I can think of for BIP32 use of narrow trees is that the index is encoded in extended public and private keys and therefore large indices would waste space. These key types are of no use in this standard as a parent's private key and child index is necessary and sufficient to calculate a child key.

mcdee commented 4 years ago

Using 256-bit values does start to cause issues in and around the ecosystem. Attempting to pass around large integers in JSON, for example, will cause problems. RFC 7159 states:

Since software that implements IEEE 754-2008 binary64 (double precision) numbers [IEEE754] is generally available and widely used, good interoperability can be achieved by implementations that expect no more precision or range than these provide, in the sense that implementations will approximate JSON numbers within the expected precision.

Given this, and with the lack of any real use case beyond 2^32 individual accounts per seed, I'd be inclined to dial index back to be a 32-bit unsigned integer.

mcdee commented 4 years ago

Should seed have a constraint on its size, specifically should be > 0 bytes? Although an empty seed is valid it may well point to upstream issues in supplying the value, so making it invalid here would protect users against fault software elsewhere.

CarlBeek commented 4 years ago

Tree indices are now uint32s to guarantee compatibility with languages with limited support for larger numbers natively.

Should seed have a constraint on its size, specifically should be > 0 bytes?

I think a better limitation on seed is > 128 bits as anything smaller seems unsafe and it may catch implementation bugs whereby the seed is prematurely truncated. If the source of the seed is less than 128 bits, it can be extended via a PRF such as HKDF.

mcdee commented 4 years ago

I think a better limitation on seed is > 128 bits

That works. My rationale for what was basically "not null" was to avoid bad implementations; this does it better so thumbs up from me.

mcdee commented 4 years ago

Tree indices are now uint32s to guarantee compatibility with languages with limited support for larger numbers natively.

Just looked at the EIP and it seems the commit changed the index to a maximum of 2^64 not 2^32...

kirk-baird commented 4 years ago

Should we update to match bls-signatures-02?

Currently hkdf_mod_r() matches bls-signatures-00 as

1. PRK = HKDF-Extract("BLS-SIG-KEYGEN-SALT-", IKM)
2. OKM = HKDF-Expand(PRK, "", L)
3. SK = OS2IP(OKM) mod r
4. return SK

However, there has been updates in bls-signatures-02 to

1. PRK = HKDF-Extract("BLS-SIG-KEYGEN-SALT-", IKM || I2OSP(0, 1))
2. OKM = HKDF-Expand(PRK, key_info || I2OSP(L, 2), L)
3. SK = OS2IP(OKM) mod r
4. return SK
mratsim commented 4 years ago

One thing that is unclear is the inclusive vs exclusive for-loop.

For example we have

* `L = K * 255` is the HKDF output size (in octets)

Assuming indexing from 0, we would iterate through slot 0 to 254 (inclusive)

However below we have

6. for i = 0 to 255
       lamport_PK = lamport_PK | SHA256(lamport_0[i])
7. for i = 0 to 255
       lamport_PK = lamport_PK | SHA256(lamport_1[i])

So I assume it's 255 exclusive

Looking into the IETF specs, the HKDF spec uses inclusive range with non explicit for loop https://tools.ietf.org/html/rfc5869

   The output OKM is calculated as follows:

   N = ceil(L/HashLen)
   T = T(1) | T(2) | T(3) | ... | T(N)
   OKM = first L octets of T

And the BLS signature spec uses for 1, ..., N inclusive, https://tools.ietf.org/html/draft-irtf-cfrg-bls-signature-02

   5.  for i in 1, ..., n:
   6.      If KeyValidate(PK_i) is INVALID, return INVALID
   7.      xP = pubkey_to_point(PK_i)
   8.      Q = hash_to_point(message_i)
   9.      C1 = C1 * pairing(Q, xP)
mratsim commented 4 years ago

Another thing that caught my eye is the master key generation from a seed: image This is 128-bit is 16 bytes of entropy

The BLS spec requires 32 bytes: https://tools.ietf.org/html/draft-irtf-cfrg-bls-signature-02#section-2.3 image

Now the usage is different since the master key is not supposed to be used as-is be re-expanded and 16 bytes may be enough (cryptographer input needed) but in that case, a line that acknowledges that this was considered would be nice.

Alternatively we can just use 32 bytes like the BLS spec.

CarlBeek commented 4 years ago

@mratsim, thanks for raising these issues. I've clarified the indexing in the docs. The idea is that the arrays of octet-strings are flattened into a combined octet-string by concatenation.

On the topic of seed length, I made that decision based because BLS12-381 has 128 bit security. In this case it is safe to use that amount of input entropy as the seed is passed through (several) SHA256 hashes. The security only goes birthday on the other side of the hash. That said, the design principle of reusing the KeyGen from the BLS specs still holds, so I have updated the requirements to >=256 bit.

CarlBeek commented 4 years ago

Breaking Change:

HKDF_mod_r now reflects the new definition of KeyGen from the BLS v02 specs.

kristieguo commented 4 years ago

@CarlBeek @mcdee @mratsim @MrChico @kirk-baird Hello,

I'm doing some research and preparing to support ETH2 staking in our custody product. However, I read the EIP proposals and have questions about BLS key derivation implementation.

The EIP2333 proposal states:

"Based on the order of the private key subgroup of BLS12-381 and the size of the entropy utilised, more than 54% of keys generated by BIP32 would be invalid."

I did some tests and the result matches the statement. However, currently our custody system is using BIP44 to derive keys and we want to keep it that way. By reading the entire proposal, we understand that a new key derivation method involving intermediate Lamport algorithm is recommended but we could not find a Javascript SDK for the entire derivation process. We don't want to take the risk of writing the derivation by ourselves and leads to possible token assets loss. So I was wondering what exact reason is leading to the percentage of 54% invalid key generation and how we could solve this problem without abandoning BIP. By referring to the BLS Javascript SDK as in https://github.com/herumi/bls-eth-wasm, I came up with the solution and demo code as below:

  1. derive from seed by BIP
  2. set derived key in step1 to little endian, make it as the BLS private key
  3. get BLS public key by the private key
demo-code

Although I see at least one problem in this solution. Keys generated in step 2 may be the same after being set to little endian even if the original keys generated in step 1 are different. Despite this problem, Could this solution 100% solve the BIP key generation problem mentioned in the EIP proposal? Or does anyone have better solutions for us to continue to use BIP44 for key generation without implementing another derivation algorithm?

I really appreciate it if anyone could help us. Thank you.

mratsim commented 4 years ago

By reading the entire proposal, we understand that a new key derivation method involving intermediate Lamport algorithm is recommended but we could not find a Javascript SDK for the entire derivation process. We don't want to take the risk of writing the derivation by ourselves and leads to possible token assets loss.

I expect the Lodestar team will have an implementation that is Javascript-focused and that will undergo security audit at one point.

I can't comment on BIP44, note that there is no demo code in attachment.

MicahZoltu commented 2 years ago

Since this EIP has become stagnant, we are closing this discussion issue for housekeeping purposes. You may continue discussion here on the closed issue (it won't be locked), but if you want to revive the EIP (bring it back to Draft or Review status), you will need to create a thread over at https://ethereum-magicians.org/ and update the discussions-to link in the EIP to point there instead (that is where all discussions happen now).

xinbenlv commented 1 year ago

@CarlBeek, hi from 2022. this is an important EIP. Thank you for your contribution for drafting this EIP.

Are you still pursuing finalizing it? I encourage that you continue to pursuing finalizing it, or invite someone to finalize it if you don't have capacity.

Pandapip1 commented 1 year ago

Considering this EIP is stagnant, the answer is no. However, since the EIP has been stagnant for a while, you can feel free to add yourself as an author and move it into Draft or Review, and if @CarlBeek doesn't respond for a while we'll manually merge it.

mratsim commented 1 year ago

Isn't it already finalized? It's necessary for all validators to use a library that implements this EIP.

Pandapip1 commented 1 year ago

This is not finalized. Feel free to move it back into draft.