paritytech / substrate

Substrate: The platform for blockchain innovators
Apache License 2.0
8.39k stars 2.65k forks source link

SURI HD derivation confuses `ChainCode` with index `i` #3396

Open maciejhirsz opened 4 years ago

maciejhirsz commented 4 years ago

Disclaimer: I'm not a cryptographer, talked to @burdges about it to make sure I'm reading stuff right. Jeff can fill in details if I missed anything.

Current sr25519 derivation code feeds DeriveJunctions as ChainCodes to soft/hard derivation methods, using empty byte arrays for i in both cases. This seems to be fine for what we are doing (both ChainCode and i end up in the same hash anyway), but is not the intended use of the API.

The consequences are that we are missing the extra entropy from the ChainCode product of previous key expansion in subsequent derivations. This is probably fine as we still have complete entropy of the original secret key, and my understanding after talking to Jeff as of why the extra entropy exists in BIP32 in the first place is that nobody knows :P.

TL;DR: We are feeding derivation junctions as chain codes instead of i, and throwing away chain codes from previous expansions, which is probably fine.

burdges commented 4 years ago

Right i should come from the derivation path, not the ChainCode.

I'll first note that BIP32 is ambiguous about the ChainCode's real function, only that it "prevent [derived keys] from depending solely on the [parent key]".

It's quite unambiguous that ChainCodes should be machine generated bytes though, meaning they should not be user controlled parts of the derivation path.

I'm not so worried about lacking the extra entropy from the ChainCode. In fact, our starting secret key has 252 bits of entropy, more than enough. In theory, it only has 128 bits of security since the public key is known, but again enough. I should probably decided if the chain code should be optional or not too.

Also, we take system randomness when creating derived nonces in https://github.com/w3f/schnorrkel/blob/master/src/derive.rs#L185 although this does not work when doing hard derivations since we do not pass them a RngCore.

In fact, I think ChainCode should really be viewed as de facto hard derivations that can be turned into soft derivations later by sharing the ChainCode. Is this functionality useful? I donno, but maybe by someone somewhere.

We cannot simply fix this issue in kusama because people already have kusama keys in which their i value gets filled into the chaincode's merlin input slot. We also cannot reverse the merlin input slots because merlin has good domain separation.

In principle, we could do a coordinated upgrade between schnorrkel and substrate in which schnorrkel adopts incorrect looking domain separation, i.e.

t.commit_bytes(b"chain-code",&cc.0);
t.commit_point(b"public-key",self.as_compressed());

becomes

t.commit_bytes(b"chain-code",self.as_compressed());
t.commit_point(b"public-key",&cc.0);

If we do not do this then we need some mechanism to prevent users from using the keys from kusama derivation paths in polkadot, like by making their polkadot key sign itself.

burdges commented 4 years ago

I suppose the cleanest solution is to have a mode for keys, maybe like kusama, polkadot, etc.