Refactor key derivation

hashgraph / hedera-sdk-reference

Hedera SDK specification repository.

Apache License 2.0

Refactor key derivation #73

Open litt3 opened 1 year ago

litt3 commented 1 year ago

There are problems with how key derivations are currently implemented. This issue is for discussing the best solution to the existing problems, and settling on a course of action.

Background

Proposal (will be updated as discussion occurs)

Converting Mnemonics to Private Keys

Mnemonic.toStandardEd25519PrivateKey(passphrase="", index=0)
- returns the child key at derivation path m/44'/3030'/0'/0'/index'
- this function automatically hardens index. Passing in a pre-hardened index should fail
Mnemonic.toStandardECDSAsecp256k1PrivateKey(passphrase="", index=0)
- returns the child key at derivation path m/44'/3030'/0'/0/index
- if the user wants a hardened child, they must manually harden index
Mnemonic.toLegacyPrivateKey() returns an Ed25519 private key
- this function should choose the type of legacy derivation based on mnemonic word count / composition
  - 22 words from the non-standard list means legacy derivation v1
  - 24 words from the standard list means legacy derivation v2
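The two standard paths above differ only in how the last two levels are hardened. A minimal sketch of that rule follows; the names `standard_path` and `format_path` are hypothetical and not part of any SDK, and the snippet only builds the index list, it does not derive keys.

```python
from typing import List

HARDENED_BIT = 0x80000000  # BIP-32: indices at or above 2**31 are hardened


def standard_path(key_type: str, index: int = 0) -> List[int]:
    """Return the index list for the proposed standard derivation paths."""
    if not 0 <= index <= 0xFFFFFFFF:
        raise ValueError("index must fit in 32 bits")
    prefix = [44 | HARDENED_BIT, 3030 | HARDENED_BIT, 0 | HARDENED_BIT]
    if key_type == "ed25519":
        # Ed25519: every level is hardened and the index is hardened
        # automatically, so a pre-hardened index is rejected.
        if index & HARDENED_BIT:
            raise ValueError("index must not be pre-hardened for Ed25519")
        return prefix + [0 | HARDENED_BIT, index | HARDENED_BIT]
    if key_type == "ecdsa_secp256k1":
        # ECDSA: the last two levels are used as given; the caller hardens
        # the index manually if a hardened child is wanted.
        return prefix + [0, index]
    raise ValueError(f"unknown key type: {key_type}")


def format_path(indices: List[int]) -> str:
    """Render an index list in the usual m/44'/... notation."""
    parts = [
        f"{i & 0x7FFFFFFF}'" if i & HARDENED_BIT else str(i) for i in indices
    ]
    return "/".join(["m"] + parts)


print(format_path(standard_path("ed25519", 0)))
print(format_path(standard_path("ecdsa_secp256k1", 0)))
```

Keeping the prefix in one place makes it harder for the two functions to drift onto different coin types or account levels.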

Deriving Child Keys

PrivateKey.derive(index)
- returns the child key at index
- if PrivateKey is an Ed25519 key, index is automatically hardened. passing in a pre-hardened index should fail
- if PrivateKey is an ECDSA key, index must be manually hardened, if desired
PrivateKey.legacyDerive(index)
- returns the child key at index, using the legacy algorithm
- only valid for Ed25519 keys. Calling this on an ECDSA key should fail
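For the Ed25519 case, the behavior described for derive(index) can be sketched as a single SLIP-10 hardened child step. This is an illustration under assumptions: `derive_ed25519_child` is a hypothetical name, keys and chain codes are passed as raw 32-byte values, and neither the ECDSA case (which needs secp256k1 arithmetic) nor legacyDerive is covered.

```python
import hashlib
import hmac
from typing import Tuple

HARDENED_BIT = 0x80000000  # BIP-32: indices at or above 2**31 are hardened


def derive_ed25519_child(
    key: bytes, chain_code: bytes, index: int
) -> Tuple[bytes, bytes]:
    """One SLIP-10 child step for Ed25519; returns (child_key, child_chain_code)."""
    if len(key) != 32 or len(chain_code) != 32:
        raise ValueError("key and chain code must be 32 bytes each")
    # The index is hardened here, so a pre-hardened index is an error.
    if not 0 <= index < HARDENED_BIT:
        raise ValueError("index must be un-hardened; it is hardened automatically")
    # SLIP-10 hardened child: HMAC-SHA512(chain_code, 0x00 || key || ser32(i'))
    data = b"\x00" + key + (index | HARDENED_BIT).to_bytes(4, "big")
    digest = hmac.new(chain_code, data, hashlib.sha512).digest()
    # Left half is the child key, right half is the child chain code.
    return digest[:32], digest[32:]
```

Rejecting a pre-hardened index, rather than silently accepting it, avoids two different inputs mapping to the same child.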

Deprecated Functions (including but not necessarily limited to)

Mnemonic.toEd25519PrivateKey(passphrase = "", path = ...)
- since using the default path returns a key that shouldn't be used
Mnemonic.toECDSAPrivateKey(passphrase = "", path = ...)
- since the default path is entirely wrong, and returns a non-standard key (JS only)

Tests

All test vectors described in SLIP10, ~BIP32~, and BIP39 should be checked for all key types
- Some refactoring will be necessary to support test vectors from seed
- BIP-32 test requirement removed for now, since base58 extended key encoding isn't supported
The following test vectors should be created and standardized across SDKs
- legacyMnemonicV1
  - mnemonic to key
  - legacy derivation (if applicable, see question # 1 below)
- legacyMnemonicV2
  - mnemonic to key
  - legacy derivation (if applicable, see question # 1 below)
- toStandardEd25519PrivateKey
  - mnemonic to key with no passphrase and index 0
  - mnemonic to key with passphrase and index 0
  - mnemonic to key with passphrase and index max
- toStandardECDSAsecp256k1PrivateKey
  - mnemonic to key with no passphrase and index 0
  - mnemonic to key with no passphrase and index 0'
  - mnemonic to key with passphrase and index 0
  - mnemonic to key with passphrase and index 0'
  - mnemonic to key with passphrase and index max
  - mnemonic to key with passphrase and index max'

Possible future functionality (not included in the current proposal)

Functions that allow the user to explicitly define a full derivation path
- The functions that do this are being deprecated
- Since this is presumably not a commonly used feature, it doesn't need to be added urgently
Public key derivation
- inherently not possible with Ed25519
- should eventually be implemented for ECDSA

Questions

When is PrivateKey.legacyDerive(index) actually needed? Is this just for one of the legacy versions, or both?
- this needs to be figured out and documented
Can we remove the isLegacy flag from Mnemonic?
- There currently exists some functionality to construct a mnemonic with an isLegacy flag, or to infer isLegacy based on word count
- I personally don't like this at all. Not all SDKs do this the same way, and isLegacy actually seems to mean "is legacy v1", since legacy v2 mnemonics are notably not isLegacy
- IMO we should rip this confusing bandaid off now, and just require the wallet implementor to explicitly call toLegacyPrivateKey, rather than inconsistently and imperfectly inferring isLegacy
Should we specify the curve in the ECDSA function names? (see below for details)

litt3 commented 1 year ago

@mehcode Yes, JS not allowing function overloads makes this harder. I wasn't aware of that limitation.

Proposal v2

Implement new functions toStandard[Ed25519|Ecdsa]PrivateKey(passphrase="", index=0), such that the returned key is at derivation path m/44'/3030'/0'/0'/index' for Ed25519 keys, and m/44'/3030'/0'/0/index for ECDSA keys.
- these functions should be what developers use moving forward
- note that the Ed25519 version of this function automatically hardens all indices. passing in a pre-hardened index should cause an error
- we can haggle over the naming of this new function
- since this is an entirely new function, we can have a default index :)
Deprecate the existing functions to[Ed25519|Ecdsa]PrivateKey(passphrase = "", path = ...)
- do not change the implementation at all, so that existing usages continue working

Meets criteria

Existing code will work exactly like before
Methods will exist that allow recovering wallets that were incorrectly generated
Developers moving forward will no longer be able to accidentally retrieve keys at non-standard derivation paths

Downsides

The proposal doesn't include a non-deprecated function that allows for defining explicit derivation paths.
- This could be solved by adding another function, e.g. to[Ed25519|Ecdsa]PrivateKeyAtPath(passphrase, path) or similar
- The necessity of this addition can be evaluated, or that decision can be postponed

Other considerations

Should any changes be made to how we're handling the legacy key derivation?
- ~My default preference would be to have a different mnemonic type for each legacy/standard derivation strategy, and each mnemonic would know what types of keys it is able to generate. But I don't know enough about JS program design in particular to know if that makes sense here~
- See below for my first pass attempt at a new standard

Thoughts?

ochikov commented 1 year ago

@alittley I've reviewed your proposal. We also discussed in the issue that JS does not allow overloads. Some notes here:

Defining explicit derivation paths can be handled by the deprecated method to[Ed25519|Ecdsa]PrivateKeyAtPath(passphrase, path) (for existing ones)
New explicit derivation paths can be handled by the toStandard[Ed25519|Ecdsa]PrivateKey(passphrase="", index=0) method.
Can we elaborate more on legacy key derivation? Currently in JS we have PrivateKey.legacyDerive(index)

litt3 commented 1 year ago

@ochikov Concerning explicit paths: toStandard[Ed25519|Ecdsa]PrivateKey(passphrase="", index=0) cannot support them as written above, since it accepts only a single index. A full-fledged derivation-path function would have to accept a path.
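To make the difference concrete, a function that takes a full path needs to parse something like m/44'/3030'/0'/0/0 into a list of indices, whereas the toStandard functions take only the final index. A rough sketch, with `parse_path` as a hypothetical helper name:

```python
from typing import List

HARDENED_BIT = 0x80000000  # BIP-32: indices at or above 2**31 are hardened


def parse_path(path: str) -> List[int]:
    """Parse "m/44'/3030'/0'/0/0" style notation into 32-bit indices."""
    parts = path.split("/")
    if parts[0] != "m":
        raise ValueError("path must start with 'm'")
    indices = []
    for part in parts[1:]:
        # A trailing apostrophe marks a hardened level.
        hardened = part.endswith("'")
        digits = part[:-1] if hardened else part
        if not (digits.isascii() and digits.isdigit()):
            raise ValueError(f"invalid path component: {part!r}")
        value = int(digits)
        if value >= HARDENED_BIT:
            raise ValueError(f"path component out of range: {part!r}")
        indices.append(value | HARDENED_BIT if hardened else value)
    return indices


print(parse_path("m/44'/3030'/0'/0/0"))
```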

Now, concerning legacy key derivation: there are subtle differences in how the SDKs handle this currently. Rather than try to detail all the differences, I'm just going to start a proposal for how it can be standardized.

Function mnemonic.toLegacyPrivateKey() returns an Ed25519 key
- this function should choose the type of legacy derivation based on mnemonic word count / composition
  - 22 words from the non-standard list means legacy derivation v1
  - 24 words from the standard list means legacy derivation v2
- There currently exists some functionality to construct a mnemonic with an isLegacy flag, or to infer isLegacy based on word count
  - I don't like this at all. Not all SDKs do this the same way, and isLegacy actually means "is legacy v1", since legacy v2 mnemonics are notably not isLegacy
  - IMO we should rip this bandaid off now, and just require the wallet implementor to explicitly call toLegacyPrivateKey, rather than inconsistently and imperfectly inferring isLegacy. I am in favor of deprecating all functions that accept an isLegacy argument, and removing the isLegacy member from mnemonic classes
Function PrivateKey.legacyDerive(index) returns the child key at index
- This is only valid for Ed25519 keys
- TBH I have no idea when exactly this is supposed to be used. Is the existing legacyDerive function valid for v1 legacy, v2 legacy, or both? It's not documented as far as I can tell.
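The word count / composition rule for toLegacyPrivateKey could look roughly like the sketch below. The function name `legacy_version` and the optional `wordlist` argument are assumptions for illustration; the actual word lists are not reproduced here, so the composition check only runs if the caller supplies one.

```python
from typing import FrozenSet, Optional, Sequence


def legacy_version(
    words: Sequence[str], wordlist: Optional[FrozenSet[str]] = None
) -> str:
    """Choose the legacy derivation version from the mnemonic itself."""
    # 22 words -> legacy v1 (non-standard list), 24 words -> legacy v2.
    versions = {22: "v1", 24: "v2"}
    version = versions.get(len(words))
    if version is None:
        raise ValueError(f"no legacy derivation for {len(words)} words")
    # Optional composition check against the word list the caller expects
    # for this version (the list itself is not part of this sketch).
    if wordlist is not None:
        unknown = [w for w in words if w not in wordlist]
        if unknown:
            raise ValueError(f"words not in the expected list: {unknown}")
    return version
```

With this shape there is no isLegacy flag to carry around: the decision is made once, inside the call, from the mnemonic alone.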

Some other things to do with legacy keys that need addressing:

There need to be consistent, well labeled test vectors, common between all SDKs
- MNEMONIC3_STRING is an unacceptable name. There should be legacyV1MnemonicString, and legacyV2MnemonicString, or similarly descriptive names

litt3 commented 1 year ago

Before this goes into effect, we will need to resolve the ECDSA naming discussion. Should we specify the secp256k1 curve in newly created functions, or stick with the existing ambiguous naming?

Argument for excluding the curve from new function names:

all existing function/variable names simply say ECDSA without specifying curve

Argument for including curve in new function names:

the protobufs already specify curve. this ambiguity is a problem with the SDKs specifically
new ECDSA curves may be added in the future, which would make everything named simply ECDSA problematic

petreze commented 1 year ago

@alittley During the implementation of the test vectors for SLIP-10, BIP-39 and BIP-32, we discovered that we do not support generation of the xpub and xprv keys mentioned in the BIP-32 spec. As far as I understand, BIP-32 is the standard by which we derive ECDSA child keys for HD (Hierarchical Deterministic) wallets, based on the BIP-39 standard. Can you verify whether this is true?

litt3 commented 1 year ago

BIP-39 defines how to generate mnemonic phrases, and how to turn these into seeds
BIP-32 defines how to turn a seed into a master key, and how to derive child keys (ECDSA_secp256k1)
SLIP-10 extends BIP-32 to also work with Ed25519
BIP-44 defines the structure of HD wallets, compatible with BIP-32 and SLIP-10

I hadn't noticed that the BIP-32 test vectors relied on an encoding we don't have 🤔 I think it's ok for now that the BIP-32 test vectors aren't implemented, since SLIP10 provides alternative vectors for Ed25519 and ECDSA_secp256k1, which don't rely on that extended key encoding.
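The first two layers listed above (mnemonic to seed, seed to master key) can be sketched with the Python standard library alone. This is an illustration, not SDK code: the function names are hypothetical, and the secp256k1 branch skips the validity check on the master key that BIP-32 and SLIP-10 require.

```python
import hashlib
import hmac
import unicodedata
from typing import Tuple


def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """BIP-39: PBKDF2-HMAC-SHA512, 2048 rounds, salt "mnemonic" + passphrase."""
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = ("mnemonic" + unicodedata.normalize("NFKD", passphrase)).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


def master_key(seed: bytes, curve: str) -> Tuple[bytes, bytes]:
    """Seed to (master_key, chain_code); the HMAC key string selects the curve."""
    hmac_keys = {
        "secp256k1": b"Bitcoin seed",  # BIP-32
        "ed25519": b"ed25519 seed",    # SLIP-10
    }
    if curve not in hmac_keys:
        raise ValueError(f"unsupported curve: {curve}")
    digest = hmac.new(hmac_keys[curve], seed, hashlib.sha512).digest()
    # NOTE: for secp256k1 the left half must also be a valid scalar
    # (non-zero and below the curve order); that check is omitted here.
    return digest[:32], digest[32:]
```

BIP-44 then only fixes which indices are walked from the master key, so it adds no cryptography of its own.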

ochikov commented 1 year ago

@petreze @alittley Guys, do you think that we can start implementing the refactoring soon? What about including the curve in the function names?

litt3 commented 1 year ago

@ochikov I see the following things that must be completed:

Resolution of the 3 questions listed in the issue body
Acknowledgement from someone familiar with each SDK that the proposal is sensible, and that there are no outstanding concerns (perhaps we already have this?)
Definition of additional common test vectors
- I edited the issue body to explicitly call out which additional vectors I think are necessary. These are needed, since the BIP and SLIP test vectors don't take our specific derivation path into account

SimiHunjan commented 1 year ago

1/10/2023

Add new name convention for ECDSA key functions to be consistent for all SDKs ex: ECDSAk256 or ECDSAsecp256k1 (to be voted on)
Deprecate old ones and point to the new ones [Future]

1/23/2023 Majority vote: ECDSAsecp256k1 for naming convention

ochikov commented 1 year ago

Answer to the questions from the first post:

PrivateKey.legacyDerive is used for both v1 and v2 legacy derivation. There are tests in all of the SDKs (Java, JS, Go) that use the method. In Java they are:
- MnemonicTest.thirdMnemonicTest: creates a 24-word mnemonic; creates legacy private key (it is v2 because the mnemonic has 24 words); derives 2 keys using legacyDerive()
- MnemonicTest.legacyMnemonicTest: creates a 22-word mnemonic; creates legacy private key (it is v1 because the mnemonic has 22 words); derives 2 keys using legacyDerive()
- MnemonicTest.myHbarWallerV1Test: same as legacyMnemonicTest but uses different derivation index
I consider that isLegacy can be removed

Additional questions:

Should we change ECDSA to ECDSAsecp256k1 only in the new function toStandardECDSAsecp256k1PrivateKey, or change all instances of ECDSA (methods and classes)?
Do we need to create a toHardenedIndex() method, or do we expect the users to harden the indexes themselves if they need to? toStandardECDSAsecp256k1PrivateKey("", 2147483648) vs toStandardECDSAsecp256k1PrivateKey("", toHardenedIndex(0)). If it is needed, should it be in the PrivateKeyECDSA, the Mnemonic or a Utils class?
Should we change Mnemonic.toLegacyPrivateKey? In Java it uses the number of words to determine if the private key should be v1 or v2. In JS it uses the isLegacy flag. Also do we need to change the implementation or create a new method, toStandardLegacyPrivateKey, which extracts the private key from the mnemonic and then derives the correct path m/44'/3030'/0'/0'/index'?

ochikov commented 1 year ago

@alittley Did you have the chance to look at the above comment? We are almost ready with the implementation in Java, and those questions came up during the implementation.

litt3 commented 1 year ago

@ochikov

I would say we shouldn't change the names on functions that are being deprecated (so as to avoid breaking changes)
We should provide a utility function for index hardening. If a bip32 or bip32 utils class exists, it would make sense to put it there, since "hardening" is a concept defined in BIP 32. Otherwise, a utils class seems appropriate. IMO this utility function does not need to be standardized across SDKs
I think Mnemonic.toLegacyPrivateKey could (but shouldn't necessarily have to) be changed, to make the implementation more clear, or improve internal organization. Of course, care should be taken to not modify the external behavior of this function. I would not create a new toStandardLegacyPrivateKey method, though. Since the mnemonic -> key derivation is already non-standard, using the "correct" derivation path won't make the end result any more "standard."
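A hardening utility of the kind discussed here is small. The sketch below uses hypothetical names (`to_hardened_index`, `is_hardened_index`) and is not tied to any particular SDK class.

```python
HARDENED_BIT = 0x80000000  # BIP-32: indices at or above 2**31 are hardened


def is_hardened_index(index: int) -> bool:
    """True if a 32-bit child index has the hardened bit set."""
    return 0 <= index <= 0xFFFFFFFF and bool(index & HARDENED_BIT)


def to_hardened_index(index: int) -> int:
    """Return the hardened form of an un-hardened index (0 .. 2**31 - 1)."""
    if not 0 <= index < HARDENED_BIT:
        raise ValueError("index must be in the range 0 .. 2**31 - 1")
    return index | HARDENED_BIT


print(to_hardened_index(0))  # 2147483648, i.e. index 0'
```

One caveat for languages with signed 32-bit integers: the hardened value does not fit in a positive int there, which may influence how each SDK exposes such a helper.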

dikel commented 1 year ago

@alittley Can you check the PR in the Java SDK? Do we have to change or add anything else?

dikel commented 1 year ago

@alittley We propose to remove the default parameter values from toStandard() methods because it is not possible to do it in Go.

We also provide test vectors for legacyV1/V2 and toStandard derivations:

Test vector for ED25519

Legacy derive v1:

Mnemonic: jolly kidnap tom lawn drunk chick optic lust mutter mole bride galley dense member sage neural widow decide curb aboard margin manure

Chain m
- private: 00c2f59212cb3417f0ee0d38e7bd876810d04f2dd2cb5c2d8f26ff406573f2bd
- public: 0c5bb4624df6b64c2f07a8cb8753945dd42d4b9a2ed4c0bf98e87ef154f473e9
Chain m/0
- private: fae0002d2716ea3a60c9cd05ee3c4bb88723b196341b68a02d20975f9d049dc6
- public: f40f9fdb1f161c31ed656794ada7af8025e8b5c70e538f38a4dfb46a0a6b0392
Chain m/-1
- private: 882a565ad8cb45643892b5366c1ee1c1ef4a730c5ce821a219ff49b6bf173ddf
- public: 53c6b451e695d6abc52168a269316a0d20deee2331f612d4fb8b2b379e5c6854
Chain m/1099511627775
- private: 6890dc311754ce9d3fc36bdf83301aa1c8f2556e035a6d0d13c2cccdbbab1242
- public: 45f3a673984a0b4ee404a1f4404ed058475ecd177729daa042e437702f7791e9

Legacy derive v2:

Mnemonic: obvious favorite remain caution remove laptop base vacant increase video erase pass sniff sausage knock grid argue salt romance way alone fever slush dune

Chain m
- private: 98aa82d6125b5efa04bf8372be7931d05cd77f5ef3330b97d6ee7c006eaaf312
- public: e0ce688d614f22f96d9d213ca513d58a7d03d954fe45790006e6e86b25456465
Chain m/0
- private: 2b7345f302a10c2a6d55bf8b7af40f125ec41d780957826006d30776f0c441fb
- public: 0e19f99800b007cc7c82f9d85b73e0f6e48799469450caf43f253b48c4d0d91a
Chain m/-1
- private: caffc03fdb9853e6a91a5b3c57a5c0031d164ce1c464dea88f3114786b5199e5
- public: 9fe11da3fcfba5d28a6645ecb611a9a43dbe6014b102279ba1d34506ea86974b

standard derive:

Mnemonic: inmate flip alley wear offer often piece magnet surge toddler submit right radio absent pear floor belt raven price stove replace reduce plate home

Chain m/44'/3030'/0'/0'/0'
- chain code: 404914563637c92d688deb9d41f3f25cbe8d6659d859cc743712fcfac72d7eda
- private: f8dcc99a1ced1cc59bc2fee161c26ca6d6af657da9aa654da724441343ecd16f
- public: 2e42c9f5a5cdbde64afa65ce3dbaf013d5f9ff8d177f6ef4eb89fbe8c084ec0d
Chain m/44'/3030'/0'/0'/2147483647'
- chain code: 9c2b0073ac934696cd0b52c6c521b9bd1902aac134380a737282fdfe29014bf1
- private: e978a6407b74a0730f7aeb722ad64ab449b308e56006c8bff9aad070b9b66ddf
- public: c4b33dca1f83509f17b69b2686ee46b8556143f79f4b9df7fe7ed3864c0c64d0
Chain m/44'/3030'/0'/0'/0'; Passphrase: "some pass"
- chain code: 699344acc5e07c77eb63b154b4c5c3d33cab8bf85ee21bea4cc29ab7f0502259
- private: abeca64d2337db386e289482a252334c68c7536daaefff55dc169ddb77fbae28
- public: fd311925a7a04b38f7508931c6ae6a93e5dc4394d83dafda49b051c0017d3380
Chain m/44'/3030'/0'/0'/2147483647'; Passphrase: "some pass"
- chain code: e5af7c95043a912af57a6e031ddcad191677c265d75c39954152a2733c750a3b
- private: 9a601db3e24b199912cec6573e6a3d01ffd3600d50524f998b8169c105165ae5
- public: cf525500706faa7752dca65a086c9381d30d72cc67f23bf334f330579074a890
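For anyone wanting to cross-check the Ed25519 standard-derive private keys and chain codes above without an SDK, the following sketch chains BIP-39 seed generation with SLIP-10 hardened derivation along m/44'/3030'/0'/0'/index'. It assumes the SDKs follow both specifications without deviation; `standard_ed25519_key` is a hypothetical name. The public key is not computed, since that needs an Ed25519 implementation outside the standard library.

```python
import hashlib
import hmac
import unicodedata
from typing import Tuple

HARDENED_BIT = 0x80000000  # BIP-32: indices at or above 2**31 are hardened


def standard_ed25519_key(
    mnemonic: str, passphrase: str = "", index: int = 0
) -> Tuple[bytes, bytes]:
    """Return (private_key, chain_code) at m/44'/3030'/0'/0'/index'."""
    if not 0 <= index < HARDENED_BIT:
        raise ValueError("index must be un-hardened; it is hardened automatically")
    # BIP-39: mnemonic and passphrase to a 64-byte seed.
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = ("mnemonic" + unicodedata.normalize("NFKD", passphrase)).encode("utf-8")
    seed = hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)
    # SLIP-10: master key and chain code for Ed25519.
    digest = hmac.new(b"ed25519 seed", seed, hashlib.sha512).digest()
    key, chain_code = digest[:32], digest[32:]
    # SLIP-10: walk the path; every level is hardened for Ed25519.
    for level in (44, 3030, 0, 0, index):
        data = b"\x00" + key + (level | HARDENED_BIT).to_bytes(4, "big")
        digest = hmac.new(chain_code, data, hashlib.sha512).digest()
        key, chain_code = digest[:32], digest[32:]
    return key, chain_code
```

Calling it with the 24-word mnemonic above and comparing `key.hex()` and `chain_code.hex()` against the listed values gives an SDK-independent check of the private and chain code entries.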

Test vector for ECDSAsecp256k1

standard derive:

Mnemonic: inmate flip alley wear offer often piece magnet surge toddler submit right radio absent pear floor belt raven price stove replace reduce plate home

Chain m/44'/3030'/0'/0/0
- chain code: 7717bc71194c257d4b233e16cf48c24adef630052f874a262d19aeb2b527620d
- private: 0fde7bfd57ae6ec310bdd8b95967d98e8762a2c02da6f694b152cf9860860ab8
- public: 03b1c064b4d04d52e51f6c8e8bb1bff75d62fa7b1446412d5901d424f6aedd6fd4
Chain m/44'/3030'/0'/0/0'
- chain code: e333da4bd9e21b5dbd2b0f6d88bad02f0fa24cf4b70b2fb613368d0364cdf8af
- private: aab7d720a32c2d1ea6123f58b074c865bb07f6c621f14cb012f66c08e64996bb
- public: 03a0ea31bb3562f8a309b1436bc4b2f537301778e8a5e12b68cec26052f567a235
Chain m/44'/3030'/0'/0/0; Passphrase: "some pass"
- chain code: 0ff552587f6baef1f0818136bacac0bb37236473f6ecb5a8c1cc68a716726ed1
- private: 6df5ed217cf6d5586fdf9c69d39c843eb9d152ca19d3e41f7bab483e62f6ac25
- public: 0357d69bb36fee569838fe7b325c07ca511e8c1b222873cde93fc6bb541eb7ecea
Chain m/44'/3030'/0'/0/0'; Passphrase: "some pass"
- chain code: 3a5048e93aad88f1c42907163ba4dce914d3aaf2eea87b4dd247ca7da7530f0b
- private: 80df01f79ee1b1f4e9ab80491c592c0ef912194ccca1e58346c3d35cb5b7c098
- public: 039ebe79f85573baa065af5883d0509a5634245f7864ddead76a008c9e42aa758d
Chain m/44'/3030'/0'/0/2147483647; Passphrase: "some pass"
- chain code: e54254940db58ef4913a377062ac6e411daebf435ad592d262d5a66d808a8b94
- private: 60cb2496a623e1201d4e0e7ce5da3833cd4ec7d6c2c06bce2bcbcbc9dfef22d6
- public: 02b59f348a6b69bd97afa80115e2d5331749b3c89c61297255430c487d6677f404
Chain m/44'/3030'/0'/0/2147483647'; Passphrase: "some pass"
- chain code: cb23165e9d2d798c85effddc901a248a1a273fab2a56fe7976df97b016e7bb77
- private: 100477c333028c8849250035be2a0a166a347a5074a8a727bce1db1c65181a50
- public: 03d10ebfa2d8ff2cd34aa96e5ef59ca2e69316b4c0996e6d5f54b6932fe51be560

rwalworth commented 1 year ago

Just a note, you have the 4th index in the derivation chain for the ECDSAsecp256k1 test vectors as hardened, when in fact they are not. I derived everything correctly as you have when using the m/44'/3030'/0'/0/index path.

dikel commented 1 year ago

@rwalworth Thank you for noticing. I've edited my comment

rwalworth commented 1 year ago

I came up with some additional 12 word mnemonic test vectors for ECDSAsecp256k1 and ED25519 private key derivation that should also be added to our internal "standard" test vectors. These test vectors are only for our standard derivation.

Test vector for ED25519

Mnemonic: finish furnace tomorrow wine mass goose festival air palm easy region guilt

Chain m/44'/3030'/0'/0'/0'
- chain code: 48c89d67e9920e443f09d2b14525213ff83b245c8b98d63747ea0801e6d0ff3f
- private: 020487611f3167a68482b0f4aacdeb02cc30c52e53852af7b73779f67eeca3c5
- public: 2d047ff02a2091f860633f849ea2024b23e7803cfd628c9bdd635010cbd782d3
Chain m/44'/3030'/0'/0'/2147483647'
- chain code: c0bcdbd9df6d8a4f214f20f3e5c7856415b68be34a1f406398c04690818bea16
- private: d0c4484480944db698dd51936b7ecc81b0b87e8eafc3d5563c76339338f9611a
- public: a1a2573c2c45bd57b0fd054865b5b3d8f492a6e1572bf04b44471e07e2f589b2
Chain m/44'/3030'/0'/0'/0'; Passphrase: "some pass"
- chain code: 998a156855ab5398afcde06164b63c5523ff2c8900db53962cc2af191df59e1c
- private: d06630d6e4c17942155819bbbe0db8306cd989ba7baf3c29985c8455fbefc37f
- public: 6bd0a51e0ca6fcc8b13cf25efd0b4814978bcaca7d1cf7dbedf538eb02969acb
Chain m/44'/3030'/0'/0'/2147483647'; Passphrase: "some pass"
- chain code: 19d99506a5ce2dc0080092068d278fe29b85ffb8d9c26f8956bfca876307c79c
- private: a095ef77ee88da28f373246e9ae143f76e5839f680746c3f921e90bf76c81b08
- public: 35be6a2a37ff6bbb142e9f4d9b558308f4f75d7c51d5632c6a084257455e1461

Test vector for ECDSAsecp256k1

Mnemonic: finish furnace tomorrow wine mass goose festival air palm easy region guilt

Chain m/44'/3030'/0'/0/0
- chain code: e76e0480faf2790e62dc1a7bac9dce51db1b3571fd74d8e264abc0d240a55d09
- private: f033824c20dd9949ad7a4440f67120ee02a826559ed5884077361d69b2ad51dd
- public: 0294bf84a54806989a74ca4b76291d386914610b40b610d303162b9e495bc06416
Chain m/44'/3030'/0'/0/0'
- chain code: 60c39c6a77bd68c0aaabfe2f4711dc9c2247214c4f4dae15ad4cb76905f5f544
- private: 962f549dafe2d9c8091ac918cb4fc348ab0767353f37501067897efbc84e7651
- public: 027123855357fd41d28130fbc59053192b771800d28ef47319ef277a1a032af78f
Chain m/44'/3030'/0'/0/0; Passphrase: "some pass"
- chain code: 911a1095b64b01f7f3a06198df3d618654e5ed65862b211997c67515e3167892
- private: c139ebb363d7f441ccbdd7f58883809ec0cc3ee7a122ef67974eec8534de65e8
- public: 0293bdb1507a26542ed9c1ec42afe959cf8b34f39daab4bf842cdac5fa36d50ef7
Chain m/44'/3030'/0'/0/0'; Passphrase: "some pass"
- chain code: 64173f2dcb1d65e15e787ef882fa15f54db00209e2dab16fa1661244cd98e95c
- private: 87c1d8d4bb0cebb4e230852f2a6d16f6847881294b14eb1d6058b729604afea0
- public: 03358e7761a422ca1c577f145fe845c77563f164b2c93b5b34516a8fa13c2c0888
Chain m/44'/3030'/0'/0/2147483647; Passphrase: "some pass"
- chain code: a7250c2b07b368a054f5c91e6a3dbe6ca3bbe01eb0489fe8778304bd0a19c711
- private: 2583170ee745191d2bb83474b1de41a1621c47f6e23db3f2bf413a1acb5709e4
- public: 03f9eb27cc73f751e8e476dd1db79037a7df2c749fa75b6cc6951031370d2f95a5
Chain m/44'/3030'/0'/0/2147483647'; Passphrase: "some pass"
- chain code: 66a1175e7690e3714d53ffce16ee6bb4eb02065516be2c2ad6bf6c9df81ec394
- private: f2d008cd7349bdab19ed85b523ba218048f35ca141a3ecbc66377ad50819e961
- public: 027b653d04958d4bf83dd913a9379b4f9a1a1e64025a691830a67383bc3157c044