hashgraph / hedera-sdk-reference

Hedera SDK specification repository.
Apache License 2.0
5 stars 1 forks source link

Refactor key derivation #73

Open litt3 opened 1 year ago

litt3 commented 1 year ago

There are problems with how key derivations are currently implemented. This issue is for discussing the best solution to the existing problems, and settling on a course of action.

Background

Proposal (will be updated as discussion occurs)

Converting Mnemonics to Private Keys

  1. Mnemonic.toStandardEd25519PrivateKey(passphrase="", index=0)
    • returns the child key at derivation path m/44'/3030'/0'/0'/index'
    • this function automatically hardens index. Passing in a pre-hardened index should fail
  2. Mnemonic.toStandardECDSAsecp256k1PrivateKey(passphrase="", index=0)
    • returns the child key at derivation path m/44'/3030'/0'/0/index
    • if the user wants a hardened child, they must manually harden index
  3. Mnemonic.toLegacyPrivateKey() returns an Ed25519 private key
    • this function should choose the type of legacy derivation based on mnemonic word count / composition
      • 22 words from the non-standard list means legacy derivation v1
      • 24 words from the standard list means legacy derivation v2

Deriving Child Keys

  1. PrivateKey.derive(index)
    • returns the child key at index
    • if PrivateKey is an Ed25519 key, index is automatically hardened. passing in a pre-hardened index should fail
    • if PrivateKey is an ECDSA key, index must be manually hardened, if desired
  2. PrivateKey.legacyDerive(index)
    • returns the child key at index, using the legacy algorithm
    • only valid for Ed25519 keys. Calling this on an ECDSA key should fail

Deprecated Functions (including but not necessarily limited to)

  1. Mnemonic.toEd25519PrivateKey(passphrase = "", path = ...)
    • since using the default path returns a key that shouldn't be used
  2. Mnemonic.toECDSAPrivateKey(passphrase = "", path = ...)
    • since the default path is entirely wrong, and returns a non-standard key (JS only)

Tests

  1. All test vectors described in SLIP10, ~BIP32~, and BIP39 should be checked for all key types
    • Some refactoring will be necessary to support test vectors from seed
    • BIP-32 test requirement removed for now, since base58 extended key encoding isn't supported
  2. The following test vectors should be created and standardized across SDKs
    • legacyMnemonicV1
      • mnemonic to key
      • legacy derivation (if applicable, see question # 1 below)
    • legacyMnemonicV2
      • mnemonic to key
      • legacy derivation (if applicable, see question # 1 below)
    • toStandardEd25519PrivateKey
      • mnemonic to key with no passphrase and index 0
      • mnemonic to key with passphrase and index 0
      • mnemonic to key with passphrase and index max
    • toStandardECDSAsecp256k1PrivateKey
      • mnemonic to key with no passphrase and index 0
      • mnemonic to key with no passphrase and index 0'
      • mnemonic to key with passphrase and index 0
      • mnemonic to key with passphrase and index 0'
      • mnemonic to key with passphrase and index max
      • mnemonic to key with passphrase and index max'

Possible future functionality (not included in the current proposal)

  1. Functions that allow the user to explicitly define a full derivation path
    • The functions that do this are being deprecated
    • Since this is presumably not a commonly used feature, it doesn't need to be added urgently
  2. Public key derivation
    • inherently not possible with Ed25519
    • should eventually be implemented for ECDSA

Questions

  1. When is PrivateKey.legacyDerive(index) actually needed? Is this just for one of the legacy versions, or both?
    • this needs to be figured out and documented
  2. Can we remove the isLegacy flag from Mnemonic?
    • There currently exists some functionality to construct a mnemonic with an isLegacy flag, or to infer isLegacy based on word count
    • I personally don't like this at all. Not all SDKs do this the same way, and isLegacy actually seems to mean "is legacy v1", since legacy v2 mnemonics are notably not isLegacy
    • IMO we should rip this confusing bandaid off now, and just require the wallet implementor to explicitly call toLegacyPrivateKey, rather than inconsistently and imperfectly inferring isLegacy
  3. Should we specify the curve in the ECDSA function names? (see below for details)
litt3 commented 1 year ago

@mehcode Yes, JS not allowing function overrides makes this harder. I wasn't aware of that limitation.

Proposal v2

  1. Implement new functions toStandard[Ed25519|Ecdsa]PrivateKey(passphrase="", index=0), such that the returned key is at derivation path m/44'/3030'/0'/0'/index' for Ed25519 keys, and m/44'/3030'/0'/0/index for ECDSA keys.
    • these functions should be what developers use moving forward
    • note that the Ed25519 version of this function automatically hardens all indices. passing in a pre-hardened index should cause an error
    • we can haggle over the naming of this new function
    • since this is an entirely new function, we can have a default index :)
  2. Deprecate the existing functions to[Ed25519|Ecdsa]PrivateKey(passphrase = "", path = ...)
    • do not change the implementation at all, so that existing usages continue working

Meets criteria

  1. Existing code will work exactly like before
  2. Methods will exist that allow recovering wallets that were incorrectly generated
  3. Developers moving forward will no longer be able to accidentally retrieve keys at non-standard derivation paths

Downsides

Other considerations

Thoughts?

ochikov commented 1 year ago

@alittley I've reviewed your proposition. We were discussing at the issue as well that JS does not allow overrides. Some notes here:

litt3 commented 1 year ago

@ochikov Concerning explicit paths: toStandard[Ed25519|Ecdsa]PrivateKey(passphrase="", index=0) cannot support them as written above, since it is only accepting a single index. A full fledged derivation path would have to accept a path

Now, concerning legacy key derivation: there are subtle differences in how the SDKs handle this currently. Rather than try to detail all the differences, I'm just going to start a proposal for how it can be standardized.

  1. Function mnemonic.toLegacyPrivateKey() returns an Ed25519 key
    • this function should choose the type of legacy derivation based on mnemonic word count / composition
      • 22 words from the non-standard list means legacy derivation v1
      • 24 words from the standard list means legacy derivation v2
    • There currently exists some functionality to construct a mnemonic with an isLegacy flag, or to infer isLegacy based on word count
      • I don't like this at all. Not all SDKs do this the same way, and isLegacy actually means "is legacy v1", since legacy v2 mnemonics are notably not isLegacy
      • IMO we should rip this bandaid off now, and just require the wallet implementor to explicitly call toLegacyPrivateKey, rather than inconsistently and imperfectly inferring isLegacy. I am in favor of deprecating all functions that accept an isLegacy argument, and removing the isLegacy member from mnemonic classes
  2. Function PrivateKey.legacyDerive(index) returns the child key at index
    • This is only valid for Ed25519 keys
    • TBH I have no idea the when exactly this is supposed to be used. Is the existing legacyDerive function valid for v1 legacy, v2 legacy, or both? It's not documented as far as I can tell.

Some other things to do with legacy keys that need addressing:

litt3 commented 1 year ago

Before this goes into effect, we will need to resolve the ECDSA naming discussion. Should we specify the secp256k1 curve in newly created functions, or stick with the existing ambiguous naming?

Argument for excluding the curve from new function names:

Argument for including curve in new function names:

petreze commented 1 year ago

@alittley During the implementation of the test vectors for SLIP-10, BIP-39 and BIP-32, we've discovered that we do not support the generation of the xpub and xprv keys menitoned here inside the BIP-32 specs. As far as I understand, BIP-32 is basically the standard upon which we derive ECDSA child keys for HD(Hierarchical Deterministic) wallets based on the BIP-39 standard? Can you verify if this is true?

litt3 commented 1 year ago

I hadn't noticed that the BIP-32 test vectors relied on an encoding we don't have 🤔 I think it's ok for now that the BIP-32 test vectors aren't implemented, since SLIP10 provides alternative vectors for Ed25519 and ECDSA_secp256k1, which don't rely on that extended key encoding.

ochikov commented 1 year ago

@petreze @alittley Guys, do you think that we can start implementing the refactoring soon? What about the curve into the name of the functions?

litt3 commented 1 year ago

@ochikov I see the following things that must be completed:

  1. Resolution of the 3 questions listed in the issue body
  2. Acknowledgement from someone familiar with each SDK that the proposal is sensible, and that there are no outstanding concerns (perhaps we already have this?)
  3. Definition of additional common test vectors
    • I edited the issue body to explicitly call out which additional vectors I think are necessary. These are needed, since the BIP and SLIP test vectors don't take our specific derivation path into account
SimiHunjan commented 1 year ago

1/10/2023

1/23/2023 Majority vote: ECDSAsecp256k1 for naming convention

ochikov commented 1 year ago

Answer to the questions from the first post:

  1. PrivateKey.legacyDerived is used for both v1 and v2 legacy derivation. There are tests in all of the SDKs (Java, JS, Go) that use the method. In Java they are:

    • MnemonicTest.thirdMnemonicTest: creates a 24-word mnemonic; creates legacy private key (it is v2 because the mnemonic has 24 words); derives 2 keys using legacyDerive()
    • MnemonicTest.legacyMnemonicTest: creates a 22-word mnemonic; creates legacy private key (it is v1 because the mnemonic has 22 words); derives 2 keys using legacyDerive()
    • MnemonicTest.myHbarWallerV1Test: same as legacyMnemonicTest but uses different derivation index
  2. I consider that isLegacy can be removed

Additional questions:

  1. Should we change ECDSA to ECDSAsecp256k1 only in the new function toStandardECDSAsecp256k1PrivateKey or change all instances of ECDSA(methods and classes)?
  2. Do we need to create a toHardenedIndex() method or do we expect the users to harden the indexes themselves if they need to? toStandardECDSAsecp256k1PrivateKey("", 2147483648) vs toStandardECDSAsecp256k1PrivateKey("", toHardenedIndex(0) . If it is needed, should it be in the PrivateKeyECDSA, the Mnemonic or a Utils class?
  3. Should we change Mnemonic.toLegacyPrivateKey? In Java it uses the number of words to determine if the private key should be v1 or v2. In JS it uses the isLegacy flag. Also do we need to change the implementation or create a new method, toStandardLegacyPrivateKey, which extracts the private key from the mnemonic and then derives the correct path m/44'/3030'/0'/0'/index'?
ochikov commented 1 year ago

@alittley Did you have the chance to look at the above comment? We are almost ready with the implementation in JAVA and those questions come across during the implementation.

litt3 commented 1 year ago

@ochikov

  1. I would say we shouldn't change the names on functions that are being deprecated (so as to avoid breaking changes)
  2. We should provide a utility function for index hardening. If a bip32 or bip32 utils class exists, it would make sense to put it there, since "hardening" is a concept defined in BIP 32. Otherwise, a utils class seems appropriate. IMO this utility function does not need to be standardized across SDKs
  3. I think Mnemonic.toLegacyPrivateKey could (but shouldn't necessarily have to) be changed, to make the implementation more clear, or improve internal organization. Of course, care should be taken to not modify the external behavior of this function. I would not create a new toStandardLegacyPrivateKey method, though. Since the mnemonic -> key derivation is already non-standard, using the "correct" derivation path won't make the end result any more "standard."
dikel commented 1 year ago

@alittley Can you check the PR in the Java SDK? Do we have to change or add anything else?

dikel commented 1 year ago

@alittley We propose to remove the default parameter values from toStandard() methods because it is not possible to do it in Go.

We also provide test vectors for legacyV1/V2 and toStandard derivations:

Test vector for ED25519

Legacy derive v1:

Mnemonic: jolly kidnap tom lawn drunk chick optic lust mutter mole bride galley dense member sage neural widow decide curb aboard margin manure

Legacy derive v2:

Mnemonic: obvious favorite remain caution remove laptop base vacant increase video erase pass sniff sausage knock grid argue salt romance way alone fever slush dune

standard derive:

Mnemonic: inmate flip alley wear offer often piece magnet surge toddler submit right radio absent pear floor belt raven price stove replace reduce plate home

Test vector for ECDSAsecp256k1

standard derive:

Mnemonic: inmate flip alley wear offer often piece magnet surge toddler submit right radio absent pear floor belt raven price stove replace reduce plate home

rwalworth commented 1 year ago

Just a note, you have the 4th index in the derivation chain for the ECDSAsecp256k1 test vectors as hardened, when in fact they are not. I derived everything correctly as you have when using the m/44'/3030'/0'/0/index path.

dikel commented 1 year ago

@rwalworth Thank you for noticing. I've edited my comment

rwalworth commented 1 year ago

I came up with some additional 12 word mnemonic test vectors for ECDSAsecp256k1 and ED25519 private key derivation that should also be added to our internal "standard" test vectors. These test vectors are only for our standard derivation.

Test vector for ED25519

Mnemonic: finish furnace tomorrow wine mass goose festival air palm easy region guilt

Test vector for ECDSAsecp256k1

Mnemonic: finish furnace tomorrow wine mass goose festival air palm easy region guilt