zcash / zips

Zcash Improvement Proposals
https://zips.z.cash
MIT License

[ZIP 332] Wallet Recovery from zcashd HD Seeds #675

Open nuttycom opened 1 year ago

nuttycom commented 1 year ago

When Sapling was released, zcashd implemented HD derivation of Sapling addresses in a fashion that was inconsistent with HD derivation according to BIP 44. In version 4.7.0 zcashd introduced HD derivation from a mnemonic seed according to BIP 32 and BIP 44, with a nonstandard accommodation in the generation of the mnemonic seed to make it possible to also reproduce previously derived Sapling keys. This accommodation needs to be documented, along with the process for correct discovery of such previously-derived Sapling keys.

In addition, in order to allow zcashd's legacy APIs such as getnewaddress and z_getnewaddress to continue to function, zcashd introduced the ZCASH_LEGACY_ACCOUNT constant for use in address derivation consistent with the previous semantics of those methods. Derivation of keys under ZCASH_LEGACY_ACCOUNT is also nonstandard with respect to BIP 32 and BIP 44, and so needs to be properly documented here to make it possible for other wallet implementations to correctly rediscover funds controlled by keys derived using this mechanism.
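As a rough illustration of what a recovering wallet has to account for, the sketch below builds account-level path prefixes for both ordinary accounts and the legacy account. It assumes that ZCASH_LEGACY_ACCOUNT is the largest hardened account index (2^31 - 1), that the coin type is 133 (Zcash mainnet under SLIP 44), and that purposes 32 and 44 are used for shielded and transparent keys respectively. The helper names `accounts_to_scan` and `account_path` are hypothetical, and the nonstandard derivation zcashd performs below the account level is deliberately not modeled here.

```python
# Sketch only: which account indices to scan, and what the account-level
# path prefix looks like. Values marked "assumed" are not taken from this issue.

HARDENED_KEY_LIMIT = 0x80000000                 # BIP 32 hardened-child offset
ZCASH_LEGACY_ACCOUNT = HARDENED_KEY_LIMIT - 1   # assumed: maximum account index

ZCASH_MAINNET_COIN_TYPE = 133                   # assumed: SLIP 44 coin type


def accounts_to_scan(ordinary_accounts: int = 1) -> list[int]:
    """Return the ordinary account indices followed by the legacy account."""
    if ordinary_accounts < 0:
        raise ValueError("ordinary_accounts must be non-negative")
    return list(range(ordinary_accounts)) + [ZCASH_LEGACY_ACCOUNT]


def account_path(purpose: int, coin_type: int, account: int) -> str:
    """Format a hardened account-level path such as m/32'/133'/0'."""
    for value in (purpose, coin_type, account):
        if not 0 <= value < HARDENED_KEY_LIMIT:
            raise ValueError("index out of range for a hardened child")
    return f"m/{purpose}'/{coin_type}'/{account}'"


if __name__ == "__main__":
    for account in accounts_to_scan(ordinary_accounts=1):
        # Purpose 32 for shielded keys, 44 for transparent keys (assumed).
        print(account_path(32, ZCASH_MAINNET_COIN_TYPE, account))
        print(account_path(44, ZCASH_MAINNET_COIN_TYPE, account))
```

The point of the sketch is only that a wallet scanning account 0 upward will never reach the legacy account by incrementing, so it has to be checked explicitly.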

daira commented 1 year ago

Assigned ZIP number 332, and added a stub.

zancas commented 1 year ago

Can we add a URL linking to any tests that check which addresses are generated from which seeds in various regimes?

Also especially useful would be test vectors that can be referenced across projects.

nuttycom commented 2 months ago

A thread that might serve as an outline here:

[1:14 PM] nuttycom: Zingo will not correctly restore Sapling funds from zcashd wallets created prior to version 4.7.0; prior to that, the HD seeds were raw binary seeds and not BIP 39 seed phrases.

[1:15 PM] nuttycom: And the seed phrases cannot be used to recover transparent or Sprout key material from before that release, because those keys were derived directly from system randomness and not from the Sapling HD seed (this was the behavior inherited from Bitcoin.)

[1:18 PM] nuttycom: Any new accounts or addresses created since 4.7.0 should be recoverable from the mnemonic seed, but for addresses derived using the old Bitcoin getnewaddress API they exist under the maximum possible account index of the ZIP 32 derivation tree. This is also true of Sapling addresses derived using z_getnewaddress starting from 4.7.0.

[1:22 PM] nuttycom: This nightmare-fuel scenario exists because when we added derivation from the mnemonic seed, we couldn't break the contract of the existing API methods: getnewaddress treats all transparent addresses in the wallet as referring to a single pool of funds, but z_getnewaddress treated each distinct Sapling address as having its own separate pool of funds. So neither of those could use the default ZIP 32 account 0 derivation path, so we stuck those under the maximum account index.

[1:23 PM] nuttycom: The alternative would have been to disable getnewaddress and z_getnewaddress entirely, but that would have broken a bunch of existing users.

[1:31 PM] nuttycom: Also, because we wanted to ensure that Sapling wallets that went through the v4.7.0 upgrade could restore pre-v4.7.0 addresses, we had to construct the mnemonic seed for such wallets directly from the existing HD seed. Now, BIP 39 is annoying in that instead of being a direct encoding of the bytes of the seed, the seed is produced as a hash of the seed phrase (https://bips.xyz/39#user-content-from-mnemonic-to-seed) so there was no way to produce a mnemonic phrase that hashed to the existing HD seed. So, such wallets effectively have two seeds: the mnemonic phrase is produced by directly encoding the bytes of the seed using the BIP 39 wordlist. Reversing that encoding gets the legacy seed. Then, for all addresses generated since the v4.7.0 upgrade, the seed is produced via ordinary BIP 39 derivation from that mnemonic, and these addresses can be recovered by BIP-39 supporting wallets, with the caveat that they also need to check under the maximum ZIP 32 account ID to find the addresses that were produced by the legacy APIs I mentioned earlier.
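The "two seeds" relationship described in the thread can be sketched in Python using only the standard library. This is an illustration under assumptions, not zcashd's actual code: the function names are hypothetical, the BIP 39 wordlist is left out (words are represented by their 11-bit indices), and the assumption is that zcashd's encoding of the legacy seed matches the standard BIP 39 entropy-to-mnemonic procedure.

```python
import hashlib
import unicodedata


def entropy_to_indices(entropy: bytes) -> list[int]:
    """Encode raw seed bytes as BIP 39 word indices (entropy plus checksum)."""
    ent_bits = len(entropy) * 8
    if ent_bits % 32 != 0 or not 128 <= ent_bits <= 256:
        raise ValueError("entropy must be 16, 20, 24, 28 or 32 bytes")
    cs_bits = ent_bits // 32
    # Checksum is the first cs_bits bits of SHA-256(entropy).
    checksum = hashlib.sha256(entropy).digest()[0] >> (8 - cs_bits)
    value = (int.from_bytes(entropy, "big") << cs_bits) | checksum
    n_words = (ent_bits + cs_bits) // 11
    return [(value >> (11 * (n_words - 1 - i))) & 0x7FF for i in range(n_words)]


def indices_to_entropy(indices: list[int]) -> bytes:
    """Reverse the encoding: recover the legacy seed bytes from word indices."""
    if len(indices) not in (12, 15, 18, 21, 24):
        raise ValueError("unsupported mnemonic length")
    total_bits = len(indices) * 11
    cs_bits = total_bits // 33
    value = 0
    for index in indices:
        value = (value << 11) | index
    entropy = (value >> cs_bits).to_bytes((total_bits - cs_bits) // 8, "big")
    if entropy_to_indices(entropy) != list(indices):
        raise ValueError("checksum mismatch")
    return entropy


def bip39_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Ordinary BIP 39 seed: PBKDF2-HMAC-SHA512 over the phrase, 2048 rounds."""
    phrase = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", phrase, salt, 2048, 64)


if __name__ == "__main__":
    legacy_seed = bytes(32)                      # placeholder, not a real seed
    indices = entropy_to_indices(legacy_seed)    # 24 word indices
    assert indices_to_entropy(indices) == legacy_seed
    print(len(indices), "words encode the legacy seed")
```

Under these assumptions, a recovering wallet would use `indices_to_entropy` to obtain the legacy seed for pre-v4.7.0 Sapling keys, and `bip39_seed` on the same phrase to obtain the seed for everything derived after the upgrade.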