monero-project / monero

Monero: the secure, private, untraceable cryptocurrency
https://getmonero.org

[PROPOSAL] Encode restore height as 26th word of the mnemonic seed #6639

Open dEBRUYNE-1 opened 4 years ago

dEBRUYNE-1 commented 4 years ago

The restore height is currently a value that has to be entered manually for wallets that are restored from either the keys or the mnemonic seed. The wallet will essentially ignore blocks (only pulling block hashes) before the restore height and start scanning (looking for transactions that belong to the wallet) from the restore height block.

User experience is degraded if the user accidentally sets a restore height that is too 'high' (i.e. after the first transaction to the wallet), as the wallet will 'miss' certain or all transactions, thereby causing an improper balance (as well as transaction history) to be displayed.

In order to improve user experience, we could encode an approximate restore height as an additional word of the mnemonic seed. The restore height would then be set automatically upon restoring the wallet, thereby ensuring users will not inadvertently set an erroneous restore height.

I personally do not see many drawbacks to this proposal. Guides will have to be updated to reflect the new format, and users will need to be informed. Initially, users may also be slightly confused by two different seed formats being present. However, I think the proposal is ultimately net beneficial to user experience.

fluffypony commented 4 years ago

My suggestion is to encode it as follows: the position of the word in the wordlist * 21915 = starting block height. 21915 blocks is about a month's worth of blocks (half a month when the block time was 1 minute), so it gives us a good 135 years worth of coverage.
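A minimal sketch of that mapping, assuming the 1626-word wordlist used elsewhere in this thread. The function names are hypothetical and not part of the Monero codebase; rounding down means a restored wallet starts scanning slightly early rather than too late.

```cpp
#include <cstdint>

// Figures quoted in the comment above; the names are illustrative only.
constexpr std::uint32_t kWordlistSize = 1626;    // words in one wordlist
constexpr std::uint64_t kBlocksPerStep = 21915;  // about one month of 2-minute blocks

// Map a wallet creation height to a word index, rounding down so the
// decoded restore height is never after the wallet's first transaction.
inline std::uint32_t height_to_word_index(std::uint64_t height) {
    const std::uint64_t index = height / kBlocksPerStep;
    // Clamp to the last word once the covered range is exhausted.
    return index < kWordlistSize ? static_cast<std::uint32_t>(index)
                                 : kWordlistSize - 1;
}

// Recover the approximate restore height from the word index.
inline std::uint64_t word_index_to_height(std::uint32_t index) {
    return static_cast<std::uint64_t>(index) * kBlocksPerStep;
}
```

For an arbitrary example height of 2118243, this picks word index 96, which decodes to height 2103840. With 1626 words at roughly one month each, the range is 1626 / 12 ≈ 135 years.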

SChernykh commented 4 years ago

This is too obscure, just use Jun/2020 or something like this instead of a 26th word in the seed.

rbrunner7 commented 4 years ago

As discussed on IRC, summarizing it here for broader publicity and discussion:

I am in full favor of adding one seed word to encode restore height.

But if we touch the seed system and add a "new" kind of seed encoding the restore height, I vote for taking the chance to add two more worthwhile changes at the same time. (Changing anything with seeds will be a larger endeavor, and IMHO it would be a strategic mistake to come back to this with "new new" seeds a year later or so.)

The "checksum" as implemented with the checksum word being simply a copy of one of the other words is very weak i.e. it does not catch a lot of errors. This can stay a single checksum word, but it should be calculated using a much more robust algorithm going over all words of the seed.

Furthermore, one more word should get added as the first word of the seed, encoding a seed version. The words used for the version should be different from all other seed words so you can reliably detect whether the first word given is such a version word or not.

This will enable a very robust UX. You can for example generate useful error messages if somebody enters only the first 25 words of a "new" seed for whatever reason, be it conviction that "more than 25 words are wrong", or input forms just not allowing for more words because not yet reworked / upgraded for "new" seeds.

Seed versions would also allow for adjustments in the word list, for whatever crazy reasons that may pop up, like some words becoming "politically incorrect", or more or less banned outright e.g. for Chinese seeds.

IMHO we should stick with words for both version and restore height encoding. Why? Because if it is anything else people will recognize it as something special and because of this some people may not treat them with the same care as the other words and e.g. simply not enter them, based on false assumptions like "I thought that's not part of the seed proper".

Maybe we should even go as far as avoiding that exactly the same word gets added as the version word to each and every Monero "new" seed, possibly for years, because again people might get confused about whether that seemingly constant word really belongs to the seed and is really necessary. People could also fear that Monero seeds are weaker than other coins' seeds because of a word being constant.

This could be solved by using only the first letter of the word as the version and e.g. randomly choosing from several words starting with that letter.

nim4 commented 4 years ago

Maybe instead of adding a new word we can use the case of the first letter of each word to encode the timestamp in days (upper case = 1, lower case = 0).

For example, using (unix timestamp / (60 * 60 * 24)):

general nomad tail jargon nodes lion scrub juicy palace puffin shipped rift vampire maze axes deity viewpoint timber textbook opened awesome gang object odds object

will be

general nomad tail jargon nodes lion scrub juicy palace puffin Shipped rift vampire maze Axes Deity Viewpoint Timber Textbook Opened Awesome Gang object odds object
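As a rough sketch of how such a seed could be decoded, the function below reads one bit per word, most significant bit first. The bit order and the function name are assumptions, not something stated in the comment above. Under that reading the capitalised example decodes to 18424 days since the Unix epoch, i.e. 11 June 2020.

```cpp
#include <cctype>
#include <cstdint>
#include <sstream>
#include <string>

// Read one bit per word (upper-case first letter = 1, lower-case = 0),
// most significant bit first, and return the encoded day count.
// Hypothetical helper; assumes at most 32 whitespace-separated words.
inline std::uint32_t decode_days_from_case(const std::string& seed) {
    std::istringstream words(seed);
    std::string word;
    std::uint32_t days = 0;
    while (words >> word) {
        const bool bit = std::isupper(static_cast<unsigned char>(word[0])) != 0;
        days = (days << 1) | (bit ? 1u : 0u);
    }
    return days;
}
```

A 25-word seed gives 25 bits of day count this way, which is far more range than needed, but the encoding is lost as soon as someone copies the seed in a single case.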

fluffypony commented 4 years ago

@nim4 clever idea, but having helped people who have inherited wallets from a deceased spouse you can bet that case sensitivity never factored into it.

fluffypony commented 4 years ago

Regarding the seed version, why do we want to pick a word that isn't on the wordlist? We could just pick a random word, and use the same offset in other wordlists, which means no additional translation work.

I've also tossed around the idea of using a single word for both the version and the block height offset chunk. We could, for instance, use the first 3 bits for the version and last 7 bits for the offset (128 possible offsets, so maybe group it per year). Alternatively, if we really want to eke as much out of it as possible, we could divide the wordlist into 5 groups (so a maximum of 5 different versions for this format), and then use the offset in each group, which would give us 325 words per group, so each offset would be ~3.5 months.
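Both layouts can be sketched as a mapping between a (version, offset) pair and a single word index. The struct and function names below are hypothetical, and the 1626-word list size is taken from elsewhere in this thread.

```cpp
#include <cstdint>

// Illustrative pair carried by the single extra word.
struct VersionAndOffset {
    std::uint32_t version;
    std::uint32_t offset;
};

// Layout A: 3-bit version in the high bits, 7-bit offset in the low bits.
// The largest index is 1023, so only part of a 1626-word list is used.
inline std::uint32_t pack_bits(VersionAndOffset v) {
    return ((v.version & 0x7u) << 7) | (v.offset & 0x7Fu);
}
inline VersionAndOffset unpack_bits(std::uint32_t index) {
    return VersionAndOffset{index >> 7, index & 0x7Fu};
}

// Layout B: 5 groups of 325 words; the group number is the version.
constexpr std::uint32_t kGroupSize = 325;
inline std::uint32_t pack_groups(VersionAndOffset v) {
    return v.version * kGroupSize + v.offset;  // caller keeps version < 5, offset < 325
}
inline VersionAndOffset unpack_groups(std::uint32_t index) {
    return VersionAndOffset{index / kGroupSize, index % kGroupSize};
}
```

Layout A leaves about 600 words unused but keeps the two fields on bit boundaries. Layout B uses 1625 of the 1626 words at the cost of a division when decoding.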

SChernykh commented 4 years ago

Another idea would be to use 27 word seed. 1626^27 ~ 1.008 * 2^288, so we have 256+32=288 bits of storage there. Additional 32 bits could be used for 16-bit checksum (CRC-16 or similar), and 16-bit restore height with 5000 blocks (1 week) precision.
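The arithmetic can be checked with a short sketch. The layout of the 32 spare bits (checksum in the high half, restore height in units of 5000 blocks in the low half) is an assumption made for illustration, as are the function names.

```cpp
#include <cmath>
#include <cstdint>

// Bits of information carried by `words` words drawn from a 1626-word list.
inline double seed_capacity_bits(int words) {
    return words * std::log2(1626.0);
}

constexpr std::uint64_t kBlocksPerUnit = 5000;  // about one week of 2-minute blocks

// Hypothetical layout: 16-bit checksum in the high half, 16-bit restore
// height (in 5000-block units, rounded down and clamped) in the low half.
inline std::uint32_t pack_extra_bits(std::uint16_t checksum, std::uint64_t height) {
    const std::uint64_t units = height / kBlocksPerUnit;
    const std::uint32_t clamped =
        units > 0xFFFFu ? 0xFFFFu : static_cast<std::uint32_t>(units);
    return (static_cast<std::uint32_t>(checksum) << 16) | clamped;
}

// Recover the approximate restore height from the packed value.
inline std::uint64_t unpack_height(std::uint32_t extra) {
    return static_cast<std::uint64_t>(extra & 0xFFFFu) * kBlocksPerUnit;
}
```

27 words carry about 288.01 bits, so 256 key bits plus 32 extra bits just fit. A 16-bit height field at 5000 blocks per unit covers roughly 327 million blocks.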

knaccc commented 4 years ago
  1. How much of this UX improvement could come from simply asking a wallet to scan backwards instead of forwards?

  2. More of a stray thought than a proposal: the first 24 words of the base 1626 seed encode 256.01 bits of information, but a seed only needs to be 252 bits. So we have 3 bits extra there. It's fast and easy to brute force seed selection such that the seed mod (135*12) = restore block height / 21915. Since we have 3 bits extra already, this brute force only loses us 8 bits of entropy on the seed. It's already questionable as to how important it is for Monero to have a 256-bit seed instead of a hashed 128-bit seed.

rbrunner7 commented 4 years ago

Regarding the seed version, why do we want to pick a word that isn't on the wordlist?

Because there are many advantages to being able to reliably recognize the word as a version word, or conversely to see for sure that the first given word is not a version. This makes it possible to detect all kinds of confusion, wrongly entered seeds, cut-off seeds etc.

I think especially with something as critical and sensitive as seeds we want our UX (and the transition from "old" seeds to "new" seeds) to be as robust as possible.

sumogr commented 4 years ago

How on earth will the CLI know the top height if I just want to generate a cold wallet without a daemon running? The above discussion requires a CLI wallet that is already connected to a fully synced daemon (maybe get the date from the system's timestamp? Wouldn't that be dangerous?)

rbrunner7 commented 4 years ago

How on earth will the CLI know the top height if I just want to generate a cold wallet without a daemon running? The above discussion requires a CLI wallet that is already connected to a fully synced daemon (maybe get the date from the system's timestamp? Wouldn't that be dangerous?)

You are right, I forgot to mention this from the IRC discussion: there are various situations where the restore height is not known. Besides your cold-wallet example, programs generating random seeds offline come to mind.

0 must therefore be a valid value for the encoded restore height, with a meaning of "restore height unknown". This can then be used e.g. to prompt for the restore height when restoring.

trasherdk commented 4 years ago

Does the restore height have to be part of the check-summed seed? Couldn't it just be a 32-bit hexadecimal number appended as the 26th word? If it's there, it's the restore height. If not, ask.

fluffypony commented 4 years ago

@trasherdk a single word is only 10 bits of entropy, so can't encode the actual restore height, but yes - this proposal is about adding an additional word for the restore height, plus a 27th word for versioning.

trasherdk commented 4 years ago

The 25 word seed is pretty much set in stone for all eternity, unless you are willing to abandon all those paper wallets out there, hidden in mattresses or something. Right?

fluffypony commented 4 years ago

@trasherdk I don't understand how this affects paper wallets? It's not like the old seed format would no longer be supported, there'd just be a new, default seed format. We already did this with the old English and new English wordlists, the old English wordlist still exists and you can restore an old paper wallet any time you want.

rbrunner7 commented 4 years ago

The 25 word seed is pretty much set in stone for all eternity, unless you are willing to abandon all those paper wallets out there, hidden in mattresses or something. Right?

Yes. There will be "new" seeds and "old" seeds with us forever. That's one reason why I am so vocal in favor of a system that is able to distinguish in a crystal-clear way between both sorts.

The first 25 words of a "new" seed should better not be a valid "old" seed for a system that, for whatever reason, never learned about "new" seeds. New version words outside the current word lists would nicely take care of this, because they make "new" seeds flat-out invalid for an "old" system. You won't be able to do something that only looks like a correct restore with the first 25 words of a "new" seed on an old system.

fluffypony commented 4 years ago

@rbrunner7 it's already 2 words longer than the "old" seeds, so I don't think we need to worry about validity. Also if we move the checksum to the end, and make it a checksum valid for the whole of the new seed (and not just the key portion), then it'll fail checksum validation on an older wallet anyway.

I would like to keep the discussion going around versioning, as I've not yet heard an argument for an out-of-band word that makes sense to me, or even an argument for putting the version in an entire word instead of using the extra bits we gain from adding 1 word for both versioning AND initial block offset chunk.

trasherdk commented 4 years ago

Okay, so far. Is there any reason the 26th word can't be 00205263 for height 2118243?
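For reference, the hexadecimal form in that example is just the height printed as eight zero-padded hex digits. A minimal sketch, with hypothetical helper names:

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Format a block height as an 8-digit, zero-padded, lower-case hex string.
inline std::string height_to_hex_word(std::uint32_t height) {
    char buf[9];  // 8 hex digits plus the terminating NUL
    std::snprintf(buf, sizeof buf, "%08x", static_cast<unsigned>(height));
    return std::string(buf);
}

// Parse the hex string back into a block height.
inline std::uint32_t hex_word_to_height(const std::string& word) {
    return static_cast<std::uint32_t>(std::stoul(word, nullptr, 16));
}
```

Here `height_to_hex_word(2118243)` yields `00205263`, matching the value in the comment, and parsing it back returns 2118243.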

knaccc commented 4 years ago

Okay, so far. Is there any reason the 26th word can't be 00205263 for height 2118243?

Excellent point. Or as @asymptotically508 wrote on reddit, "I just write the date on the same paper as the seed.".

rbrunner7 commented 4 years ago

I would like to keep the discussion going around versioning

Fair enough.

Just for completeness' sake: the GUI wallet currently does not insist on the 25th / checksum word, it also accepts the "naked" 24 words. Not sure about the CLI wallet.

rbrunner7 commented 4 years ago

Excellent point. Or as @asymptotically508 wrote on reddit, "I just write the date on the same paper as the seed.".

Sure, but this assumes that people know about restore heights and their importance in the first place. Count the people on the Monero subreddit that don't and e.g. fail to correctly restore a wallet. (If they knew, and just did not know the correct restore height, they could easily go back far enough to be safe. It seems they often don't.)

Which is an important part of the motivation to touch the seed system and integrate the restore height, to do away with such problems as best as possible.

knaccc commented 4 years ago

@rbrunner7 I agree, writing the date down and getting it wrong later can create problems.

I'll therefore revert to proposing the much more foolproof solution of changing nothing with the seed and just making the wallet scan from the current block backwards.

rbrunner7 commented 4 years ago

just making the wallet scan from the current block backwards.

Maybe I'm stupidly overlooking something, but I have no idea how you would know when to stop scanning. How can you be sure my first transaction is not in block #1?

knaccc commented 4 years ago

just making the wallet scan from the current block backwards.

Maybe I'm stupidly overlooking something, but I have no idea how you would know when to stop scanning. How can you be sure my first transaction is not in block #1?

What does it matter whether the wallet still has to scan the entire blockchain? If this is about UX, all that matters is that we show people what looks like their balance as quickly as possible.

If Monero has an Eternal September then this solves the waiting problem for most.

rbrunner7 commented 4 years ago

What does it matter whether the wallet still has to scan the entire blockchain? If this is about UX, all that matters is that we show people what looks like their balance as quickly as possible.

Interesting approach which I might be able to agree with, if it were not for the weak checksum problem and the advantages that some sort of versioning brings as additional arguments to improve seeds.

fluffypony commented 4 years ago

Okay, so far. Is there any reason the 26th word can't be 00205263 for height 2118243?

Yes, that's not a word, and can't be encoded into many physical wallets (eg. Cryptosteel).

knaccc commented 4 years ago

Interesting approach which I might be able to agree with, if it were not for the weak checksum problem and the advantages that some sort of versioning brings as additional arguments to improve seeds.

I agree your proposal is better, if we were starting from scratch. I just don't think that due appreciation has been given to the confusion that will be caused when all of the documentation and tutorials and paper wallets suddenly have to start talking about 25 vs 26 word seeds.

rbrunner7 commented 4 years ago

I just don't think that due appreciation has been given to the confusion

A difficult assessment for sure. I hope for many people voicing their opinions here and on the Monero subreddit. I think Monero might have it easier here than many other coins because users were subjected to frequent changes anyway so far, with all our hardforks ...

sumogr commented 4 years ago

Humbly and just to give my two pennies worth

void simple_wallet::print_seed(const epee::wipeable_string &seed)
{
  // Print the wallet creation time next to the seed so the user can note it down.
  // Needs <chrono> and <ctime>; std::ctime() already ends its output with '\n'.
  auto timenow = std::chrono::system_clock::to_time_t(std::chrono::system_clock::now());
  success_msg_writer(true) << "\n" << "Seeds generated at: " << std::ctime(&timenow) << "\n";
  // Existing note, extended with a hint to use the date above when restoring.
  success_msg_writer(true) << "\n" << boost::format(tr("NOTE: the following %s can be used to recover access to your wallet. "
    "Write them down and store them somewhere safe and secure. Please do not store them in "
    "your email or on file storage services outside of your immediate control.\n"
    "When restoring from seeds please use the date above to avoid needlessly scanning the entire chain.\n")) % (m_wallet->multisig() ? tr("string") : tr("25 words"));

No extra word, no confusion. Monero already has too many seed words compared to BTC clones.

fluffypony commented 4 years ago

I don't buy the "let's not add extra words" story - 25, 26, or 27 words makes no difference to the end user. I also don't think that trying to force the user to write down a Unix timestamp is useful either, as that genuinely is an additional piece of out-of-band data that users will not always be able to write down (eg. if they use a CryptoSteel), nor can we communicate to them easily what "needlessly scanning the entire chain" actually means.

I would encourage people to have a non-technical friend try to use the Monero GUI, and you'll quickly see how frightening mnemonic seeds are already. If we can make them easier to use then that's a win. And to be sure, any complexity around figuring out what kind of seed it is will be abstracted away from the user, just like we don't ask them to specify the seed language before entering it. They just type in their seed, and the wallet will figure out everything else.

ghost commented 4 years ago

The restore height, if added, should also be made part of the checksum, since the goal is to improve the experience for new users: if the restore height is entered incorrectly and the user doesn't notice, they may end up without access to certain funds. Extending the seed will likely not be backwards compatible with all wallets anyway. Old mnemonic phrases can easily be supported in newer wallet versions. If someone generates a mnemonic seed in newer software, the likelihood of them later entering that seed in older software is low. Also, on the UI side, to avoid confusing the user, the software can simply ask for a "mnemonic seed" / "mnemonic phrase" without mentioning the number of words it contains. If deemed necessary, the UI could mention the different possible word counts in a tooltip: "NOTE: The mnemonic phrase is a list of 13, 24, 25 or 26 words".

On another note, we should use something like a digital pin board for the different approaches / proposals where people can add pros and cons to each so that there can be a debate that's simpler to follow.

rbrunner7 commented 4 years ago

The restore height if added should also be made part of the checksum

Fully agree. Plus the version if we come around to add one. In effect, checksum goes over everything.

fluffypony commented 4 years ago

The restore height if added should also be made part of the checksum

Fully agree. Plus the version if we come around to add one. In effect, checksum goes over everything.

Agreed, sarang had some ideas about that as well.

knaccc commented 4 years ago

I don't buy the "let's not add extra words" story - 25, 26, or 27 words makes no difference to the end user.

Here is an example of where it makes a difference:

One of the easiest ways to mess up is to miss out a word when writing down a seed.

Paper wallets are therefore very useful, because they provide exactly 25 boxes. Someone will notice immediately if they've missed out a word and not filled in all of the boxes.

Think of how many people will download a paper wallet that has not been updated to allow for 26 words, and accidentally skip a word because there are only 25 boxes.

Also think of the confusion when they have a 26 word seed and the paper wallet doesn't have enough boxes.

Note that I'm not making a stand against the proposal. I'm just saying that in my opinion, it doesn't seem like an earth-shatteringly great improvement to me, compared with just scanning backwards.

fluffypony commented 4 years ago

@knaccc paper wallets can be redesigned to add an extra box or two, surely? Seems like a pretty easy fix that would accompany a major change to the seed structure.

knaccc commented 4 years ago

@knaccc paper wallets can be redesigned to add an extra box or two, surely? Seems like a pretty easy fix that would accompany a major change to the seed structure.

I'm sure they can be. I'm just pessimistic about the percentage of paper wallet designs out there that people will be bothered to update.

fluffypony commented 4 years ago

Regarding scanning backwards: it's not terrible, but it does assume you've synced up first before you can start scanning. In a scan-forward scenario you could / should be able to sync up and scan at the same time.

fluffypony commented 4 years ago

I'm sure they can be. I'm just pessimistic about the percentage of paper wallet designs out there that people will be bothered to update.

That's a fair argument; besides the Monero.how paper wallet are there any other designs?

knaccc commented 4 years ago

but it does assume you've synced up first before you can start scanning

Are you suggesting it's acceptable from a security perspective to start downloading and verifying the blockchain mid-way, instead of from the beginning? At first glance, that does not seem wise.

Edit: I see now that's not what you're suggesting.

If you have not synced up the entire blockchain yet, there is nothing to stop you scanning backwards from wherever you are up to so far. In fact, that will be normal, because a new block may appear while you are backwards-scanning.

rbrunner7 commented 4 years ago

Also think of the confusion when they have a 26 word seed and the paper wallet doesn't have enough boxes.

I think perspective changes many things. I have already quipped a few times on the Monero subreddit that our descendants will see this and that, or remember this and that, on the 100th anniversary of the Monero genesis block in the year 2114. And this is not only a joke: Monero, if really successful, will stay for decades. So I propose to plan for this. And a very, very robust seed system is part of that. Any transition period will be a small blip in Monero's history.

knaccc commented 4 years ago

besides the Monero.how paper wallet are there any other designs?

Having searched, the only other popular one I could find is this one: https://www.themonera.art/2018/01/30/printable-monero-paper-wallet-pack-1/

Paper wallets may not be the biggest problem, then. It'd be more to do with the number of references on the web to "25 word seeds", and whether that would cause confusion or not.

I'm just being conservative by nature. I've seen how confused and worried people get about little things like "I'm scared to use the first 4... address in my wallet because I heard subaddresses provide more privacy".

selsta commented 4 years ago

We added the restore height a while back to the wallet creation screen though I can’t say how much this helped with support requests.

[Image: Screenshot 2020-06-11 at 17 14 12]

Also one thing to consider: Hardware wallets will always need a restore height because they don’t display a seed.

knaccc commented 4 years ago

We added the restore height a while back to the wallet creation screen though I can’t say how much this helped with support requests

I assume though that simply scanning backwards would have had the same effect.

selsta commented 4 years ago

I assume though that simply scanning backwards would have had the same effect.

Can you explain a bit more? Scanning backwards for how long?

Cactii1 commented 4 years ago

I assume though that simply scanning backwards would have had the same effect.

The wallet would still have to scan to block 0 as it doesn't know wallet creation date.

knaccc commented 4 years ago

Can you explain a bit more? Scanning backwards for how long?

You would always go backwards all the way back to zero. The point is though, you start to show people that things are working much more quickly, since most users will have txs near the end of the blockchain and not near the beginning. So you still go back all the way to zero, but that's now all in the background and all of the anxiety that their wallet balance may not appear will be gone.
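The scan order described here can be sketched as splitting the chain into batches and visiting the newest batch first. This is only an illustration of the ordering. The function name and the batching are assumptions, and it does not reflect how the Monero wallet actually fetches or scans blocks.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Split heights [0, tip] into inclusive (first, last) batches, newest first,
// so a wallet could show recent transactions before reaching height 0.
inline std::vector<std::pair<std::uint64_t, std::uint64_t>>
backward_batches(std::uint64_t tip, std::uint64_t batch_size) {
    std::vector<std::pair<std::uint64_t, std::uint64_t>> batches;
    if (batch_size == 0) return batches;  // nothing sensible to return
    std::uint64_t last = tip;
    while (true) {
        // Start of this batch, stopping at height 0 without underflow.
        const std::uint64_t first =
            last >= batch_size - 1 ? last - (batch_size - 1) : 0;
        batches.emplace_back(first, last);
        if (first == 0) break;
        last = first - 1;
    }
    return batches;
}
```

With a tip of 10 and a batch size of 4, the batches are (7, 10), (3, 6) and (0, 2). The total work is unchanged; only the order in which results become visible differs.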

rbrunner7 commented 4 years ago

all of the anxiety that their wallet balance may not appear will be gone.

That is a nice touch of this idea. One small downside: the balance may show as negative for part of the scan.

Note that we still can switch to scanning backwards, even with seeds that contain the proper blockheight.

tobtoht commented 4 years ago

3 issues on the topic of seeds:

We should look to decrease the number of seed words, not increase it. Does the average user even need 256 bits of security considering that the birthday problem is a complete nonissue if you factor in the time it takes to scan for outputs? It doesn't appear to be an issue on Bitcoin, and they have transparent balances.

Long seeds ruin the flow of disposable wallets with temporary paper backups because it's a pain to write the seed down every time. Long seeds also increase the likelihood of mistakes / missing words / bad ordering when copying.

Can we make the MyMonero-style seed the default for GUI users? Or at least have it as an option?

Scanning the blockchain from height 0 over Tor can take hours, if not up to a day. For many users, and for many different reasons, running a local node is not going to happen. To mitigate network-level metadata leakage, syncing the wallet over Tor should be a supported use case. Having the wallet sync backwards to block 0 will greatly decrease the practicality of this.

I like philogoly's idea to encode the restore height in the checksum word. Round it down to the nearest xK blocks or month, whatever fits. This way the concept of the restore height is abstracted away. Users no longer have to think about it. Wallets no longer have to explain the concept. Unnecessary syncing time is cut down by a substantial amount. Just enter your seed and you will always restore your wallet the same way every time.

sumogr commented 4 years ago

I don't buy the "let's not add extra words" story - 25, 26, or 27 words makes no difference to the end user. I also don't think that trying to force the user to write down a Unix timestamp is useful either, as that genuinely is an additional piece of out-of-band data that users will not always be able to write down (eg. if they use a CryptoSteel), nor can we communicate to them easily what "needlessly scanning the entire chain" actually means.

I would encourage people to have a non-technical friend try to use the Monero GUI, and you'll quickly see how frightening mnemonic seeds are already. If we can make them easier to use then that's a win. And to be sure, any complexity around figuring out what kind of seed it is will be abstracted away from the user, just like we don't ask them to specify the seed language before entering it. They just type in their seed, and the wallet will figure out everything else.

The advantage of Monero over Mimblewimble coins (whose tech I admire), while both serve the same purpose, is its comparative ease of use compared to the hell a Mimblewimble coin user has to go through to get acquainted with their CLI. Monero users have had to suffer many changes during the past few years. Restoring from seeds happens once or twice during a user's "lifetime"; a user who casually restores the same wallet is already acquainted with it and knows where to restore it from. There will be two kinds of legacy seeds after this is applied. I think it will not make a difference in user experience (for example, someone who knows how to set up Tor and run monerod over SOCKS surely knows which height to restore their wallet from to avoid waiting for hours). Anyway :P

fluffypony commented 4 years ago

@sumogr I manned the MyMonero support email for a few years. Users restore more often than you think, and struggle way more than you'd expect. Most users do NOT know what they're doing.