toncenter / tonweb-mnemonic

Mnemonic code for generating deterministic keys for TON (The Open Network)
GNU General Public License v3.0
49 stars 19 forks source link

Very bad practice with `pbkdf2Sha512` salt #13

Open cryshado opened 2 years ago

cryshado commented 2 years ago

What has been noticed

I noticed that the tonweb source code uses a specific salt in the mnemonicToSeed function. The salt (which for some reason is named seed) is defined as 'TON default seed'. Other wallets, applications, and services set the same salt value in pbkdf2Sha512 to maintain compatibility.

The main problem and motivation

In the case that any of the wallet applications close and/or this repository becomes unavailable, users do not have a full understanding of how to recover private keys from the mnemonic phrase.

Problem solution

Only changing the salt value in this repository can fix the problem, since this repository is the basis for future applications. I suggest using "mnemonic" + passphrase, as a salt, as many popular crypto wallets do for other coins. "mnemonic" it's just a string. It is also possible to suggest that the user set passphrase to create additional "hidden" wallets to his mnemonic phrase.

cryshado commented 2 years ago

I also suggest documenting this point very well and recommending that wallets provide instructions on how to generate private keys to their users

ex3ndr commented 2 years ago

No problem stated: this repo is not going anywhere soon and not going to change it’s code by obvious reasons.

instead of documentation improvements proposed to change code. Why?

cryshado commented 2 years ago

No problem stated: this repo is not going anywhere soon and not going to change it’s code by obvious reasons.

instead of documentation improvements proposed to change code. Why?

At least using salt with non-printing characters is not a good example. It is suggested that we follow the example of other developers who set the salt to "" or "mnemonic" + passphrase. Users without specific documentation will be able to find how others do and learn about Ed25519 from the TON documentation, e.g. fiftbase.pdf

ex3ndr commented 2 years ago

How it is different from putting documentation on ton.org? Why change the code?

cryshado commented 2 years ago

How it is different from putting documentation on ton.org? Why change the code?

Please read my posts above more attentively. If a particular resource is not available, it can be problematic

ex3ndr commented 2 years ago

ton.org? And all forks of all wallets are unavailable? I suppose ton is dead then.

aagolovanov commented 2 years ago

Agree with issue author. In my subjective opinion, all potential problems with cryptographic algorithms have to be solved in the bud, because this may cause problems with cryptography in future applications. The algorithm in the current repository must be replaced by the correct one.

ex3ndr commented 2 years ago

What’s not correct?

tolya-yanot commented 2 years ago

'TON default seed' was originally used as a salt in tonlib and is used in all standard native wallets.

TonWeb uses the same salt.

For the password option, we can use 'TON default seed' + passphrase.

So you offer just another string ('mnemonic' instead of 'TON default seed'). It doesn't increase security.

We can’t just change the salt, because a huge number of users are already using wallets.

All source codes are open and anyone can see how the mnemonic is generated.

On ton.org/docs we really should add the mnemonic algorithm and the salt constant.

If you just like a different constant, then we won't change code. If I missed something, please reply

cryshado commented 2 years ago

@tolya-yanot thanks for the reply in this issue

For first most likely tonlib used a set of test values, and this does not mean that tonlib is the truest source. What I am talking about is defined in bip-0039 and is often inherited by developers in other blockchains (not bitcoin). This brings some transparency for developers and users, they can at least guess how their key pair is formed. If we want to change some default values, we can suggest that wallets continue to support the old salt and e.t.c. values and use new similar bip-0039.

Even if there is no decision to change the default values to the new ones, I am very glad that you think it is a good decision to add information about the key pair generation algorithms to the official documentation.

But I have some another question about this case. How can you explain export const PBKDF_ITERATIONS = 100000;? For example, the standard bip-0039 suggests that we use 2048 iterations. Not to mention the fact that the mnemonic itself has a huge entropy, in combination with PBKDF2 it is really resistant to bruteforcing with 2048 iterations. The value of 100,000 iterations looks like a real test, and has no justification, but only makes the process of generating a key pair more costly in terms of resources.

AntonMeep commented 2 years ago

I agree that there are quite a few questionable design choices regarding mnemonic generation and processing, namely:

  1. As mentioned in this issue, same salt everywhere. I disagree with the problem being in understanding of how private keys are being generated, I think main issue is missing out on what salt helps us achieve in cryptography, quote:

    Salts create unique passwords even in the instance of two users choosing the same passwords. Salts help us mitigate hash table attacks by forcing attackers to re-compute them using the salts for each user.

While it is obvious that there's no way one can guess full 24-words of the mnemonic sentence, this means that we are not utilizing salt to further strenghten security

  1. An obscene number of PBKDF2 iterations As already mentioned, having 100k iterations for entropy as huge as the one produced by the mnemonic phrase is extremely excessive and doesn't seem to be justified.

  2. So-called "basic" and "password" seeds This apparently was introduced in order to let users protect their mnemonic phrases with a password, even though that's what salt could be used for. Instead, what we have now is reduced entropy of a mnemonic phrase, since not all of them will resolve to a basic or password seed, and are deemed "invalid". Unfortunately, I do not posses the knowledge nor expertise to calculate how big the impact of this is, but in cryptography it is generally not advised to have unclear and obscure decisions such as this one. Furthermore, this unncessarily increases time that is needed to generate a mnemonic phrase since we have to generate one after another until we find one that resolves in a basic or a password seed.

My proposed solution would be as follows:

  1. Remove basic and password seed check, make all mnemonic phrases valid.
  2. Introduce changes proposed by @cryshado alongside with lowering the number of PBKDF2 iterations; to preserve backwards compatibility, similarly how we check for wallet versions, we will check if legacy salt and number of iterations would result in an already active wallet address.
EmelyanenkoK commented 2 years ago

Using word salt instead of seed in this discussion is misleading. Both uniqueness of the seed (using 'TON default seed' instead of 'mnemonic') and high number of PBKDF2 iterations serve the same purpose of making brute-force a hard problem. Mnemonic which consists of 24 words from set of 2047 words gives 24*11=264 bits of entropy (note that since keys are 256bit long additional 8bit are somewhat useless from the randomness stand of the point). isBasicSeed and isPasswordSeed fixes the first byte of the seed and thus take 8 bits from those 264. That way generation of the seed becomes 256 times slower, we lose nothing from the randomness stand of the point, but get more error-prone mnemonics: password-less and password mnemonics do not mix, and we can show or hide password input field for user automatically.

Changing mnemonic generation algorithm is a serious and cumbersome process. It can be justified in case of security issue, but not form aesthetic reason.

ilyar commented 1 year ago

I would like to respectfully point out that the implementation of the mnemonic generation process seems to have been done incorrectly. The current approach appears to involve the random selection of 24 words from a dictionary, which are then used to create a seed. (source) However, the correct method would be to first generate entropy and then derive the words based on this entropy. The seed should also be obtained using the generated entropy.

image

immune detail fiscal glimpse grunt vicious car dust swap panther motion undo relax that iron into offer industry empower phrase object poverty junk output

Use the BIP39 standard for recover:

{
  bitOfWord: 11,
  wordCount: 24,
  bitChecksumCount: 8,
  bitEntropyCount: 256,
  byteCount: 32,
  entropy: Uint8Array(32) [
    113, 167, 137,  94,  49, 150, 115, 231,
      8, 146,  34, 219,  83, 250,  65, 118,
    123,  81, 191, 221, 147, 174, 153, 110,
    101,  36,  81, 233, 129,  82,  30,  84
  ],
  hexEntropy: '71a7895e319673e7089222db53fa41767b51bfdd93ae996e652451e981521e54',
  hexChecksum: '6f',
  bitEntropy: '011100011010011110001001010111100011000110010110011100111110011100001000100100100010001011011011010100111111101001000001011101100111101101010001101111111101110110010011101011101001100101101110011001010010010001010001111010011000000101010010000111100101010001101111',
  bitChecksum: '01101111',
  wordIndexList: [
     909,  482,  700,  793,  825,
    1948,  274,  546, 1754, 1278,
    1154, 1895, 1448, 1791,  946,
     942, 1227,  921,  584, 1310,
    1216, 1352,  970, 1135
  ],
  wordList: [
    'immune',  'detail', 'fiscal',
    'glimpse', 'grunt',  'vicious',
    'car',     'dust',   'swap',
    'panther', 'motion', 'undo',
    'relax',   'that',   'iron',
    'into',    'offer',  'industry',
    'empower', 'phrase', 'object',
    'poverty', 'junk',   'mistake'
  ]
}

Please note that there is a crucial difference in the last word of the phrase when attempting to use the BIP39 standard for entropy recovery from a phrase generated by the current algorithm.

When generating 24 random words, the entropy checksum is not calculated correctly. That is why it is essential to first generate entropy, add the checksum, and then derive the corresponding words from the dictionary based on the resulting data. Following this correct approach will ensure compatibility with the BIP39 standard and improve the overall reliability of the mnemonic generation process.

solution

A possible solution to this issue could be the implementation of entropy generation according to the standard, and using a trigger mechanism to determine the algorithm for secret recovery based on the compliance of the input mnemonic with the BIP39 standard. If the mnemonic does not adhere to the standard, the current, incorrect algorithm can be used for recovery; otherwise, the standard algorithm should be applied.

Additionally, it would be beneficial to include the option for users to derive keys using a specific path, such as m/44'/607'/0'/0/0, to further enhance the functionality and flexibility of the system.

EmelyanenkoK commented 1 year ago

I don't see much merit in being "just" compatible with other networks:

Thus, in contrast with some other blockchains where compatibility means access to whole ecosystem of tools and software, TON gains next to nothing from the compatibility of its key with the standard of other networks: developers who are reluctant to make minor adjustments (or use ready implementations) here will never implement (de)serialization, signing algo, checking message processing in the network (all very different from other networks).

What are cons of changing algo?

We still need HDKD, though, probably we can use compatible implementation of that.

ilyar commented 1 year ago

While I understand your perspective on the limited benefits of compatibility with other networks and the unique aspects of TON, I would like to emphasize that adopting widely-accepted cryptographic standards offers more advantages than simply facilitating interoperability.

  1. Security: Adhering to established standards reduces the risk of implementing insecure or error-prone cryptographic algorithms. Cryptography is a highly specialized field, and creating custom algorithms without thorough peer review and testing can lead to vulnerabilities that may compromise the security of the entire system.
  2. Trust: Using industry-standard cryptography can increase the trustworthiness of the system in the eyes of potential users and developers. It demonstrates a commitment to best practices and ensures that the platform is built upon proven and secure foundations.
  3. Easier integration with existing tools: Although TON may have unique aspects, compatibility with established standards can still simplify integration with existing tools, libraries, and frameworks, reducing the learning curve and development time for developers working with the platform.
  4. Future-proofing: As the blockchain ecosystem evolves, compatibility with widely-accepted standards will become more critical. By adopting these standards now, you can ensure that your platform is better prepared for future developments and potential collaborations.

In conclusion, the goal of adopting established cryptographic standards is not merely to be compatible with other ecosystems but to provide a solid foundation of security, trust, and future-proofing for TON. While it may require some adjustments and additional work, the long-term benefits of adhering to proven standards far outweigh the potential drawbacks.

I would like to point out that my proposal does not involve changing the number of PBKDF2 iterations currently used in TON, which is set at 100,000, deviating from standard values and purportedly increasing security. While the effectiveness of this increased iteration count can be debated, my primary goal is focused on standardizing the entropy recovery process.

By adhering to a standard entropy recovery method, we can maintain the current PBKDF2 iteration count as a parameter for the existing cryptographic processes without disrupting the system. This approach allows for compatibility with industry standards while preserving TON's unique security measures.

Additionally, adopting a standard for private key derivation can further enhance our system's interoperability and compatibility with other tools and frameworks. This change would provide users and developers with a more familiar and widely accepted method for managing their keys, without sacrificing the platform's distinct features.

In summary, my proposal aims to strike a balance between preserving TON's unique security measures, such as the increased PBKDF2 iteration count, and adhering to industry standards for entropy recovery and key derivation. This approach can foster trust, security, and compatibility, while still retaining the platform's distinct characteristics.

ex3ndr commented 1 year ago

This "industry standards" picked a curve that is not fully secure and continues to use it.


From: Ilyar @.> Sent: Wednesday, April 19, 2023 11:42 AM To: toncenter/tonweb-mnemonic @.> Cc: Steve Korshakov @.>; Comment @.> Subject: Re: [toncenter/tonweb-mnemonic] Very bad practice with pbkdf2Sha512 salt (Issue #13)

While I understand your perspective on the limited benefits of compatibility with other networks and the unique aspects of TON, I would like to emphasize that adopting widely-accepted cryptographic standards offers more advantages than simply facilitating interoperability.

  1. Security: Adhering to established standards reduces the risk of implementing insecure or error-prone cryptographic algorithms. Cryptography is a highly specialized field, and creating custom algorithms without thorough peer review and testing can lead to vulnerabilities that may compromise the security of the entire system.
  2. Trust: Using industry-standard cryptography can increase the trustworthiness of the system in the eyes of potential users and developers. It demonstrates a commitment to best practices and ensures that the platform is built upon proven and secure foundations.
  3. Easier integration with existing tools: Although TON may have unique aspects, compatibility with established standards can still simplify integration with existing tools, libraries, and frameworks, reducing the learning curve and development time for developers working with the platform.
  4. Future-proofing: As the blockchain ecosystem evolves, compatibility with widely-accepted standards will become more critical. By adopting these standards now, you can ensure that your platform is better prepared for future developments and potential collaborations.

In conclusion, the goal of adopting established cryptographic standards is not merely to be compatible with other ecosystems but to provide a solid foundation of security, trust, and future-proofing for TON. While it may require some adjustments and additional work, the long-term benefits of adhering to proven standards far outweigh the potential drawbacks.

I would like to point out that my proposal does not involve changing the number of PBKDF2 iterations currently used in TON, which is set at 100,000, deviating from standard values and purportedly increasing security. While the effectiveness of this increased iteration count can be debated, my primary goal is focused on standardizing the entropy recovery process.

By adhering to a standard entropy recovery method, we can maintain the current PBKDF2 iteration count as a parameter for the existing cryptographic processes without disrupting the system. This approach allows for compatibility with industry standards while preserving TON's unique security measures.

Additionally, adopting a standard for private key derivation can further enhance our system's interoperability and compatibility with other tools and frameworks. This change would provide users and developers with a more familiar and widely accepted method for managing their keys, without sacrificing the platform's distinct features.

In summary, my proposal aims to strike a balance between preserving TON's unique security measures, such as the increased PBKDF2 iteration count, and adhering to industry standards for entropy recovery and key derivation. This approach can foster trust, security, and compatibility, while still retaining the platform's distinct characteristics.

— Reply to this email directly, view it on GitHubhttps://github.com/toncenter/tonweb-mnemonic/issues/13#issuecomment-1514514903, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AADB2E23O2YD4JI25THT5CLXB66QNANCNFSM5NC7KLHA. You are receiving this because you commented.Message ID: @.***>