Open ilblackdragon opened 4 years ago
@frol @evgenykuzyakov just checking in if we want to execute on this before MainNet or will postpone
@ilblackdragon Let's start with the problem. I assume the problem is that fixed-length values encode into a non-fixed-length base58 string.
I am against having "clever" custom encodings since it casts away existing tools out there.
Alternatives:
+/=
in its alphabet, which makes it problematic in the URL environmentsSplitting an input into 8-byte chunks and encoding each of them separately (using fixed-length base58) is a simple approach that solves two major problem of base58: its variable length and the quadratic complexity of its encoding and decoding (which can be optimized, but this is not easy).
This approach still has some issues. For example, encoding requires 64-bit divisions, which are difficult to do in some environments (e.g. JavaScript). I have some ideas about a fixed-length encoding that is almost as efficient (length-wise) as base58 and can be efficiently encoded and decoded without using 64-bit arithmetic.
Another approach is to use some variant of base32 encoding. Base32 encoding is about 17% longer than base58, but it only uses letters of a single case. If we want an encoding that can be easily written down by humans, this is a clear advantage. Some variants, such as Crockford's, are specifically designed to be easy to write and pronounce.
I want to highlight that we use base58 only to display hashes, i.e. sha256, that is 256 bits (32 bytes) of data, which encodes into:
We also use base58 for keys, not just for hashes. Keys that we use are also 32 bytes (but I think that we currently encode Ed25519 private keys as 64 bytes, which is a bug).
In base58, it is 32-44 characters. In base32 (without padding), it is 52 characters.
However, this doesn't include a checksum. With 4-byte checksum, it will be:
Would we consider using base64 for the keys?
Pros:
Cons:
+/=
which may cause problems with URL encoding. It does not seem to be important to me in the context of keys since you don't want using keys in the URL anyway, do you?Public keys are used in URLs.
The motivation for base58 was ease of manual entry. If we don't care about that, we can use URL-safe variant of base64.
I wonder if anyone ever entered base58 / uuid / base64 value longer than 20 (yet 44+) chars manually... :smiley:
We should not be creating our own encoding scheme, like the suggested modification of b58. This will create a lot fo confusion and will be a major source of bugs for people trying to integrate their tools with our RPC.
There is no point for optimizing for the number of bytes, since we are using JSON and our RPC responses are hundreds or even thousands of characters long even for simple requests due to all the JSON boilerplate and the names of the fields, see https://docs.nearprotocol.com/docs/interaction/rpc
{
"receipts_outcome": [
{
"block_hash": "8xr1UHftKtth1HErsJ5SEHBgVpbHqyWHQVWeXnT78RfV",
"id": "Aei6qE1jXnoXA81WvDk69kc8HfPV8kZTRgte38nHV6j8",
"outcome": {
"gas_burnt": 2389161082329,
"logs": [
"Writing the string [ hello ] to the blockchain ..."
],
"receipt_ids": [
"Gus1JVoJgBaccBtzHApDs22k6A5CZ8vEgRWHmEhVfQCs"
],
"status": {
"SuccessValue": ""
}
},
"proof": []
},
{
"block_hash": "11111111111111111111111111111111",
"id": "Gus1JVoJgBaccBtzHApDs22k6A5CZ8vEgRWHmEhVfQCs",
"outcome": {
"gas_burnt": 0,
"logs": [],
"receipt_ids": [],
"status": "Unknown"
},
"proof": []
}
],
"status": {
"SuccessValue": ""
},
"transaction": {
"actions": [
{
"FunctionCall": {
"args": "eyJhcGlSZXNwb25zZSI6ImhlbGxvIn0=",
"deposit": "0",
"gas": 100000000000000,
"method_name": "setResponse"
}
}
],
"hash": "7RnMJsHMVMUbArUdKcMaSBPPomjLzQxzhsuvwP3FXGyi",
"nonce": 5,
"public_key": "ed25519:4dYj4upxVxeaiQtV1EB5NvUjuK2axytiAaChr6xNCSgp",
"receiver_id": "dev-jdvw47f9j",
"signature": "ed25519:3UrQcvvjuGSWM2w5sQ5GNZYBVWxRMjFbFp1cPg59Qg1rL4PtWMi5s7HY7cgSdx5PB17aqX1nrjiUiK7xxxtBeu6M",
"signer_id": "ajax"
},
"transaction_outcome": {
"block_hash": "A7NDYejEMJ5RNTdcVuoYDTezruYSh8QKPrVK4TkcUnqy",
"id": "7RnMJsHMVMUbArUdKcMaSBPPomjLzQxzhsuvwP3FXGyi",
"outcome": {
"gas_burnt": 2291572068402,
"logs": [],
"receipt_ids": [
"Aei6qE1jXnoXA81WvDk69kc8HfPV8kZTRgte38nHV6j8"
],
"status": {
"SuccessReceiptId": "Aei6qE1jXnoXA81WvDk69kc8HfPV8kZTRgte38nHV6j8"
}
},
"proof": []
}
}
{
"receipts_outcome": [
{
"block_hash": "76502aa865bf709e56b5856fb2bb082a97d5d187ad98d6dac1d7fc7057320f10",
"id": "8f622511e512206b3e3b43904ef015281dc983ef971816ab8bbd70839db4757f",
"outcome": {
"gas_burnt": 2389161082329,
"logs": [
"Writing the string [ hello ] to the blockchain ..."
],
"receipt_ids": [
"ec6a0c0a0c895a2b4b592e34bfd4473c358f8e53b05252f281c68b322905eadc"
],
"status": {
"SuccessValue": ""
}
},
"proof": []
},
{
"block_hash": "0000000000000000000000000000000000000000000000000000000000000000",
"id": "ec6a0c0a0c895a2b4b592e34bfd4473c358f8e53b05252f281c68b322905eadc",
"outcome": {
"gas_burnt": 0,
"logs": [],
"receipt_ids": [],
"status": "Unknown"
},
"proof": []
}
],
"status": {
"SuccessValue": ""
},
"transaction": {
"actions": [
{
"FunctionCall": {
"args": "039377254c90bc33963184c3a8167da70438e3",
"deposit": "0",
"gas": 100000000000000,
"method_name": "setResponse"
}
}
],
"hash": "5f7f56aaa51736a77872b9fe291ca320611383d7e8bec68031817e6714e2da45",
"nonce": 5,
"public_key": "ed25519:35efd78157db6dda96a24988e60de9e7f5c1fefcaa246e87d930d2beda020cb1",
"receiver_id": "dev-jdvw47f9j",
"signature": "ed25519:7c0d6861845ae00e29fe50ebf449fa2eab2ae5d2b0959b57ded4024ab8ae82db7b455b744c91f293f24ad559d2301857d804bc5a7dff4530e7580c9d1a47d20e",
"signer_id": "ajax"
},
"transaction_outcome": {
"block_hash": "875aa81349e154b01f53314e4ddabd9e920f0a5c36be2b9ffb291538f1da07b4",
"id": "5f7f56aaa51736a77872b9fe291ca320611383d7e8bec68031817e6714e2da45",
"outcome": {
"gas_burnt": 2291572068402,
"logs": [],
"receipt_ids": [
"8f622511e512206b3e3b43904ef015281dc983ef971816ab8bbd70839db4757f"
],
"status": {
"SuccessReceiptId": "8f622511e512206b3e3b43904ef015281dc983ef971816ab8bbd70839db4757f"
}
},
"proof": []
}
}
It is a 15% increase in size if we are switching to hex. I suggest we pick hex, which is the most common and readable format (we are not displaying full hashes and keys in our UI anyway) and the difference in bytes is negligible. Anything but hex would be a micro optimization.
AFAIU currently our RPC can be trivially abused by submitting large b58 encoded keys. I suggest we add a simple threshold cut-off in RPC that checks that b58-encoded key is <45 characters or returns deserialization error. We then switch to hex post-Mainnet since it is would require lots of changes across our Applayer.
Unassigning @evgenykuzyakov since I don't think he is working on it. Also changing the title of the issue to reflect the broader scope of the conversation.
Now even Bitcoin is moving away from base58. Bitcoin Core 0.20, which was released today, uses Bech32 (a base32-based format) as default address format.
What are the objections to postponing switching to another encoding post Phase I?
Do we have to decode/encode base58 today in any environment in which the quadratic complexity matters (e.g. do we do any b58 decoding in EVM)?
@SkidanovAlex I'm not saying that we need to do it quickly.
Is this even possible at this point? :D
This issue has been automatically marked as stale because it has not had recent activity in the last 2 months. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.
@frol what is the status of this issue?
There is no action item here. Feel free to re-open if necessary
I think an action item here is to stop using base58 in any new protocols and APIs, and to deprecate (and eventually remove) any protocols and APIs that are using it.
This issue has been automatically marked as stale because it has not had recent activity in the last 2 months. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.
@abacabadabacaba suggested to use an improved bs58 scheme that encodes into fixed length representation by splitting sequence into 8 byte chunks.
Additionally to add check sum to verify that the full sequence was provided and wasn't corrupted.
Interesting note that Ethereum didn't have this originally and they added it by using capitalization of letters in base16.