cosmos / cosmos-sdk

:chains: A Framework for Building High Value Public Blockchains :sparkles:
https://cosmos.network/
Apache License 2.0
6.24k stars 3.6k forks source link

EPIC: remove global bech32 #13140

Closed tac0turtle closed 4 months ago

tac0turtle commented 2 years ago

Summary

Currently bech32 prefixes is a global in the cosmos-sdk and for applications as well. We should aim to remove most if not all globals within the sdk.

Problem Definition

Globals should be removed.

Work Breakdown

Notes:

Is there any other work we should mention here?

ref https://github.com/cosmos/cosmos-sdk/issues/8332

faddat commented 2 years ago

@marbar3778 this is an epic epic. I am a huge fan.

if I get time might look at closing it or seeing if we've got someone at Notional feeling up to the task.

itsdevbear commented 1 year ago

app.ModuleBasics another global.

alexanderbez commented 1 year ago

ModuleBasics is fine IMO

robert-zaremba commented 1 year ago

In https://github.com/cosmos/cosmos-sdk/issues/1533 I read about account.Codec. Copying here my comment about reusing codec.Codec for AccAddress.

We can combine account.Codec with codec.Codec. We could have a rule there for AccAddress type (as well for other types, VaAddress).

itsdevbear commented 1 year ago

How do we feel about deprecating bech32 entirely and just using byte addresses (maybe EIP-55), whats the advantage to the human readable prefix bech32, many nontechnical users find it confusing that their address changes when they switch chains and is a huge confusion point from many members in our community.

yihuang commented 1 year ago

Bech32 could be a purely client-side or rpc side thing, no need to exist in chain state I think. We can use raw bytes in state, but convert to bech32 in rpc.

tac0turtle commented 1 year ago

Personally i think we should kill accaddress type as well. It should all be bytes. The decode and encode are handled by the address codec. We shouldnt combine it with the codec.codec since they do two separate things and some modules dont need codec.codec.

Bech32 or any address encoding is only an client layer thing, it shouldnt be used in state

alexanderbez commented 1 year ago

Bech32 could be a purely client-side or rpc side thing, no need to exist in chain state I think. We can use raw bytes in state, but convert to bech32 in rpc.

This is already the case. Bech32 isn't used in the state-machine. Modules store bytes, not strings. Bech32 is used for clients and tests primarily.

aaronc commented 1 year ago

It is used in the state machine in so much as transaction addresses are encoded in bech32 and need to be decoded in modules. I agree with @yihuang that we should evaluate making it client side only, but as an option. Removing it from the state machine entirely would be too breaking for clients. But we could add support in sign mode textual for rendering bytes addresses as bech32. Any thoughts @amaurym ?

alexanderbez commented 1 year ago

That only happens in message validation, the state machine does not rely on any string interpretations.

aaronc commented 1 year ago

That only happens in message validation, the state machine does not rely on any string interpretations.

Message validation is part of the state machine. But in addition, all msg servers decode bech32 addresses. Just look at any msg server implementation - even the simplest bank msgServer.Send. It calls sdk.AccAddressFromBech32 twice

alexanderbez commented 1 year ago

Yes, but they're never read or written to/from storage is what I meant.

aaronc commented 1 year ago

Yes they are usually not used in storage, but some chains might store them.

yihuang commented 1 year ago

Bech32 could be a purely client-side or rpc side thing, no need to exist in chain state I think. We can use raw bytes in state, but convert to bech32 in rpc.

This is already the case. Bech32 isn't used in the state-machine. Modules store bytes, not strings. Bech32 is used for clients and tests primarily.

There are several sdk modules store bech32 into the state, I did a search in the change set file:

$ pigz -d -c staking/block-7000001.zz | strings | grep crc | head -n 1
1crcvaloper1efw0q0ggzxtmuf80wpcgr77h70c3avpdt6zv2l
$ pigz -d -c acc/block-7000001.zz | strings | grep crc | head -n 1
*crc1qgjqzzaz64nllgq5yghdjcx3lfpm3j8p2cy428
$ pigz -d -c slashing/block-7000001.zz | strings | grep crc | head -n 1
1crcvalcons1r6svmr6jnkt27mjzv3ggs2cj5wcfurnz2kuaxl
$ pigz -d -c feegrant/block-7000001.zz | strings | grep crc | head -n 1
*crc1gp7w5j69nzrrsnnjx0938gz5w74lkn3zdh7axr
alexanderbez commented 1 year ago

I both stand corrected and am very disappointed. Thanks for pointing that out @yihuang!

tac0turtle commented 1 year ago

dam this was my worry, we will have to approach this issue based on where it is used. Some places like staking could only be removed in a staking v2 design.

alexanderbez commented 1 year ago

We'll need migrations then. Which shouldn't be hard. Just take string to bytes. Still a bummer though.

aaronc commented 1 year ago

Why is it a problem that they're stored in state though? It doesn't affect our ability to remove the global.

tac0turtle commented 1 year ago

We'll need migrations then. Which shouldn't be hard. Just take string to bytes. Still a bummer though.

main issue is how long migrations would take, but something we can test.

Why is it a problem that they're stored in state though? It doesn't affect our ability to remove the global.

if a user wants to use something other than bech32 then they still need to use bech32 for this layer.

aaronc commented 1 year ago

if a user wants to use something other than bech32 then they still need to use bech32 for this layer.

as long as the modules storing string addresses in state use address.Codec, the encoding should be swappable on a per-chain basis

alexanderbez commented 1 year ago

Why is it a problem that they're stored in state though? It doesn't affect our ability to remove the global.

Why would you store bech32 strings in state as opposed to their true raw byte? My statement had nothing to do with removing global bech32. It was a standalone statement of not storing bech32 in state in general 👍

aaronc commented 1 year ago

Why is it a problem that they're stored in state though? It doesn't affect our ability to remove the global.

Why would you store bech32 strings in state as opposed to their true raw byte? My statement had nothing to do with removing global bech32. It was a standalone statement of not storing bech32 in state in general 👍

Agree it's bad, just was stating it's orthogonal

tac0turtle commented 1 year ago

agree with both of you. you two are saying the same thing basically. Agreement that the current is not ideal

alexanderbez commented 1 year ago

We should probably team on this and tackle components piecemeal :)

itsdevbear commented 1 year ago

A note aside I was proposing to @tac0turtle that we replace bech32 with a hexidemical output similar to ethereum.

Will create a more familiar UX for users.

aaronc commented 1 year ago

There is a very good reason we use bech32 and not hex in cosmos - because it prevents you from sending coins to the wrong chain when sharing addresses. I think it was wise choice

itsdevbear commented 1 year ago

How is osmosis:0xABCDEF any worse? Like what gnosis does

yihuang commented 1 year ago

A note aside I was proposing to @tac0turtle that we replace bech32 with a hexidemical output similar to ethereum.

Will create a more familiar UX for users.

I think security-wise different chains are supposed to use different coin types in HD wallet, in that case the address won't be the same cross-chain anyway. Ethereum-like chains choose to use same coin type as Ethereum only for user adoption.

aaronc commented 1 year ago

How is osmosis:0xABCDEF any worse? Like what gnosis does

As long as osmosis: is a protocol-level part of the address OR strongly enforced by all clients, it would have the same benefits

aaronc commented 1 year ago

I think security-wise different chains are supposed to use different coin types in HD wallet, in that case the address won't be the same cross-chain anyway. Ethereum-like chains choose to use same coin type as Ethereum only for user adoption.

Having a different coin type would only make matters worse. The address has no knowledge of which pubkey generated it and this cannot be verified at the protocol level. HD derivation happens at the level of deriving the public key. So if there is a different coin type, it means that if you send coins to that address on the wrong chain, by default your wallet UI would not generate a public key that is compatible with the wrong address on that chain becuse the coin type is different. The chain itself will not reject the transaction because at the address level the coin type is unknown. Bech32, however, prevents this because sending coins to an osmosisxyz12345 address on cosmos hub will result in a failed transaction.

KyleMoser commented 1 year ago

@tac0turtle I am working on feegrants for the relayer and this issue is a blocker for me. A lot of times relayer integration tests fail because of non-determinism due to shared bech32 addresses.

I started fixing this issue in v0.47.3 by introducing the following:

// Allows SDK clients to use multiple configurations. Key = name, value = *Config
sdkConfigs sync.Map
DefaultConfig = "default"

and

// GetConfig returns the named config instance for the SDK
func GetConfig(name string) *Config {
    conf, _ := sdkConfigs.LoadOrStore(name, NewConfig())
    return conf.(*Config)
}

This would allow almost all current code to work easily by retrieving the default config. Similar to what @jackzampolin indicated in a comment, the Context could be used to pass the Config name if desired, or if not specified, it would use the default.

My approach seems different than what you are doing but as I am sort of blocked, are you open to discussing how I can continue your work on this issue?

Also tagging @agouin

JulianToledano commented 5 months ago

Hey guys!!

A lot of String calls have been removed from x packages. Some interfaces may need to be refactored to remove internal calls to it, such as ValidateBasic and I'm still looking for any calls to Stringer that may have been missed.

However, my current concern is the String method, has a cache for Bech32 addresses. I think it makes sense to create an address codec with a cache to maintain performance.

@tac0turtle

tac0turtle commented 5 months ago

yea the cache makes sense. for the breaking of interfaces i think we can leave them on the global for now. We have plans on changing things in the future so would prefer to avoid breaking things for users

tac0turtle commented 4 months ago

closing this, the cache is still being worked on but this is a nice to have performance improvement