cosmos / cosmos-sdk

:chains: A Framework for Building High Value Public Blockchains :sparkles:
https://cosmos.network/
Apache License 2.0
6.11k stars 3.53k forks source link

math: benchmark big.INT operations #15029

Open tac0turtle opened 1 year ago

tac0turtle commented 1 year ago

Summary

in math we use Big.INT for larger than integers. We never truly benchmarked the implementation against something like https://github.com/decred/dcrd/blob/master/math/uint256/README.md or rolling our own implementation.

Proposal

Benchmark big.INT against a library like dcreds, if we identify that big.INT operations are slow we should come up with a plan with how to optimise the operations

ValarDragon commented 1 year ago

The biggest issue with the current implementation is the truly excessive amount of heap allocation we do. Barely anything is kept on the stack, due to the pointer indirection involved, and amount of new ints allocated.

In place operations help significantly here. Also having more internal flexibility / optimization on the max size bound is likely important.

To compare to something fixed precision, e.g. uint256, you should compare performance for single word sized arithmetic, and 4 word arithmetic. Ideally what we'd see is that for single word in-place operations, the current outperforms theres. (If not, theres a problem) Then we could think of tricks to get these to not be new heap allocations in the average case.

We'd certainly see that 4 word in-place arithmetic in the current approach is slower than theirs. (Somewhat fundamental)

Ideally we'd see generics, like https://github.com/arkworks-rs/algebra/tree/master/ff/src/biginteger be used to allow flexibility between word size, while maintaining high perf. (With the edge case of <= 4 words always outperforming without generics overriding that parameter)

alexanderbez commented 1 year ago

IIRC, Go will allocate objects onto the heap under 3 primary conditions:

Given this, what ideas do you have to ensure we use the stack more effectively?

tac0turtle commented 1 year ago

we should write some benchmarks and identify hot spots in relation to our implementation. This will probably lead to the best first steps

ValarDragon commented 1 year ago

Should get reproducable benchmarks to verify this, but my experience with prior benchmarks is that the is mostly all the heap allocation and GC overhead work. The actually big.Int performance is pretty small compared to the time spent in heap allocating just to create an object. The mutative operations getting used internally is at least a significant aid in this.

Every object here has an address taken, because we use pointers to bigints, and the bigint itself contains a slice underneath. The underlying slice is potentially small vec optimized onto stack: https://medium.com/@yulang.chu/go-stack-or-heap-2-slices-which-keep-in-stack-have-limitation-of-size-b3f3adfd6190 , but the pointer indirection in the Int & Decimal is not. We see this in cosmos math heavy workloads causing massive amount of heap allocs

Theres a few big directions from the heap claim, for speed improvements:

itsdevbear commented 1 year ago

@ValarDragon @tac0turtle okay hear me out: https://github.com/holiman/uint256

image

It would make sdkmath and sdkcoins speedy fast. No-one needs anything bigger than 256 and we could always create our own custom bigInt that is 2 uint256's if someone REALLY needs something larger than 256.

tac0turtle commented 1 year ago

it is worth considering doing something else since

Another reason to consider this is that math/big is not used for security-sensitive cryptography in Go anymore,
 so we might not treat certain bugs in it as security vulnerabilities going forward. 

https://go.dev/doc/go1.20#math/big

ValarDragon commented 1 year ago

@tac0turtle the context for that go 1.20 announcement AFAIU is:

alpha-omega-labs commented 11 months ago

Might we have related issue: trying to import genesis.json and also validate it and got this error:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x141a438]

goroutine 1 [running]:
math/big.(*Int).Sign(...)
        math/big/int.go:41
github.com/cosmos/cosmos-sdk/types.Dec.IsNegative(...)

And

initializing blockchain state from genesis.json
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x141a438]

goroutine 1 [running]:
math/big.(*Int).Sign(...)
        math/big/int.go:41
github.com/cosmos/cosmos-sdk/types.Dec.IsNegative(...)
        github.com/cosmos/cosmos-sdk@v0.46.15-0.20230807104542-537257060180/types/decimal.go:211
linhvt-teko commented 8 months ago

Same with @alpha-omega-labs I've imported genesis.json from Polkachu or https://raw.githubusercontent.com/cosmos/mainnet/master/genesis/genesis.cosmoshub-4.json.gz After that, I got this error:

10:05AM INF starting node with ABCI Tendermint in-process
10:05AM INF service start impl=multiAppConn module=proxy msg={}
10:05AM INF service start connection=query impl=localClient module=abci-client msg={}
10:05AM INF service start connection=snapshot impl=localClient module=abci-client msg={}
10:05AM INF service start connection=mempool impl=localClient module=abci-client msg={}
10:05AM INF service start connection=consensus impl=localClient module=abci-client msg={}
10:05AM INF service start impl=EventBus module=events msg={}
10:05AM INF service start impl=PubSub module=pubsub msg={}
10:05AM INF service start impl=IndexerService module=txindex msg={}
10:05AM INF ABCI Handshake App Info hash= height=0 module=consensus protocol-version=0 software-version=v13.0.2
10:05AM INF ABCI Replay Blocks appHeight=0 module=consensus stateHeight=0 storeHeight=0
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x1204806]

goroutine 1 [running]:
math/big.(*Int).Sign(...)
    math/big/int.go:41
github.com/cosmos/cosmos-sdk/types.Dec.IsNegative(...)
    github.com/cosmos/cosmos-sdk@v0.45.16/types/decimal.go:211
github.com/cosmos/cosmos-sdk/x/staking/types.validateValidatorBondFactor({0x1d0b1e0?, 0x0?})
    github.com/cosmos/cosmos-sdk@v0.45.16/x/staking/types/params.go:260 +0x46
github.com/cosmos/cosmos-sdk/x/params/types.Subspace.SetParamSet({{0x25f0a90, 0xc0010795c0}, 0xc000139678, {0x25ce740, 0xc000c76cf0}, {0x25ce790, 0xc000c76db0}, {0xc0005b5260, 0x7, 0x20}, ...}, ...)
    github.com/cosmos/cosmos-sdk@v0.45.16/x/params/types/subspace.go:245 +0x22d
github.com/cosmos/cosmos-sdk/x/staking/keeper.Keeper.SetParams(...)
    github.com/cosmos/cosmos-sdk@v0.45.16/x/staking/keeper/params.go:84
github.com/cosmos/cosmos-sdk/x/staking.InitGenesis({{0x25e2c78, 0xc000054088}, {0x25f0670, 0xc00057e5c0}, {{0x0, 0x0}, {0xc001006000, 0xb}, 0x0, {0x0, ...}, ...}, ...}, ...)
    github.com/cosmos/cosmos-sdk@v0.45.16/x/staking/genesis.go:35 +0x265
github.com/cosmos/cosmos-sdk/x/staking.AppModule.InitGenesis({{{0x25f62f8, 0xc0010795c0}}, {{0x25ce740, 0xc000c76ca0}, {0x25f0a90, 0xc0010795c0}, {0x25e66a0, 0xc000cb0480}, {0x7ff1e6bfd200, 0xc0001758c0}, ...}, ...}, ...)
    github.com/cosmos/cosmos-sdk@v0.45.16/x/staking/module.go:157 +0x146
github.com/cosmos/cosmos-sdk/types/module.(*Manager).InitGenesis(_, {{0x25e2c78, 0xc000054088}, {0x25f0670, 0xc00057e5c0}, {{0x0, 0x0}, {0xc001006000, 0xb}, 0x4f5b97, ...}, ...}, ...)
    github.com/cosmos/cosmos-sdk@v0.45.16/types/module/module.go:327 +0x23d
github.com/cosmos/gaia/v13/app.(*GaiaApp).InitChainer(_, {{0x25e2c78, 0xc000054088}, {0x25f0670, 0xc00057e5c0}, {{0x0, 0x0}, {0xc001006000, 0xb}, 0x4f5b97, ...}, ...}, ...)
    github.com/cosmos/gaia/v13/app/app.go:252 +0x20e
github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).InitChain(0xc000c7fc00, {{0x0, 0xed5830c36, 0x0}, {0xc001006000, 0xb}, 0xc0005b08e0, {0xc000c53c00, 0x7d, 0x7d}, ...})
    github.com/cosmos/cosmos-sdk@v0.45.16/baseapp/abci.go:63 +0x455
github.com/tendermint/tendermint/abci/client.(*localClient).InitChainSync(0xc0005af260, {{0x0, 0xed5830c36, 0x0}, {0xc001006000, 0xb}, 0xc0005b08e0, {0xc000c53c00, 0x7d, 0x7d}, ...})
    github.com/tendermint/tendermint@v0.34.27/abci/client/local_client.go:272 +0x118
github.com/tendermint/tendermint/proxy.(*appConnConsensus).InitChainSync(0xc000510f00?, {{0x0, 0xed5830c36, 0x0}, {0xc001006000, 0xb}, 0xc0005b08e0, {0xc000c53c00, 0x7d, 0x7d}, ...})
    github.com/tendermint/tendermint@v0.34.27/proxy/app_conn.go:77 +0x55
github.com/tendermint/tendermint/consensus.(*Handshaker).ReplayBlocks(_, {{{0xb, 0x0}, {0xc000cdf348, 0x7}}, {0xc000cdf390, 0xb}, 0x4f5b97, 0x0, {{0x0, ...}, ...}, ...}, ...)
    github.com/tendermint/tendermint@v0.34.27/consensus/replay.go:319 +0xf18
github.com/tendermint/tendermint/consensus.(*Handshaker).Handshake(0xc0137f5c70, {0x25f34c0, 0xc0011524e0})
    github.com/tendermint/tendermint@v0.34.27/consensus/replay.go:268 +0x3d4
github.com/tendermint/tendermint/node.doHandshake({_, _}, {{{0xb, 0x0}, {0xc000cdf348, 0x7}}, {0xc000cdf390, 0xb}, 0x4f5b97, 0x0, ...}, ...)
    github.com/tendermint/tendermint@v0.34.27/node/node.go:329 +0x1b8
github.com/tendermint/tendermint/node.NewNode(0xc0000f0a00, {0x25df960, 0xc000524960}, 0xc000c7b690, {0x25c9860, 0xc000134e40}, 0x0?, 0x0?, 0xc000c7b8b0, {0x25e3a08, ...}, ...)
    github.com/tendermint/tendermint@v0.34.27/node/node.go:779 +0x597
github.com/cosmos/cosmos-sdk/server.startInProcess(_, {{0x0, 0x0, 0x0}, {0x25ffc80, 0xc001168630}, {0x0, 0x0}, {0x25e7f58, 0xc0010795c0}, ...}, ...)
    github.com/cosmos/cosmos-sdk@v0.45.16/server/start.go:280 +0x89b
github.com/cosmos/cosmos-sdk/server.StartCmd.func2(0xc0010e2c00?, {0x34dff30?, 0x0?, 0x0?})
    github.com/cosmos/cosmos-sdk@v0.45.16/server/start.go:128 +0x169
github.com/spf13/cobra.(*Command).execute(0xc0010e2c00, {0x34dff30, 0x0, 0x0})
    github.com/spf13/cobra@v1.7.0/command.go:940 +0x862
github.com/spf13/cobra.(*Command).ExecuteC(0xc000e36f00)
    github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3bd
github.com/spf13/cobra.(*Command).Execute(...)
    github.com/spf13/cobra@v1.7.0/command.go:992
github.com/spf13/cobra.(*Command).ExecuteContext(...)
    github.com/spf13/cobra@v1.7.0/command.go:985
github.com/cosmos/cosmos-sdk/server/cmd.Execute(0x0?, {0xc00013df38, 0x12})
    github.com/cosmos/cosmos-sdk@v0.45.16/server/cmd/execute.go:36 +0x1eb
main.main()
    github.com/cosmos/gaia/v13/cmd/gaiad/main.go:16 +0x2c

I'm working on Gaia v13.0.2 which downloaded from https://github.com/cosmos/gaia/releases/tag/v13.0.2 (I've tried to clone source and build manually and still not working)

angusscott commented 7 months ago

@linhvt-teko did you ever resolve/overcome this issue? I see something very similar

tunglinhvu commented 7 months ago

@angusscott I've no idea about this problem. So I reinstalled everything and downloaded a snapshot from Polkachu. You can try these steps:

  1. reinstalled Gaia v13.0.2 from scratch, init moniker...
  2. wget genesis file json from https://snapshots.polkachu.com/genesis/cosmos/genesis.json and mv to gaia config
  3. sed -i 's/seeds = ""/seeds = "ade4d8bc8cbe014af6ebdf3cb7b1e9ad36f412c0@seeds.polkachu.com:14956"/' ~/.gaia/config/config.toml
  4. download snapshot from Polkachu: https://snapshots.polkachu.com/snapshots/cosmos/cosmos_18031441.tar.lz4
  5. after extract snapshot data and mv it to gaia/data -> gaiad start cmd will work.
  6. download full archive from Quicksync: https://quicksync.io/networks/cosmos.html and extract