hyperledger / iroha

Iroha - A simple, enterprise-grade decentralized ledger
https://wiki.hyperledger.org/display/iroha
Apache License 2.0

[BUG] bandwidth degradation #4727

Open timofeevmd opened 2 months ago

timofeevmd commented 2 months ago

OS and Environment

linux, k8s

GIT commit hash

0275cfa

Minimum working example / Steps to reproduce

Test preconditions

Create:

Test

Actual result

Throughput does not increase:

(attached screenshots: profiler, grafana, queue)

Problems were detected in the following modules:

- `iroha_core::gossiper::TransactionGossiper::run`
- `iroha_core::...::<impl parity_scale_codec::codec::Decode for iroha_core::...>`
- `iroha_core::sumeragi::main_loop::handle_block_sync`

Expected result

Throughput increases in proportion to the load

Logs

Log contents: (no logs attached)

Who can help to reproduce?

@timofeevmd

Notes

No response

Erigara commented 2 months ago

It should also be noted that the degradation happens even with a single peer in the network.

dima74 commented 1 month ago

I tried to reproduce it locally and, apart from the degradation of tps over time, noticed that tps is very unstable.

With a single peer and the configuration from this report, under a constant load of 1600 requests per second (one request is one transaction with a single transfer-asset instruction; some requests may return an error if the queue is full), I get the following results: plot_1600_tps

I think this is related to `Queue::tx_hashes`. When iroha creates a block, it takes transactions from `tx_hashes` and then immediately pushes the same transactions back to `tx_hashes`. Iroha then applies the block, but its transactions are still in `tx_hashes`, and they will only be removed after all other transactions have been handled.

I think `tx_hashes` should be changed from a queue to a deque, with the transactions taken for a block placed at the end of the deque. This would make the tps numbers more stable. Even with a very naive implementation of the deque (using `Mutex<VecDeque<...>>`), I get much more stable tps numbers. @mversic what do you think about changing `Queue::tx_hashes` from a queue to a deque? plot_1600_tps_deque
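The naive `Mutex<VecDeque<...>>` variant mentioned above could look roughly like this. This is only a sketch with illustrative names (plain `u64` hashes, a `TxDeque` type), not iroha's actual `Queue` API; it returns block transactions to the head of the deque so that they drain first, as discussed later in this thread:

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

// Hypothetical stand-in for Queue::tx_hashes: a deque guarded by a mutex.
// The real iroha queue is a lock-free ArrayQueue; names here are illustrative.
struct TxDeque {
    hashes: Mutex<VecDeque<u64>>, // u64 stands in for a transaction hash
}

impl TxDeque {
    fn new() -> Self {
        Self { hashes: Mutex::new(VecDeque::new()) }
    }

    // New transactions enter at the back, as in a plain FIFO queue.
    fn push(&self, hash: u64) {
        self.hashes.lock().unwrap().push_back(hash);
    }

    // The leader drains up to `max` transactions for the next block.
    fn pop_for_block(&self, max: usize) -> Vec<u64> {
        let mut q = self.hashes.lock().unwrap();
        (0..max).filter_map(|_| q.pop_front()).collect()
    }

    // Transactions taken for a block are returned to the *front*, so they
    // are re-examined (and dropped once the block is applied) first instead
    // of sinking behind every newer transaction.
    fn return_taken(&self, hashes: Vec<u64>) {
        let mut q = self.hashes.lock().unwrap();
        for h in hashes.into_iter().rev() {
            q.push_front(h);
        }
    }
}

fn main() {
    let q = TxDeque::new();
    for h in 0u64..4 {
        q.push(h);
    }
    let block = q.pop_for_block(2);
    assert_eq!(block, vec![0, 1]);
    q.return_taken(block);
    // The returned transactions are at the front again, ahead of 2 and 3.
    assert_eq!(q.pop_for_block(4), vec![0, 1, 2, 3]);
    println!("ok");
}
```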

Erigara commented 1 month ago

Hmm, we use `ArrayQueue`, which already follows FIFO order.

dima74 commented 1 month ago

@Erigara yes, but the thing is that currently each transaction is pushed to the queue twice. E.g. consider a queue capacity of 8 and a maximum of 2 transactions per block:

So the queue will contain many transactions from old blocks. Such transactions are eventually removed from the queue, but because of this the tps numbers fluctuate a lot (first tps plot).
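To illustrate, here is a toy simulation of that behavior under the stated assumptions (capacity 8, at most 2 transactions per block, and an assumed arrival rate of 4 transactions per round); it is not iroha's actual code:

```rust
use std::collections::{HashSet, VecDeque};

// Returns (stale entries, queue length, rejected pushes) after three rounds
// of the "pop for a block, then immediately push back" behavior.
fn simulate() -> (usize, usize, usize) {
    const CAPACITY: usize = 8;
    const MAX_IN_BLOCK: usize = 2;

    let mut queue: VecDeque<u32> = VecDeque::new();
    let mut applied: HashSet<u32> = HashSet::new();
    let mut next_tx = 0u32;
    let mut rejected = 0;

    for _round in 0..3 {
        // Four fresh transactions arrive per round (load exceeds block size).
        for _ in 0..4 {
            if queue.len() < CAPACITY {
                queue.push_back(next_tx);
                next_tx += 1;
            } else {
                rejected += 1; // the "queue is full" error from the report
            }
        }
        // The leader builds a block from the head, discarding stale hashes...
        let mut block = Vec::new();
        while block.len() < MAX_IN_BLOCK {
            match queue.pop_front() {
                Some(h) if applied.contains(&h) => continue,
                Some(h) => block.push(h),
                None => break,
            }
        }
        // ...and immediately pushes the same hashes back to the tail.
        queue.extend(&block);
        // The block is applied, but its hashes still occupy queue slots.
        applied.extend(block);
    }

    let stale = queue.iter().filter(|h| applied.contains(h)).count();
    (stale, queue.len(), rejected)
}

fn main() {
    let (stale, len, rejected) = simulate();
    // By round three, already-applied hashes fill most of the queue, so
    // fresh transactions are rejected even though the ledger has kept up.
    println!("stale: {stale}/{len}, rejected pushes: {rejected}");
}
```

The point of the toy model: the rejected pushes happen while a large share of the queue slots hold hashes of already-applied transactions, which matches the unstable tps observed in the first plot.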

Erigara commented 1 month ago

And you want to be able to push transactions back at the beginning of the queue so that they are removed from the queue earlier?

dima74 commented 1 month ago

And you want to be able to push transactions back at the beginning of the queue so that they are removed from the queue earlier?

Yes, I think this will improve stability of tps numbers

Erigara commented 1 month ago

Your idea got me thinking that we can pull off something similar without changing the type of the queue. The gist is the following:

  1. The leader pops transactions from the queue and they are removed temporarily (either entirely or just the hashes)
  2. If the block was committed, great: we don't have to do anything (maybe remove the transactions from the map)
  3. If the block failed, the leader returns these transactions to the queue
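A minimal sketch of that scheme, assuming a hypothetical `Queue` that tracks transactions pulled for an in-flight block in a side map keyed by block hash (names and types are illustrative, not iroha's actual API):

```rust
use std::collections::{HashMap, VecDeque};

// Illustrative queue: u64 stands in for transaction and block hashes.
struct Queue {
    ready: VecDeque<u64>,            // transactions available for a block
    pending: HashMap<u64, Vec<u64>>, // block hash -> txs pulled for it
}

impl Queue {
    fn new() -> Self {
        Self { ready: VecDeque::new(), pending: HashMap::new() }
    }

    fn push(&mut self, tx: u64) {
        self.ready.push_back(tx);
    }

    // 1) The leader pops transactions; they leave the queue temporarily.
    fn take_for_block(&mut self, block: u64, max: usize) -> usize {
        let txs: Vec<u64> =
            (0..max).filter_map(|_| self.ready.pop_front()).collect();
        let n = txs.len();
        self.pending.insert(block, txs);
        n
    }

    // 2) Block committed: just drop the pending entry.
    fn commit(&mut self, block: u64) {
        self.pending.remove(&block);
    }

    // 3) Block failed: return its transactions to the queue, at the head,
    //    so they are retried before newer transactions.
    fn rollback(&mut self, block: u64) {
        if let Some(txs) = self.pending.remove(&block) {
            for tx in txs.into_iter().rev() {
                self.ready.push_front(tx);
            }
        }
    }
}

fn main() {
    let mut q = Queue::new();
    for tx in 0u64..4 {
        q.push(tx);
    }
    q.take_for_block(100, 2); // txs 0 and 1 leave the queue
    q.commit(100);            // committed: they never re-enter
    q.take_for_block(101, 2); // txs 2 and 3
    q.rollback(101);          // failed: returned to the head
    assert_eq!(q.ready.iter().copied().collect::<Vec<_>>(), vec![2, 3]);
    println!("ok");
}
```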

dima74 commented 1 month ago

Gist is the following:

  1. The leader pops transactions from the queue and they are removed temporarily (either entirely or just the hashes)
  2. If the block was committed, great: we don't have to do anything (maybe remove the transactions from the map)
  3. If the block failed, the leader returns these transactions to the queue

Good idea, I will try to implement it

dima74 commented 4 weeks ago

The reason for the tps degradation over time in the single-peer case is that the `Sumeragi::try_create_block()` method becomes slower as the block height increases. Here is an analysis of that method (~90 minutes of ~1600 tps load).

Two methods are responsible for the increase in time per block:

(attached profile: sumeragi_try_create_block.png)
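Independently of iroha's internals, this kind of growth can be demonstrated with a small timing harness; the workload below is a stand-in whose cost grows with block height, not iroha's actual code:

```rust
use std::time::Instant;

// Hypothetical harness for spotting per-call latency growth: time each call
// and compare early calls against late ones.
fn time_calls<F: FnMut(u64)>(heights: u64, mut f: F) -> Vec<u128> {
    (0..heights)
        .map(|h| {
            let start = Instant::now();
            f(h);
            start.elapsed().as_nanos()
        })
        .collect()
}

fn main() {
    // Stand-in for try_create_block: cost grows linearly with block height.
    let timings = time_calls(1000, |height| {
        let mut acc = 0u64;
        for i in 0..height * 10 {
            acc = acc.wrapping_add(i);
        }
        std::hint::black_box(acc);
    });
    let early: u128 = timings[..100].iter().sum();
    let late: u128 = timings[900..].iter().sum();
    // Per-call time at heights 900..1000 clearly exceeds heights 0..100.
    println!("early total = {early}ns, late total = {late}ns");
    assert!(late > early);
}
```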

dima74 commented 1 day ago

Update on single peer tps profiling.

WASM code for transaction validation takes >60% of the entire sumeragi thread:

(attached profiler screenshot)

Hot spots inside the WASM code:

mversic commented 21 hours ago

Is this the case even after #4995? If yes, then I can only suggest that we consider some faster serialization format. Since I think SCALE is already pretty fast, the only remaining option is zero-copy serialization.

Is it the act of decoding that is slow, or the act of copying bytes into linear memory? If it's the latter, then I don't see what we can do.

dima74 commented 17 hours ago

is this the case even after #4995?

Yes, this is 02479ce598854c35c106831e30f7223964703db3

Is it the act of decoding that is slow or the act of copying bytes into linear memory?

Decoding (methods from the curve25519 crate such as `square` and `pow2k`)