[BUG] [perf] Stability testing, out of memory exeption

timofeevmd commented 1 month ago

OS and Environment

linux, k8s

GIT commit hash

995da4ec

Minimum working example / Steps to reproduce

java-sdk commit: 9aba55183e

reproduce command: ./mvnw gatling:test -DtargetURL=https://iroha2.test.tachi.soramitsu.co.jp -DremoteLogin= -DremotePassword= -Dintensity=800000 -DrampDuration=3600 -DmaxDuration=3601

full performance report

container logs: OOM.log.zip

Under a load of 1100 TPS, the system is able to function for only 44 minutes whitout any arrors and rejected transaction before encountering issues. (graph 1). Following this period, a memory leak. (graph 2) .

graph 1

graph 2

Actual result

the product works stably

Expected result

after 45 minutes working catch "out if memory" exception

Logs

Log contents

Who can help to reproduce?

@timofeevmd

Notes

No response

Mingela commented 1 month ago

@timofeevmd What kind of transactions are being sent? In case there are new entities being registered (like asset definitions, accounts, domains) those consume memory naturally so the leak have to be justified.

Mingela commented 1 month ago

UPD; found out that's an asset transfer, though transactions (and blocks) itself consume memory so it's up to the team to analyze the consumption and conclude whether it's reasonable, thanks

dima74 commented 1 month ago

Based on my testing, size of a transaction with a single transfer asset instruction is ~1.8KB (CommittedTransaction struct). So for 1 million transactions iroha will consume ~1.8GB, for 3 million transactions ~5.6GB. This matches results in this issue.

I see two options how memory consumption can be improved:

Make kura drop old blocks from the memory. If we need such blocks later, we can load them from the disk (#4954)
Optimize CommittedTransaction struct size (more on this later)

dima74 commented 1 month ago

Analysis of CommittedTransaction struct memory layout.

Key results:

Minimal possible size of a transaction is 1436 bytes
Typical size of a transaction with a single transfer asset instruction is 2028 bytes

Possible optimizations:

CommittedTransaction::error takes 200 bytes but is None for a successful transaction - #5089
Use read-only Box<[InstructionBox]> instead of Vec<InstructionBox> in Executable? Minimal capacity of a Vec is 4 elements, so for a transaction with a single instruction we have 176*3=528 unused space for a buffer
Something with public keys similar to lazy-decoding in #5048? Single public key is 296 bytes, for a transfer asset transaction we have three accounts inside so this is total 888 bytes just for public keys
Use Arc<str> instead of Box<str> inside ChainId?

// self: 336
// heap: 1100 + instructions heap
struct CommittedTransaction {
    // self: 136
    // heap: 1100 + instructions heap
    value: SignedTransaction,
    // self: 200
    error: Option<error::TransactionRejectionReason>,
}

// self: 136
// heap: 1100 + instructions heap
struct SignedTransactionV1 {
    // self: 16
    // heap: 64
    signature: TransactionSignature,
    // self: 120
    // heap: 1036 + instructions heap
    payload: TransactionPayload,
}

// self: 120
// heap: 1036 + instructions heap
struct TransactionPayload {
    // self: 16, heap: 36 for default chain id
    chain: ChainId,
    // self: 24, heap: ~296
    authority: AccountId,
    // self: 8
    creation_time_ms: u64,
    // self: 32,
    // heap: 704 + instructions heap
    instructions: Executable,
    // self: 8
    time_to_live_ms: Option<NonZeroU64>,
    // self: 4
    nonce: Option<NonZeroU32>,
    // self: 24, heap: 0 if empty
    metadata: Metadata,
}

// self: 32
// heap: 176 * (instructions capacity) + instructions heap
//
// for a transaction with a single transfer instruction (4 is min Vec capacity):
// heap: 176 * 4 + ~592
enum Executable {
    Instructions(Vec<InstructionBox>),
    Wasm(WasmSmartContract),
}

// self: 176
enum InstructionBox {
    ...
}

// self: 96
// heap: ~592
struct Transfer<Asset, Numeric, Account> {
    // self: 56, heap: ~296
    source: AssetId,
    // self: 16
    object: Numeric,
    // self: 24, heap: ~296
    destination: AccountId,
}

// self: 56, heap: ~296
struct AssetId {
    // self: 24, heap: ~296
    account: AccountId,
    // self: 32, heap: 0 for domain and asset name <16
    definition: AssetDefinitionId,
}

// self: 24, heap: ~296 heap
struct AccountId {
    // self: 16, heap: 0 for domains with len <16
    domain: DomainId,
    // self: 8, heap: 296
    signatory: PublicKey,
}

// 296
pub enum PublicKeyInner {
    // 192
    Ed25519(ed25519::PublicKey),
    // 104
    Secp256k1(secp256k1::PublicKey),
    // 144
    BlsNormal(bls::BlsNormalPublicKey),
    // 288
    BlsSmall(bls::BlsSmallPublicKey),
}

DCNick3 commented 1 month ago

Use read-only Box<[InstructionBox]> instead of Vec in Executable? Minimal capacity of a Vec is 4 elements, so for a transaction with a single instruction we have 176*3=528 unused space for a buffer

Huh, I didn't know of such behavior. Notably, when doing vec![123], it allocates the capacity exactly, but with let mut v = Vec::new(); v.push(123); it does indeed overallocate the capacity to 4 to amortize future pushes.

(We also have a ConstVec<T> newtype around Box<[T]> to do the right thing with the scale encoding)

dima74 commented 1 month ago

Actually, we still have capacity=1 since we clone transactions in Sumeragi::try_create_block (this is a single peer case, not sure about multiple peers). Anyway would be nice to replace it with Box<[T]> to have explicit behaviour. Opened #5092

fn main() {
    let mut v = Vec::new();
    v.push(1);

    println!("{}", v.capacity());  // 4

    let v2 = v.clone();
    println!("{}", v2.capacity());  // 1
}

dima74 commented 4 weeks ago

With #5096 and #5103 merged, memory usage expected to be the following:

fixed ~200MB for last 128 blocks stored in kura in memory
unbounded ~260 bytes per transaction (State::transactions map)

So after 1mln transactions iroha should consume ~450MB, after 10mln transactions ~2.7GB.

State::transactions is map from HashOf<SignedTransaction> to NonZeroUsize, so entry size is 40 bytes (32+8). But this is BTreeMap, so some keys are stored multiple times in intermediate nodes. Also nodes store some data related to concread crate. This gives total ~260 bytes per transaction. Not sure how this can be optimized further.

hyperledger-iroha / iroha