NethermindEth / nethermind

A robust execution client for Ethereum node operators.
https://nethermind.io/nethermind-client
GNU General Public License v3.0

[Bug] [1.10.41] [Archive Sync] Utilizing High DB Memory Caching #2885

Closed: Texnomic closed this issue 4 months ago

Texnomic commented 3 years ago

I'm trying to find the sweet spot for peak bps performance by tuning the RocksDB memory cache and write buffer settings.

Targets:

  1. Lower the number of random disk I/Os
  2. Increase the number of sequential disk I/Os
  3. Decouple block processing speed from disk IOPS
  4. Use RAM for all hot DB records

Here is an example of my config file:

  "Db": {
    "BlockCacheSize": 4096000000,
    "ReceiptsDbBlockCacheSize": 4096000000,
    "BlocksDbBlockCacheSize": 4096000000,
    "HeadersDbBlockCacheSize": 4096000000,
    "BlockInfosDbBlockCacheSize": 4096000000,
    "PendingTxsDbBlockCacheSize": 4096000000,
    "CodeDbBlockCacheSize": 4096000000,
    "BloomDbBlockCacheSize": 4096000000,
    "WitnessDbBlockCacheSize": 4096000000,
    "CanonicalHashTrieDbBlockCacheSize": 4096000000,

    "WriteBufferSize": 2048000000,
    "ReceiptsDbWriteBufferSize": 2048000000,
    "BlocksDbWriteBufferSize": 2048000000,
    "HeadersDbWriteBufferSize": 2048000000,
    "BlockInfosDbWriteBufferSize": 2048000000,
    "PendingTxsDbWriteBufferSize": 2048000000,
    "CodeDbWriteBufferSize": 2048000000,
    "BloomDbWriteBufferSize": 2048000000,
    "WitnessDbWriteBufferSize": 2048000000,
    "CanonicalHashTrieDbWriteBufferSize": 2048000000,

    "WriteBufferNumber": 4,
    "ReceiptsDbWriteBufferNumber": 4,
    "BlocksDbWriteBufferNumber": 4,
    "HeadersDbWriteBufferNumber": 4,
    "BlockInfosDbWriteBufferNumber": 4,
    "PendingTxsDbWriteBufferNumber": 4,
    "CodeDbWriteBufferNumber": 4,
    "BloomDbWriteBufferNumber": 4,
    "WitnessDbWriteBufferNumber": 4,
    "CanonicalHashTrieDbWriteBufferNumber": 4,

    "CacheIndexAndFilterBlocks": true,
    "ReceiptsDbCacheIndexAndFilterBlocks": true,
    "BlocksDbCacheIndexAndFilterBlocks": true,
    "HeadersDbCacheIndexAndFilterBlocks": true,
    "BlockInfosDbCacheIndexAndFilterBlocks": true,
    "PendingTxsDbCacheIndexAndFilterBlocks": true,
    "CodeDbCacheIndexAndFilterBlocks": true,
    "BloomDbCacheIndexAndFilterBlocks": true,
    "WitnessDbCacheIndexAndFilterBlocks": true,
    "CanonicalHashTrieDbCacheIndexAndFilterBlocks": true
  },

I've already achieved a 5x increase in bps by replacing the default values, but I suspect we can go higher with a better understanding of these settings and how they interact.
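As a rough sanity check, the sizes in the config above can simply be added up. The sketch below is only arithmetic over those values. It assumes (this is not confirmed anywhere in the thread) that every database gets its own block cache and may hold up to `WriteBufferNumber` buffers of `WriteBufferSize` bytes each, so the result is a configured upper bound, not a measurement of real usage. The helper names `build_db_config` and `configured_upper_bound` are hypothetical.

```python
# Key prefixes taken from the config above; "" is the unprefixed default entry.
DB_PREFIXES = [
    "", "ReceiptsDb", "BlocksDb", "HeadersDb", "BlockInfosDb",
    "PendingTxsDb", "CodeDb", "BloomDb", "WitnessDb", "CanonicalHashTrieDb",
]


def build_db_config(block_cache, write_buffer, buffer_count):
    """Rebuild the size-related part of the "Db" section for every prefix."""
    cfg = {}
    for prefix in DB_PREFIXES:
        cfg[f"{prefix}BlockCacheSize"] = block_cache
        cfg[f"{prefix}WriteBufferSize"] = write_buffer
        cfg[f"{prefix}WriteBufferNumber"] = buffer_count
    return cfg


def configured_upper_bound(cfg):
    """Sum cache + (buffer size * buffer count) over all prefixes, in bytes."""
    total = 0
    for prefix in DB_PREFIXES:
        total += cfg[f"{prefix}BlockCacheSize"]
        total += cfg[f"{prefix}WriteBufferSize"] * cfg[f"{prefix}WriteBufferNumber"]
    return total


cfg = build_db_config(4096000000, 2048000000, 4)
total_bytes = configured_upper_bound(cfg)
print(total_bytes, "bytes =", total_bytes / 1e9, "GB")
```

Under those assumptions the config asks for far more memory than the machine is likely to have, which suggests the individual limits are either not all reached or not all applied.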

tkstanczak commented 3 years ago

We used to set these and leave them in the config; then we realized they were simply too hard to understand and replaced them with the MemoryHintMan.

More or less, when you use --Init.MemoryHint, all of these values are replaced with different values. The only settings that remain relevant (are not replaced) are these:

"CacheIndexAndFilterBlocks": true,
"ReceiptsDbCacheIndexAndFilterBlocks": true,
"BlocksDbCacheIndexAndFilterBlocks": true,
"HeadersDbCacheIndexAndFilterBlocks": true,
"BlockInfosDbCacheIndexAndFilterBlocks": true,
"PendingTxsDbCacheIndexAndFilterBlocks": true,
"CodeDbCacheIndexAndFilterBlocks": true,
"BloomDbCacheIndexAndFilterBlocks": true,
"WitnessDbCacheIndexAndFilterBlocks": true,
"CanonicalHashTrieDbCacheIndexAndFilterBlocks": true

These settings can indeed improve the speed of the node a lot, but they can also cause it to stop limiting memory in some circumstances.

It makes perfect sense to adjust them to your scenario.

Texnomic commented 3 years ago

Thank you @tkstanczak for your fast response.

I'm already using MemoryHint, set to 32 GB, but unfortunately the node was only using around 2 GB of memory overall.

After I started setting the other Db options, it now uses up to 7 GB, but that is still a far cry from my target.

Now I have one important question: is MemoryHintMan overriding my settings? If so, how do I disable it?

Texnomic commented 3 years ago

OK, I figured out how to disable MemoryHint, but I still can't exceed 7 GB of overall usage.

Also, my bps still looks bad (it was worse before, at 0.5 bps):

2021-03-12 00:56:34.4763|Processed 4327357 | 1,040ms, mgasps 38.84 total 30.34, tps 827.87 total 600.67, bps 2.88 total 2.94, recv queue 5252, proc queue 2000
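To compare runs with different settings, it can help to pull the numbers out of lines like the one above instead of reading them by eye. This is a minimal sketch; the pattern is inferred from this single log line only, so other Nethermind versions may format the line differently, and the function name `parse_processed_line` is hypothetical.

```python
import re

LINE = (
    "2021-03-12 00:56:34.4763|Processed 4327357 | 1,040ms, "
    "mgasps 38.84 total 30.34, tps 827.87 total 600.67, "
    "bps 2.88 total 2.94, recv queue 5252, proc queue 2000"
)

# Each metric appears as "<name> <current> total <running total>".
PATTERN = re.compile(
    r"Processed\s+(?P<block>\d+)\s*\|\s*(?P<ms>[\d,]+)ms,"
    r"\s*mgasps\s+(?P<mgasps>[\d.]+)\s+total\s+(?P<mgasps_total>[\d.]+),"
    r"\s*tps\s+(?P<tps>[\d.]+)\s+total\s+(?P<tps_total>[\d.]+),"
    r"\s*bps\s+(?P<bps>[\d.]+)\s+total\s+(?P<bps_total>[\d.]+)"
)


def parse_processed_line(line):
    """Return the metrics as a dict, or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    stats = {"block": int(match["block"]),
             "ms": int(match["ms"].replace(",", ""))}  # "1,040" -> 1040
    for key in ("mgasps", "mgasps_total", "tps", "tps_total", "bps", "bps_total"):
        stats[key] = float(match[key])
    return stats


stats = parse_processed_line(LINE)
print(stats)
```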

Note: I'm running on HPE Server with RAID-0 8x 1 TB Samsung EVO SATA SSDs.

Texnomic commented 3 years ago

@tkstanczak I can now confirm we have an issue with MemoryHint.

Setting MemoryHint to 25 GB with the following DB config results in a maximum of 0.5 bps.

  "Db": {
    "CacheIndexAndFilterBlocks": true,
    "ReceiptsDbCacheIndexAndFilterBlocks": true,
    "BlocksDbCacheIndexAndFilterBlocks": true,
    "HeadersDbCacheIndexAndFilterBlocks": true,
    "BlockInfosDbCacheIndexAndFilterBlocks": true,
    "PendingTxsDbCacheIndexAndFilterBlocks": true,
    "CodeDbCacheIndexAndFilterBlocks": true,
    "BloomDbCacheIndexAndFilterBlocks": true,
    "WitnessDbCacheIndexAndFilterBlocks": true,
    "CanonicalHashTrieDbCacheIndexAndFilterBlocks": true
  },

Removing MemoryHint from the config and relying on the following DB config instead increases performance to a maximum of 2.0 bps.

  "Db": {

    "BlockCacheSize": 25600000000,
    "ReceiptsDbBlockCacheSize": 25600000000,
    "BlocksDbBlockCacheSize": 25600000000,
    "HeadersDbBlockCacheSize": 25600000000,
    "BlockInfosDbBlockCacheSize": 25600000000,
    "PendingTxsDbBlockCacheSize": 25600000000,
    "CodeDbBlockCacheSize": 25600000000,
    "BloomDbBlockCacheSize": 25600000000,
    "WitnessDbBlockCacheSize": 25600000000,
    "CanonicalHashTrieDbBlockCacheSize": 25600000000,

    "CacheIndexAndFilterBlocks": true,
    "ReceiptsDbCacheIndexAndFilterBlocks": true,
    "BlocksDbCacheIndexAndFilterBlocks": true,
    "HeadersDbCacheIndexAndFilterBlocks": true,
    "BlockInfosDbCacheIndexAndFilterBlocks": true,
    "PendingTxsDbCacheIndexAndFilterBlocks": true,
    "CodeDbCacheIndexAndFilterBlocks": true,
    "BloomDbCacheIndexAndFilterBlocks": true,
    "WitnessDbCacheIndexAndFilterBlocks": true,
    "CanonicalHashTrieDbCacheIndexAndFilterBlocks": true
  },

That's a 4x performance increase, which can't be overstated.

tkstanczak commented 3 years ago

@theoanab -> would you review these suggestions from @Texnomic?

Texnomic commented 3 years ago

@theoanab & @tkstanczak It appears that all of the above Db options have been removed, and the following is what's left:

  "Db": {
    "CacheIndexAndFilterBlocks": true,
    "BlockCacheSize": 81920000000,
    "WriteBufferSize": 51920000000,
    "WriteBufferNumber": 1
  }

Kindly confirm if I'm correct.

McSim85 commented 2 years ago

@Texnomic We have the same issue. And thank you for sharing the details.

Do you know if these numbers add up? I mean, in your case, when you set

    "BlockCacheSize": 81920000000,
    "WriteBufferSize": 51920000000,

does that mean the process consumes 133840000000 bytes (about 133 GB) in total?
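The arithmetic behind that figure can be checked directly. This sketch only adds the configured limits, using the `WriteBufferNumber` of 1 from the config quoted earlier; it assumes the write buffer total is size times number, and it says nothing about what the process actually allocates, which the thread does not answer.

```python
block_cache_size = 81920000000   # "BlockCacheSize" from the config above
write_buffer_size = 51920000000  # "WriteBufferSize" from the config above
write_buffer_number = 1          # "WriteBufferNumber" from the config above

# Assumption: the worst case is the cache plus every write buffer being full.
total_bytes = block_cache_size + write_buffer_size * write_buffer_number

total_gb = total_bytes / 10**9   # decimal gigabytes
total_gib = total_bytes / 2**30  # binary gibibytes, as most OS tools report

print(total_bytes)
print(f"{total_gb:.2f} GB, {total_gib:.2f} GiB")
```

The GB versus GiB distinction matters when comparing the sum with what the operating system reports, since the two units differ by about 7% at this scale.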

asdacap commented 1 year ago

There is a problem with Nethermind's cache settings: all DBs use the same cache. That alone is not a problem, but the cache is probably too small considering the size of the DB. We also turn on CacheIndexAndFilterBlocks, meaning the blocks' index is stored within the same cache as well, which reduces memory at the expense of throughput. On goerli with the default memory hint, the cache is 64 MB. I'm not sure about mainnet.

Db.BlockCacheSize is the relevant config, so setting that to a very high value will probably help. The BlockCacheSize of the other DBs does not actually take effect, as the OptimizeForPointLookup block option is not applied. If you have a lot of RAM in the first place, you might not notice this problem. A larger WriteBufferSize will probably help with the blocks DB: blocks are large, so that DB probably triggers compaction all the time.
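For the client version discussed in this comment, the suggestion amounts to raising the single `Db.BlockCacheSize` value. A minimal sketch of patching that key in a JSON config follows; the function name `with_block_cache_size` and the 8 GiB value are purely illustrative assumptions, not recommended settings, and later comments in this thread indicate the advice may not apply to newer releases.

```python
import json


def with_block_cache_size(config_text, size_bytes):
    """Return the config JSON with Db.BlockCacheSize set to size_bytes."""
    config = json.loads(config_text)
    # Create the "Db" section if it is missing, then set the shared cache size.
    config.setdefault("Db", {})["BlockCacheSize"] = size_bytes
    return json.dumps(config, indent=2)


example = '{"Db": {"CacheIndexAndFilterBlocks": true}}'
updated = with_block_cache_size(example, 8 * 2**30)  # 8 GiB, illustrative only
print(updated)
```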

kamilchodola commented 4 months ago

Is this still valid given the recent improvements on master? Memory Hint will probably be removed, right?

@asdacap @LukaszRozmej @benaadams

asdacap commented 4 months ago

Way too many things have changed, and not just recently. This issue should no longer be valid. Halfpath does not show much difference between using the block cache and the OS cache. Write buffer size is no longer set by the memory hint. RLP cache is not in rocksdb not in .net.