Spycsh / hesse

a temporal graph analytics library based on Flink Stateful Functions

Exploring appropriate parallelism setting #10

Closed Spycsh closed 2 years ago

Spycsh commented 2 years ago

An appropriate parallelism setting needs to be investigated for better performance.

In the extreme scenario where the slot number is set to 64 and the parallelism to 1, state storage also becomes very slow, and the following warning is logged:

2022-06-20 12:22:27,349 WARN  org.apache.flink.contrib.streaming.state.RocksDBOperationUtils [] - RocksDBStateBackend performance will be poor because of the current Flink memory configuration! RocksDB will flush memtable constantly, causing high IO and CPU. Typically the easiest fix is to increase task manager managed memory size. If running locally, see the parameter taskmanager.memory.managed.size. Details: arenaBlockSize 8388608 > mutableLimit 6557095 (writeBufferSize = 67108864, arenaBlockSizeConfigured = 0, defaultArenaBlockSize = 8388608, writeBufferManagerCapacity = 7493823)
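
The numbers in the warning can be checked by hand. The sketch below is only an illustration and assumes two relations that are consistent with the logged values but are not stated in the log itself: the default arena block size is 1/8 of writeBufferSize, and the mutable limit is 7/8 of writeBufferManagerCapacity. The function names are hypothetical helpers, not Flink or RocksDB APIs.

```python
# Values copied from the warning in this issue.
WRITE_BUFFER_SIZE = 67108864
WRITE_BUFFER_MANAGER_CAPACITY = 7493823


def default_arena_block_size(write_buffer_size: int) -> int:
    # Assumption: the default arena block is 1/8 of the write buffer.
    return write_buffer_size // 8


def mutable_limit(capacity: int) -> int:
    # Assumption: memtables may use up to 7/8 of the write buffer manager capacity.
    return capacity * 7 // 8


def arena_block_fits(write_buffer_size: int, capacity: int) -> bool:
    # The warning is about the case where this check fails.
    return default_arena_block_size(write_buffer_size) <= mutable_limit(capacity)


def min_capacity_for(write_buffer_size: int) -> int:
    # Smallest capacity whose mutable limit covers the default arena block
    # (ceiling division of arena * 8 by 7).
    arena = default_arena_block_size(write_buffer_size)
    return -(-arena * 8 // 7)


if __name__ == "__main__":
    print("arena block size:", default_arena_block_size(WRITE_BUFFER_SIZE))
    print("mutable limit:", mutable_limit(WRITE_BUFFER_MANAGER_CAPACITY))
    print("fits:", arena_block_fits(WRITE_BUFFER_SIZE, WRITE_BUFFER_MANAGER_CAPACITY))
    print("minimum capacity needed:", min_capacity_for(WRITE_BUFFER_SIZE))
```

Under these assumptions the write buffer manager capacity would have to grow from about 7.1 MiB to roughly 9.1 MiB before the check passes, which fits the log's advice to raise taskmanager.memory.managed.size (or, presumably, to use fewer slots so that each slot gets a larger share).
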
Spycsh commented 2 years ago

Within a single TaskManager, taskmanager.numberOfTaskSlots and parallelism.default are set to 1, 2, 3, ... to compare the resulting performance.
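
A minimal sketch of how such a sweep could be generated, assuming that both keys take the same value n in each run (the comment does not say so explicitly). The helper names sweep_configs and to_flink_conf_lines are hypothetical; only the two configuration keys come from this issue.

```python
def sweep_configs(max_n: int) -> list[dict[str, int]]:
    """Return one set of overrides per run, for n = 1, 2, ..., max_n."""
    return [
        # Assumption: slots and default parallelism are raised together.
        {"taskmanager.numberOfTaskSlots": n, "parallelism.default": n}
        for n in range(1, max_n + 1)
    ]


def to_flink_conf_lines(overrides: dict[str, int]) -> str:
    """Render overrides as 'key: value' lines, the form used in flink-conf.yaml."""
    return "\n".join(f"{key}: {value}" for key, value in overrides.items())


if __name__ == "__main__":
    for run in sweep_configs(3):
        print(to_flink_conf_lines(run))
        print("---")
```

Each rendered block can be merged into the Flink configuration before a run, so that the only thing changing between runs is n.
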