Open lmatz opened 2 months ago
One example: https://buildkite.com/risingwave-test/sysbench/builds/755#018ef126-d6ee-41e0-a309-606a9c89119d
CN and FN FlameGraph under the artifacts
tab.
oltp-insert workload. no checks in MV executor (disabled index): https://github.com/risingwavelabs/sysbench/blob/master/src/lua/oltp_insert.lua#L50-L61
Frontend is using close to 7CPUs out of 8CPUs in total. Frontend CPU usage is much higher than Compute CPU usage.
Probably we optimize the CPU usage of Frontend.
Considering that the insert statement is the same, I suppose FN does not need to spend so much time on gen_batch_plan_by_statement
?
Need something like a plan cache? cc: @chenzl25
Considering that the insert statement is the same, I suppose FN does not need to spend so much time on
gen_batch_plan_by_statement
? Need something like a plan cache? cc: @chenzl25
Theoretically, yes. Plan cache can improve the performance in this scenario, but it has some shortcomings as well, i.e. it needs to normalize the SQL and parameterize it which would introduce an additional overhead, furthermore, from the optimizer view, during optimization we can't see the actual parameter anymore which would introduce a huge refactoring to RisingWave. With a plan cache, people might introduce a new optimization without considering how it will affect the optimization time anymore, so I think we'd better do not introduce plan cache.
We used to run sysbench on 3 8C16G CN: dashboard: http://metabase.risingwave-cloud.xyz/dashboard/278-sysbench-3cn-rw-qps example: https://buildkite.com/risingwave-test/sysbench/builds/698#018e061f-1719-4907-9c2c-15bddb895327
But lately we have been running sysbench on 1 8C16G CN: dashboard: http://metabase.risingwave-cloud.xyz/dashboard/1133-sysbench-1cn-rw-qps example: https://buildkite.com/risingwave-test/sysbench/builds/746#018ee38f-d9de-4f54-b698-7e7518a4a829
In Sysbench, we have two insert benchmarks:
There are feedback from users that the insert performance is not desirable. We want to improve it.
One observation is that 3CN(128 sysbench threads) vs 1CN(128 sysbench threads) vs 1CN (256 sysbench threads) do not differ from each other too much.
Some other common observations:
We enabled CPU profiling by default. The CPU flamegraph is generated and uploaded in buildkite pipelines, e.g. https://buildkite.com/risingwave-test/sysbench/builds/746#018ee38f-d9de-4f54-b698-7e7518a4a829 under
artifacts
tab.