jrheakv multi-group question

guotaisu commented 3 years ago

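1. How can a multi-group deployment make full use of the I/O of multiple disks? For example, deploying 10 groups mapped to 10 disks. The official Benchmark demo configures 24 regions in YAML, but I don't see dbPath and raftDataPath set separately for each group. Are the official Benchmark numbers based on single-disk I/O? Could you demonstrate, with the StoreEngineOptionsConfigured and RheaKVStoreOptionsConfigured configuration classes, how to set dbPath and raftDataPath per group, so that the groups are truly separated?

2. I load-tested the Benchmark demo directly on physical machines with a higher spec than the official ones and could not reproduce the official performance figures.

Threads: 500; read/write ratio: 9:1; valueSize: 16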

Main exception: the node state becomes abnormal and does not self-heal for a long time.

Client: BenchmarkClient

2020-09-04 15:54:09 [Bolt-default-executor-4-thread-7] ERROR FailoverClosureImpl:75 - [InvalidEpoch-Failover] status: Status[UNKNOWN<-1>: RPC failed with address: LF-HBASE-XUANWU-172-20-162-70.hadoop.jd.local:18091, response: BaseResponse{error=LEADER_NOT_AVAILABLE, regionId=7, regionEpoch=RegionEpoch{confVer=-1, version=-1}, value=null}], error: LEADER_NOT_AVAILABLE, 0 retries left.

2020-09-04 15:54:09 [Bolt-default-executor-4-thread-1] ERROR FailoverClosureImpl:75 - [InvalidEpoch-Failover] status: Status[UNKNOWN<-1>: RPC failed with address: LF-HBASE-XUANWU-172-20-162-70.hadoop.jd.local:18091, response: BaseResponse{error=NOT_LEADER, regionId=7, regionEpoch=RegionEpoch{confVer=-1, version=-1}, value=null}], error: NOT_LEADER, 0 retries left.

2020-09-04 15:54:09 [Bolt-default-executor-4-thread-7] ERROR MapFailoverFuture:77 - [InvalidEpoch-Failover] cause: com.alipay.sofa.jraft.rhea.errors.LeaderNotAvailableException: The leader is not available.

2020-09-04 19:09:04 [Bolt-default-executor-4-thread-102] ERROR BoolFailoverFuture:77 - [InvalidEpoch-Failover] cause: com.alipay.sofa.jraft.rhea.errors.NotLeaderException: This is not the correct leader.

Server: BenchmarkServer

2020-09-04 15:54:23 [rheakv-read-index-callback #117] ERROR DefaultRegionKVService:667 - Failed to handle: GetRequest{key=373A636C69656E74696E666F3A31323334353637, readOnlySafe=true} BaseRequest{regionId=7, regionEpoch=RegionEpoch{confVer=-1, version=-1}}, status: Status[EPERM<1008>: Invalid state for readIndex: STATE_ERROR.], error: LEADER_NOT_AVAILABLE.

2020-09-04 16:00:00 [rhea_benchmark-7/LF-172-20-162-70.hadoop.jd.local:18091-AppendEntriesThread0] WARN NodeImpl:1875 - Node <rhea_benchmark-7/LF-172-20-162-70.hadoop.jd.local:18091> is not in active state, currTerm=16.

That is all. Any guidance would be much appreciated, thank you.

fengjiachun commented 3 years ago

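The benchmark writes to a single disk, and all regions share one rocksdb instance. Other users have already measured 1.5x or more of those numbers in their own load tests. You need to give the specifics of your setup; we can't guess.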

guotaisu commented 3 years ago

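The benchmark writes to a single disk, and all regions share one rocksdb instance. Other users have already measured 1.5x or more of those numbers in their own load tests. You need to give the specifics of your setup; we can't guess.

As I described: is using multiple disks supported? My understanding is that only a multi-group deployment spread over multiple disks is multi-group, multi-I/O in the true sense.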

fengjiachun commented 3 years ago

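The benchmark writes to a single disk, and all regions share one rocksdb instance. Other users have already measured 1.5x or more of those numbers in their own load tests. You need to give the specifics of your setup; we can't guess.

As I described: is using multiple disks supported? My understanding is that only a multi-group deployment spread over multiple disks is multi-group, multi-I/O in the true sense.

The core purpose of multi group is to remove the bottleneck that only the leader of a single raft group can accept writes, and to allow horizontal scaling of capacity, among other things. Its point is not disk I/O. That said, it is indeed something we could consider optimizing; it just isn't very necessary right now.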
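To make the write-bottleneck point concrete, here is a minimal sketch of range-based routing, assuming each region owns a contiguous key range and has its own leader. The class `RegionRoutingSketch` and its methods are hypothetical and are not rheakv code; the region boundaries are made up for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: none of these types come from rheakv.
final class RegionRoutingSketch {
    // Maps each region's start key to its region id; the first region starts at "".
    private final TreeMap<String, Long> startKeyToRegion = new TreeMap<>();

    void addRegion(long regionId, String startKey) {
        startKeyToRegion.put(startKey, regionId);
    }

    // A key belongs to the region with the greatest start key that is <= key.
    long regionFor(String key) {
        Map.Entry<String, Long> entry = startKeyToRegion.floorEntry(key);
        if (entry == null) {
            throw new IllegalStateException("no region covers key: " + key);
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        RegionRoutingSketch table = new RegionRoutingSketch();
        table.addRegion(1, "");   // assumed range ["", "g")
        table.addRegion(2, "g");  // assumed range ["g", "t")
        table.addRegion(3, "t");  // assumed range ["t", +inf)
        for (String key : new String[] {"apple", "grape", "zebra"}) {
            System.out.println(key + " -> region " + table.regionFor(key));
        }
    }
}
```

Because each region is a separate raft group with its own leader, writes to keys in different ranges can be accepted by different leaders rather than queuing behind one.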

fengjiachun commented 3 years ago

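Judging from the logs, your scenario is not the same as the benchmark scenario. Start with the basics: validate against the benchmark scenario first, tune each parameter to its optimum, and then test your own scenario.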

guotaisu commented 3 years ago

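and to allow horizontal scaling of capacity, among other things

I have never quite understood what you mean by horizontal scaling here. Take the benchmark as an example: within the same cluster, the 3 servers are expanded to 5, 7, 9 and so on. How is capacity scaled horizontally in that case? My understanding is that this can only increase the replica count of each region, from 3 replicas to 5, 7, 9 and so on.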

killme2008 commented 3 years ago

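Regions need to split, and the leaders of the regions get spread across different machines. Going forward, though, it should be possible to set a different disk path per raft group. The underlying jraft-core supports this, and rheakv could consider supporting it.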
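A rough sketch of that idea, under stated assumptions: a region is modeled as a half-open key range, a split cuts it in two at a chosen key, and leaders are then assigned to stores round-robin. The round-robin placement is only a stand-in for whatever the cluster actually does, and every name here (`RegionSplitSketch`, `Region`, `split`, `spreadLeaders`) is hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: none of these types come from rheakv.
final class RegionSplitSketch {
    // Half-open key range [startKey, endKey); a null endKey means unbounded.
    static final class Region {
        final long id;
        final String startKey;
        final String endKey;

        Region(long id, String startKey, String endKey) {
            this.id = id;
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    // Splits parent at splitKey: the parent keeps the lower half, the new region gets the upper half.
    static List<Region> split(Region parent, String splitKey, long newRegionId) {
        boolean afterStart = splitKey.compareTo(parent.startKey) > 0;
        boolean beforeEnd = parent.endKey == null || splitKey.compareTo(parent.endKey) < 0;
        if (!afterStart || !beforeEnd) {
            throw new IllegalArgumentException("split key outside region range: " + splitKey);
        }
        List<Region> halves = new ArrayList<>();
        halves.add(new Region(parent.id, parent.startKey, splitKey));
        halves.add(new Region(newRegionId, splitKey, parent.endKey));
        return halves;
    }

    // Assumed placement policy: the i-th region's leader goes to store i % stores.size().
    static Map<Long, String> spreadLeaders(List<Region> regions, List<String> stores) {
        Map<Long, String> leaders = new LinkedHashMap<>();
        for (int i = 0; i < regions.size(); i++) {
            leaders.put(regions.get(i).id, stores.get(i % stores.size()));
        }
        return leaders;
    }

    public static void main(String[] args) {
        List<Region> regions = split(new Region(1, "", null), "m", 2);
        System.out.println(spreadLeaders(regions, List.of("store-a", "store-b", "store-c")));
    }
}
```

With more regions than a single leader can serve, adding stores gives the extra leaders somewhere to live, which is different from only raising the replica count of one region.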

fengjiachun commented 3 years ago

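If you want the raft log written to multiple disks, just configure RegionEngineOptions.raftDataPath separately for each region. rheakv's rocksdb-based storage engine currently does not support writing to multiple disks.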
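One way to plan such per-region paths is sketched below. It only computes a path string per region by rotating over a list of disk mount points; the mount points, the directory layout, and the `RaftDataPathPlanner` class are assumptions for illustration. Each resulting value is what would then be supplied as that region's RegionEngineOptions.raftDataPath, and how it is supplied (YAML or a configuration builder) is left out here.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: it does not call any rheakv API, it only builds path strings.
final class RaftDataPathPlanner {
    // Assigns regions to disks round-robin and returns regionId -> raft log directory.
    static Map<Long, String> plan(List<String> diskMounts, List<Long> regionIds) {
        if (diskMounts.isEmpty()) {
            throw new IllegalArgumentException("at least one disk mount is required");
        }
        Map<Long, String> paths = new LinkedHashMap<>();
        for (int i = 0; i < regionIds.size(); i++) {
            String mount = diskMounts.get(i % diskMounts.size());
            long regionId = regionIds.get(i);
            // Assumed layout: <mount>/rhea_raft/region_<id>
            paths.put(regionId, mount + "/rhea_raft/region_" + regionId);
        }
        return paths;
    }

    public static void main(String[] args) {
        Map<Long, String> paths = plan(List.of("/data1", "/data2"), List.of(1L, 2L, 3L));
        // Each entry is the value one would configure as that region's raftDataPath.
        paths.forEach((regionId, path) -> System.out.println(regionId + " -> " + path));
    }
}
```

This spreads only raft log I/O. As noted above, the rocksdb data itself still lives in one instance on one disk.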

killme2008 commented 3 years ago

@fengjiachun How about we expose the DBOptions::wal_dir option? That would allow the WAL and the SST files to be stored separately.

fengjiachun commented 3 years ago

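@fengjiachun How about we expose the DBOptions::wal_dir option? That would allow the WAL and the SST files to be stored separately.

It is already exposed; set it through com.alipay.sofa.jraft.util.StorageOptionsFactory#registerRocksDBOptions. But jraft can already be configured with different directories, while rheakv uses only one rocksdb instance and has the WAL turned off (jraft's snapshot + raft log is enough, so the WAL is not needed).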

killme2008 commented 3 years ago

Closing, since there has been no new feedback.

sofastack / sofa-jraft

jrheakv multi-group question #506