opencurve / curve

Curve is a sandbox project hosted by the Cloud Native Computing Foundation (CNCF). It is an open-source, cloud-native distributed storage system for block and shared file storage, designed to be high-performance and easy to operate.
https://opencurve.io
Apache License 2.0
2.33k stars, 522 forks

Single-node deployment: formatting /dev/nbd0 fails #20

Closed lwllvyb closed 4 years ago

lwllvyb commented 4 years ago

Describe the bug

  1. mkfs.ext4 /dev/nbd0 fails. The diagnostic output is as follows:

Cluster status:
Get status metric from 127.0.0.1:8081 fail
No snapshot-clone-server is active
snapshot-clone-server 127.0.0.1:5556 is offline
cluster is not healthy
total copysets: 100, unhealthy copysets: 0, unhealthy_ratio: 0%
physical pool number: 1, logical pool number: 1
total space = 122021132GB, logical used = 24GB(0.00%, can be recycled = 0GB(0.00%)), physical used = 1GB(0.00%)

Client status:
nebd-server: version-0.1.0: 1

MDS status:
version: 0.1.0
current MDS: 127.0.0.1:6666
online mds list: 127.0.0.1:6666
offline mds list:

Etcd status:
version: 3.4.0
current etcd: 127.0.0.1:2379
online etcd list: 127.0.0.1:2379
offline etcd list:

SnapshotCloneServer status:
no version found!
GetAndCheckSnapshotCloneVersion fail
Get status metric from 127.0.0.1:8081 fail
current snapshot-clone-server:
online snapshot-clone-server list:
offline snapshot-clone-server list: 127.0.0.1:5556

ChunkServer status:
version: 0.1.0
chunkserver: total num = 3, online = 3, offline = 0(recoveringout = 0, chunkserverlist: [])
left size: min = 20278169GB, max = 56282713GB, average = 40673710.33GB, range = 36004544GB, variance = 227510007645070.22

id     image                device
18042  cbd:pool//testcurve  /dev/nbd0

Welcome to fdisk (util-linux 2.27.1). Changes will remain in memory only, until you decide to write them. Be careful before using the write command.

fdisk: cannot open /dev/nbd0: Input/output error

NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda      8:0    0  10G  0 disk
└─sda1   8:1    0  10G  0 part /
sdb      8:16   0  10M  0 disk
nbd0    43:0    0  10G  0 disk
root@ubuntu-xenial:/home/vagrant#

  1. Ubuntu 16.04, deployed following the single-node deployment guide. logicalpools/name was changed to 2. curve branch: master

ChunkServer logs: server.tar.gz. /data/ logs: data.tar.gz

wu-hanqing commented 4 years ago

The logs show that an I/O request failed, which caused the I/O error. Curve requires the offset and size of every I/O request to be 4K-aligned.

E 2020-07-21T07:21:23.570824+0000  1209 file_entity.cpp:171] AioRead file failed. fd: 1, fileName: cbd:pool//test_curve_, context: [type: READ, offset: 1024, size: 1024, ret: -1]
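To make the alignment rule concrete, here is a minimal sketch of a 4K alignment check applied to the request in the log line above. The function name `is_aligned` and the constant `ALIGNMENT` are hypothetical and only illustrate the rule; this is not Curve's actual validation code.

```python
# Illustrative only: a hypothetical helper, not taken from the Curve source.
ALIGNMENT = 4096  # 4K alignment, as described in the comment above


def is_aligned(offset: int, size: int, alignment: int = ALIGNMENT) -> bool:
    """Return True when both offset and size are multiples of `alignment`."""
    return offset % alignment == 0 and size % alignment == 0


# The failing request from the log: offset 1024, size 1024.
failing_request_ok = is_aligned(1024, 1024)

# A request that satisfies the rule: offset 4096, size 8192.
aligned_request_ok = is_aligned(4096, 8192)
```

With a 1024-byte offset and size, the check fails, which is consistent with the read being rejected and surfacing as an I/O error on /dev/nbd0.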

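Check the values of these two files: /sys/devices/virtual/block/nbd0/queue/logical_block_size and /sys/devices/virtual/block/nbd0/queue/physical_block_size. If they are not 4096, try changing them to 4096. If the change takes effect, reformat the device and test again. If they cannot be changed, try upgrading the kernel; our internal testing uses kernel version 4.9.65.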
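The check described above can be sketched in a few lines of Python. The helper names `read_block_sizes` and `mismatched_settings` are hypothetical, and the `sysfs_root` parameter exists only so the sketch can be pointed at a different directory. It reads the two sysfs files and reports which of them differ from 4096; it does not attempt to change them.

```python
from pathlib import Path

# The two queue attributes named in the comment above.
SETTINGS = ("logical_block_size", "physical_block_size")


def read_block_sizes(device: str = "nbd0",
                     sysfs_root: str = "/sys/devices/virtual/block") -> dict:
    """Read the logical and physical block sizes of a virtual block device."""
    queue_dir = Path(sysfs_root) / device / "queue"
    return {name: int((queue_dir / name).read_text().strip())
            for name in SETTINGS}


def mismatched_settings(sizes: dict, expected: int = 4096) -> list:
    """Return the names of the settings whose value is not `expected`."""
    return [name for name, value in sizes.items() if value != expected]
```

For example, `mismatched_settings(read_block_sizes("nbd0"))` returns an empty list when both values are already 4096, and otherwise names the settings that still need attention.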

lwllvyb commented 4 years ago

Upgrading the kernel solved the problem. I think the logging could be improved, since the log messages are not very clear.