opencurve / curve

Curve is a sandbox project hosted by the CNCF. It is an open-source, cloud-native distributed storage system for block and shared file storage, designed to be high-performance and easy to operate.
https://opencurve.io
Apache License 2.0
2.32k stars, 521 forks

curvefs cannot create fs when cluster degraded #2964

Open h0hmj opened 9 months ago

h0hmj commented 9 months ago

Describe the bug

Creating a filesystem fails when one zone is down.

To Reproduce

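  1. Use curveadm to deploy a new cluster with 3 etcd, 3 mds, and 3*3 metaserver services (3 metaservers on each of 3 nodes).
  2. Stop one node that hosts 3 metaservers and wait for their status to switch to offline.
  3. Mount the filesystem, which triggers createfs.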

create fs failed, errorcode= 22, error name: CREATE_PARTITION_ERROR
CREATEFS FAILED

mds log:

W 2023-12-13T16:28:09.440230+0800 33 topology.cpp:1455] can not find available metaserver for copyset, poolId = 1 need replica num = 3, but only has available zone num = 2
W 2023-12-13T16:28:09.440254+0800 33 topology.cpp:1538] Initial Generate copyset addr for pool 1 fail, statusCode = TOPO_METASERVER_NOT_FOUND
I 2023-12-13T16:28:09.440264+0800 33 topology.cpp:1554] GenSubsequentCopysetAddrBatch needCreateNum = 12, copysetList size = 0 begin
W 2023-12-13T16:28:09.440301+0800 33 topology.cpp:1455] can not find available metaserver for copyset, poolId = 1 need replica num = 3, but only has available zone num = 2
W 2023-12-13T16:28:09.440307+0800 33 topology.cpp:1597] Generate 12 copyset addr for pool 1fail, statusCode = TOPO_METASERVER_NOT_FOUND
E 2023-12-13T16:28:09.440313+0800 33 topology.cpp:1606] can not find available metaserver for copyset.
E 2023-12-13T16:28:09.440335+0800 33 topology_manager.cpp:847] create copyset generate copyset addr fail, createNum = 12
E 2023-12-13T16:28:09.440346+0800 33 topology_manager.cpp:713] Create copyset failed when create partition.
E 2023-12-13T16:28:09.440400+0800 33 fs_manager.cpp:459] CreateFs fail, create partition fail, fsId = 1
E 2023-12-13T16:28:09.440428+0800 33 fs_manager.cpp:479] CreateFs fail, insert root inode fail, fsName = test, ret = CREATE_PARTITION_ERROR
E 2023-12-13T16:28:09.445354+0800 33 mds_service.cpp:89] CreateFs fail, fsName = test, blockSize = 1048576, s3Info.bucketname = local, enableSumInDir = 0, owner = anonymous, capacity = 0, errCode = CREATE_PARTITION_ERROR
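The warning from topology.cpp suggests that copyset placement needs one available zone per replica, so with replica num = 3 and only 2 zones online no copyset can be created. The following is a minimal Python sketch of that kind of check, not Curve's actual implementation. The names `MetaServer`, `choose_copyset_members`, and `build_cluster` are hypothetical, and it assumes each of the 3 nodes in the reproduction forms its own zone.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class MetaServer:
    # Hypothetical, simplified record; not Curve's real data structure.
    id: int
    zone: int
    online: bool


def choose_copyset_members(
    servers: List[MetaServer], replica_num: int
) -> Optional[List[int]]:
    """Pick one online metaserver from each of replica_num distinct zones.

    Returns None when fewer than replica_num zones have an online
    metaserver, which mirrors the situation reported in the mds log.
    """
    by_zone: Dict[int, List[MetaServer]] = {}
    for server in servers:
        if server.online:
            by_zone.setdefault(server.zone, []).append(server)

    if len(by_zone) < replica_num:
        return None  # not enough available zones for the requested replicas

    chosen_zones = sorted(by_zone)[:replica_num]
    return [by_zone[zone][0].id for zone in chosen_zones]


def build_cluster(offline_zone: Optional[int] = None) -> List[MetaServer]:
    """Model 3 zones with 3 metaservers each; ids are made up for illustration."""
    return [
        MetaServer(id=zone * 3 + i + 1, zone=zone, online=(zone != offline_zone))
        for zone in range(3)
        for i in range(3)
    ]


if __name__ == "__main__":
    print(choose_copyset_members(build_cluster(), replica_num=3))
    print(choose_copyset_members(build_cluster(offline_zone=2), replica_num=3))
```

Under this model the healthy cluster yields a placement, while the cluster with one zone offline yields none, even though six metaservers are still online. Whether a degraded cluster should accept a reduced placement is the design question this issue raises.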

Expected behavior

The filesystem is created successfully even when one zone is offline.