basho / riak

Riak is a decentralized datastore from Basho Technologies.
http://docs.basho.com
Apache License 2.0

Riak Bitcask primary partition: Failed to merge #1119

Closed Konstantin74R closed 2 years ago

Konstantin74R commented 2 years ago

Hi all! I'm running Riak 2.1.1 with the multi backend and n_val = 3. The following event is often written to /var/log/riak/crash.log for a primary partition. Please help. What can I do?
Failed to merge {[
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2689.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2681.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2943.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2940.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2938.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2936.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2932.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2930.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2928.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2926.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2924.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2922.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2920.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2918.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2916.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2914.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2912.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2910.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2908.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2906.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2904.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2900.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2898.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2890.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2886.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2882.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2880.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2878.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2876.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2872.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2870.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2868.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2866.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2864.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2814.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2812.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2810.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2808.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2806.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2802.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/446.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2896.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2894.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2892.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2888.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2691.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2884.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2695.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2693.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2679.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2677.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2874.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2934.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2609.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2585.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2902.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2798.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2790.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2448.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2446.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2707.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2721.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2719.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2717.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2715.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2713.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2711.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2709.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2699.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2697.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/3.bitcask.data",
"/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2.bitcask.data"],[]}:
{generic_failure,error,{try_clause,{error,eio}},
[{bitcask,merge_files,1,[{file,"src/bitcask.erl"},{line,1358}]},
{bitcask,merge1,4,[{file,"src/bitcask.erl"},{line,701}]},
{bitcask,merge,3,[{file,"src/bitcask.erl"},{line,580}]},
{bitcask_merge_worker,do_merge,1,[{file,"src/bitcask_merge_worker.erl"},{line,195}]}]}

martinsumner commented 2 years ago

During a merge it tried to open a file and the Operating System returned the POSIX error EIO.

EIO isn't very informative: it just means an input/output error (https://www.man7.org/linux/man-pages/man3/errno.3.html), which seems quite generic.

So you'll need to look for reasons why your filesystem returned an error at this point. I would expect the common issues (e.g. reaching the open files limit, running out of disk space, permissions) to give a more helpful error. Perhaps there was a hardware issue?
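One way to narrow this down is to read every data file in the affected partition directory outside of Riak and see which one the operating system refuses to return. The following is only a minimal sketch, not a Bitcask tool: the function name `find_unreadable_files` and its parameters are made up for illustration, and the `*.bitcask.data` pattern is taken from the file names in the crash log.

```python
import errno
import glob
import os


def find_unreadable_files(directory, pattern="*.bitcask.data", chunk_size=1 << 20):
    """Read each matching file end to end and collect the ones that fail.

    Returns a list of (path, errno_name) tuples, e.g. (".../2689.bitcask.data", "EIO").
    This helper is hypothetical; it only uses plain file reads.
    """
    failures = []
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        try:
            with open(path, "rb") as handle:
                # Read in chunks so large data files are not loaded into memory at once.
                while handle.read(chunk_size):
                    pass
        except OSError as exc:
            # Translate the numeric errno (5 for EIO on Linux) into its symbolic name.
            name = errno.errorcode.get(exc.errno, str(exc.errno))
            failures.append((path, name))
    return failures
```

Calling `find_unreadable_files("/u01/riak/bitcask/59944403093650315004448010716878795728075358208")` on the affected node would list any file that cannot be read in full. A file reported with `EIO` here points at the storage layer rather than at Riak.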

Konstantin74R commented 2 years ago

Thanks, I'm looking for the cause. How can I delete the primary partition from Bitcask and repair it from LevelDB or from a secondary partition on another node?

martinsumner commented 2 years ago

https://docs.riak.com/riak/kv/2.2.3/using/repair-recovery/repairs.1.html#repairing-partitions
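The repair procedure on that page works per partition, so the partition ID is needed first. It is the long directory name in the paths from the crash log. The sketch below pulls it out of a path and checks that it sits on a ring boundary. The helper names are hypothetical, the regular expression assumes the data directory is called `bitcask` as in the log, and the ring size is an assumption to be replaced with the cluster's actual value (512 is used in the usage note only because the ID from the log is an exact multiple of 2^160 / 512).

```python
import re

# Riak's ring covers a 160-bit hash space, split evenly between partitions.
KEYSPACE = 2 ** 160


def partition_from_path(path):
    """Extract the partition ID from a Bitcask data file path (hypothetical helper)."""
    match = re.search(r"/bitcask/(\d+)/", path)
    if match is None:
        raise ValueError(f"no partition directory found in {path!r}")
    return int(match.group(1))


def ring_slot(partition_id, ring_size):
    """Return the position of partition_id on a ring of ring_size partitions.

    Raises ValueError if the ID is not a partition boundary for that ring size.
    """
    step, remainder = divmod(KEYSPACE, ring_size)
    if remainder or partition_id % step:
        raise ValueError("partition ID does not match this ring size")
    return partition_id // step
```

For the path `/u01/riak/bitcask/59944403093650315004448010716878795728075358208/2689.bitcask.data`, `partition_from_path` returns the ID as an integer, and `ring_slot(partition_id, 512)` returns 21. A `ValueError` would suggest that the assumed ring size is wrong.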

Konstantin74R commented 2 years ago

Thanks, I found the first problem, but my issue is not fully resolved yet. Output of dmesg:
[ 946.650194] sd 0:0:7:0: [sdg] tag#4 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=1s
[ 946.650229] sd 0:0:7:0: [sdg] tag#4 Sense Key : Medium Error [current]
[ 946.650247] sd 0:0:7:0: [sdg] tag#4 Add. Sense: Unrecovered read error
[ 946.650266] sd 0:0:7:0: [sdg] tag#4 CDB: Read(10) 28 00 41 84 c7 00 00 01 00 00
[ 946.650287] blk_update_request: critical medium error, dev sdg, sector 1099220790
[ 948.404314] sd 0:0:7:0: [sdg] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=1s
[ 948.404338] sd 0:0:7:0: [sdg] tag#0 Sense Key : Medium Error [current]
[ 948.404355] sd 0:0:7:0: [sdg] tag#0 Add. Sense: Unrecovered read error
[ 948.404372] sd 0:0:7:0: [sdg] tag#0 CDB: Read(10) 28 00 41 84 c7 30 00 00 08 00
[ 948.404390] blk_update_request: critical medium error, dev sdg, sector 1099220790
[ 950.084214] sd 0:0:7:0: [sdg] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=1s
[ 950.084241] sd 0:0:7:0: [sdg] tag#0 Sense Key : Medium Error [current]
[ 950.084431] sd 0:0:7:0: [sdg] tag#0 Add. Sense: Unrecovered read error
[ 950.084622] sd 0:0:7:0: [sdg] tag#0 CDB: Read(10) 28 00 41 84 c7 30 00 00 08 00
[ 950.084977] blk_update_request: critical medium error, dev sdg, sector 1099220790
[ 1147.227892] sd 0:0:7:0: [sdg] tag#1 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=1s
[ 1147.228250] sd 0:0:7:0: [sdg] tag#1 Sense Key : Medium Error [current]
[ 1147.228438] sd 0:0:7:0: [sdg] tag#1 Add. Sense: Unrecovered read error
[ 1147.228622] sd 0:0:7:0: [sdg] tag#1 CDB: Read(10) 28 00 41 84 c7 30 00 00 08 00
[ 1147.228976] blk_update_request: critical medium error, dev sdg, sector 1099220790
[ 1148.975492] sd 0:0:7:0: [sdg] tag#11 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=1s
[ 1148.975860] sd 0:0:7:0: [sdg] tag#11 Sense Key : Medium Error [current]
[ 1148.976044] sd 0:0:7:0: [sdg] tag#11 Add. Sense: Unrecovered read error
[ 1148.976234] sd 0:0:7:0: [sdg] tag#11 CDB: Read(10) 28 00 41 84 c7 30 00 00 08 00
[ 1148.976593] blk_update_request: critical medium error, dev sdg, sector 1099220790
[ 1333.360597] sd 0:0:7:0: [sdg] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=1s
[ 1333.360959] sd 0:0:7:0: [sdg] tag#0 Sense Key : Medium Error [current]
[ 1333.361150] sd 0:0:7:0: [sdg] tag#0 Add. Sense: Unrecovered read error
[ 1333.361338] sd 0:0:7:0: [sdg] tag#0 CDB: Read(10) 28 00 41 84 c7 30 00 00 08 00
[ 1333.361695] blk_update_request: critical medium error, dev sdg, sector 1099220790
[ 1335.091095] sd 0:0:7:0: [sdg] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=1s
[ 1335.091457] sd 0:0:7:0: [sdg] tag#0 Sense Key : Medium Error [current]
[ 1335.091644] sd 0:0:7:0: [sdg] tag#0 Add. Sense: Unrecovered read error
[ 1335.091833] sd 0:0:7:0: [sdg] tag#0 CDB: Read(10) 28 00 41 84 c7 30 00 00 08 00
[ 1335.092193] blk_update_request: critical medium error, dev sdg, sector 1099220790
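For anyone reading a similar kernel log, the entries can be cross-checked mechanically. The sketch below collects the failing device and sector from the `critical medium error` lines and decodes the READ(10) command bytes, where bytes 2 to 5 hold the starting logical block address and bytes 7 to 8 the transfer length, both big-endian. The function names are made up for illustration, and comparing the decoded block range with the kernel's sector number assumes the disk uses 512-byte logical blocks.

```python
import re

# Matches lines such as:
#   blk_update_request: critical medium error, dev sdg, sector 1099220790
MEDIUM_ERROR = re.compile(r"critical medium error, dev (\w+), sector (\d+)")


def failing_sectors(log_text):
    """Return the set of (device, sector) pairs reported as medium errors."""
    return {(dev, int(sector)) for dev, sector in MEDIUM_ERROR.findall(log_text)}


def decode_read10(cdb_hex):
    """Decode a SCSI READ(10) CDB given as hex bytes; return (start_lba, block_count)."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) command descriptor block")
    start_lba = int.from_bytes(cdb[2:6], "big")    # bytes 2..5: logical block address
    block_count = int.from_bytes(cdb[7:9], "big")  # bytes 7..8: transfer length
    return start_lba, block_count


def read_covers_sector(cdb_hex, sector):
    """True if the sector lies inside the block range requested by the CDB.

    Assumes 512-byte logical blocks, so that LBAs and kernel sectors line up.
    """
    start_lba, block_count = decode_read10(cdb_hex)
    return start_lba <= sector < start_lba + block_count
```

With the values from the log, `decode_read10("28 00 41 84 c7 30 00 00 08 00")` gives a start block of 1099220784 and a length of 8 blocks, and sector 1099220790 falls inside that range. Every retry fails on the same sector of `sdg`, which is consistent with a single bad spot on the disk.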

Konstantin74R commented 2 years ago

My problem was resolved by migrating to a new, stable server, because the disk on the old server is broken.