Open SourceR85 opened 1 month ago
enospc from no match of right hand value {error,enospc} indicates we're probably running out of disk space [1]
It should be a more friendly message in the log, but at least at first sight that's what's jumping out.
enospc from no match of right hand value {error,enospc} indicates we're probably running out of disk space
~That's not a problem...~ I have 799.7 GB of 2TB free (the DB I replicate is 86.1GB)
Is there any chance the view directory is configured to write to another disk, or that the disks failed to mount so it ends up writing to the root file system? enospc is usually a transparent passthrough error from the FS layer.
The first instance in the logs seems to come from writing an attachment:
gen,do_call,4,[{file,"gen.erl"},{line,237}]},{gen_server,call,3,[{file,"gen_server.erl"},{line,381}]},
{couch_att,write_streamed_attachment,3,
Is there a way to reconfigure the data directory or point it to another volume? Or test whether you can write to it manually? Verify that the data directory is indeed pointing to the mounted large volume; sometimes misconfigurations happen, and I've seen writes going to a directory other than the intended one.
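One way to run that check is a small script executed where CouchDB runs (for example inside the container). This is only a sketch: the helper name `describe_dir` is hypothetical and not part of CouchDB, and the path you pass should be whatever data directory is actually configured. It reports free space for the directory and whether it sits on the same device as the root file system.

```python
import os
import shutil
import sys


def describe_dir(path):
    """Report free space for `path` and whether it shares a device with `/`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    # Equal device ids mean the directory lives on the root file system,
    # which is what you would see if the large volume failed to mount.
    on_root = os.stat(path).st_dev == os.stat("/").st_dev
    return {
        "path": os.path.abspath(path),
        "total_gb": usage.total / 1e9,
        "free_gb": usage.free / 1e9,
        "on_root_filesystem": on_root,
    }


if __name__ == "__main__":
    # Pass the data directory as the first argument; "." is only a fallback.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for key, value in describe_dir(target).items():
        print(f"{key}: {value}")
```

If the reported total is far below the size of the volume you expected, the directory is probably not where you think it is.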
As you expected: the Docker volume got stuck...
Can't write content into data (just touch file works)
This is my docker deployment (secrets removed) couchdb.tar.gz There's nothing fancy in it, as far as I can say...
Can't write content into data (just touch file works)
That would explain it, I think. Good find. It's sneaky that touch works though.
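A likely explanation (not confirmed in this thread) is that touch only creates an empty file, which needs a directory entry but no data blocks, while writing content needs actual space. The sketch below separates the two cases; the function name `probe_writes` and the probe file name are made up for illustration.

```python
import errno
import os


def probe_writes(directory, size_bytes=1024 * 1024):
    """Try an empty-file create (like touch), then a real data write.

    Returns (touch_ok, write_ok). ENOSPC is reported as False; any other
    OSError (for example a permission problem) is re-raised.
    """
    probe = os.path.join(directory, ".write_probe")
    touch_ok = write_ok = False
    try:
        # Like touch: creates the file but writes no data.
        with open(probe, "wb"):
            pass
        touch_ok = True
        # A real write needs free space; fsync pushes it to the device.
        with open(probe, "wb") as f:
            f.write(b"\0" * size_bytes)
            f.flush()
            os.fsync(f.fileno())
        write_ok = True
    except OSError as exc:
        if exc.errno != errno.ENOSPC:
            raise
    finally:
        if os.path.exists(probe):
            os.remove(probe)
    return touch_ok, write_ok
```

A result of `(True, False)` would match the symptom described here: the file can be created, but content cannot be written.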
Just out of curiosity, I stopped the container, removed and recreated couchdb-data, and started the replication again: same result...
[notice] 2024-09-30T16:14:19.553744Z nonode@nohost <0.14636.101> -------- Retrying POST request to http://localhost:5984/hzd/_bulk_docs in 4.0 seconds due to error {code,500}
[error] 2024-09-30T16:14:19.574327Z nonode@nohost <0.16657.101> d5dfe20e02 rexi_server: from: nonode@nohost(<0.19120.101>) mfa: fabric_rpc:update_docs/3 exit:{{badmatch,{error,enospc}},[{couch_bt_engine,write_doc_body,2,[{file,"src/couch_bt_engine.erl"},{line,439}]},{couch_db_updater,'-flush_trees/3-fun-0-',6,[{file,"src/couch_db_updater.erl"},{line,384}]},{couch_key_tree,mapfold_simple,4,[{file,"src/couch_key_tree.erl"},{line,464}]},{couch_key_tree,mapfold_simple,4,[{file,"src/couch_key_tree.erl"},{line,473}]},{couch_key_tree,mapfold,3,[{file,"src/couch_key_tree.erl"},{line,457}]},{couch_db_updater,flush_trees,3,[{file,"src/couch_db_updater.erl"},{line,373}]},{couch_db_updater,update_docs_int,4,[{file,"src/couch_db_updater.erl"},{line,718}]},{couch_db_updater,handle_info,2,[{file,"src/couch_db_updater.erl"},{line,183}]}]} [{couch_db,collect_results,3,[{file,"src/couch_db.erl"},{line,1457}]},{couch_db,collect_results_with_metrics,3,[{file,"src/couch_db.erl"},{line,1439}]},{couch_db,write_and_commit,4,[{file,"src/couch_db.erl"},{line,1471}]},{couch_db,update_docs,4,[{file,"src/couch_db.erl"},{line,1333}]},{fabric_rpc,with_db,3,[{file,"src/fabric_rpc.erl"},{line,360}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,141}]}]
[info] 2024-09-30T16:14:19.574423Z nonode@nohost <0.243.0> -------- db shards/e0000000-ffffffff/hzd.1727710380 died with reason {{badmatch,{error,enospc}},[{couch_bt_engine,write_doc_body,2,[{file,"src/couch_bt_engine.erl"},{line,439}]},{couch_db_updater,'-flush_trees/3-fun-0-',6,[{file,"src/couch_db_updater.erl"},{line,384}]},{couch_key_tree,mapfold_simple,4,[{file,"src/couch_key_tree.erl"},{line,464}]},{couch_key_tree,mapfold_simple,4,[{file,"src/couch_key_tree.erl"},{line,473}]},{couch_key_tree,mapfold,3,[{file,"src/couch_key_tree.erl"},{line,457}]},{couch_db_updater,flush_trees,3,[{file,"src/couch_db_updater.erl"},{line,373}]},{couch_db_updater,update_docs_int,4,[{file,"src/couch_db_updater.erl"},{line,718}]},{couch_db_updater,handle_info,2,[{file,"src/couch_db_updater.erl"},{line,183}]}]}
[error] 2024-09-30T16:14:19.574887Z nonode@nohost <0.18010.101> -------- gen_server <0.18010.101> terminated with reason: no match of right hand value {error,enospc} at couch_bt_engine:write_doc_body/2(line:439) <= couch_db_updater:'-flush_trees/3-fun-0-'/6(line:384) <= couch_key_tree:mapfold_simple/4(line:464) <= couch_key_tree:mapfold_simple/4(line:473) <= couch_key_tree:mapfold/3(line:457) <= couch_db_updater:flush_trees/3(line:373) <= couch_db_updater:update_docs_int/4(line:718) <= couch_db_updater:handle_info/2(line:183)
last msg: redacted
state: {db,1,<<"shards/e0000000-ffffffff/hzd.1727710380">>,"./data/shards/e0000000-ffffffff/hzd.1727710380.couch",{couch_bt_engine,{st,"./data/shards/e0000000-ffffffff/hzd.1727710380.couch",<0.19406.101>,#Ref<0.3603940510.502005771.203208>,undefined,{db_header,8,30406,0,{9450247660,{29670,687,{size_info,9279630171,9278136634}},12600491},{9450249167,30357,11927090},{9448039553,[],2388},nil,nil,4251,1000,<<"2719778795232e78e860e5e8ab70c794">>,[{nonode@nohost,0}],0,1000,0},false,{btree,<0.19406.101>,{9450247660,{29670,687,{size_info,9279630171,9278136634}},12600491},fun couch_bt_engine:id_tree_split/1,fun couch_bt_engine:id_tree_join/2,undefined,fun couch_bt_engine:id_tree_reduce/2,snappy},{btree,<0.19406.101>,{9450249167,30357,11927090},fun couch_bt_engine:seq_tree_split/1,fun couch_bt_engine:seq_tree_join/2,undefined,fun couch_bt_engine:seq_tree_reduce/2,snappy},{btree,<0.19406.101>,{9448039553,[],2388},fun couch_bt_engine:local_tree_split/1,fun couch_bt_engine:local_tree_join/2,undefined,nil,snappy},snappy,{btree,<0.19406.101>,nil,fun couch_bt_engine:purge_tree_split/1,fun couch_bt_engine:purge_tree_join/2,undefined,fun couch_bt_engine:purge_tree_reduce/2,snappy},{btree,<0.19406.101>,nil,fun couch_bt_engine:purge_seq_tree_split/1,fun couch_bt_engine:purge_seq_tree_join/2,undefined,fun couch_bt_engine:purge_tree_reduce/2,snappy}}},<0.18010.101>,nil,30406,<<"1727712856444764">>,{user_ctx,null,[],undefined},[{<<"members">>,{[{<<"roles">>,[<<"_admin">>]}]}},{<<"admins">>,{[{<<"roles">>,[<<"_admin">>]}]}}],[#Fun<couch_doc.7.91987333>],nil,nil,undefined,[{default_security_object,[{<<"members">>,{[{<<"roles">>,[<<"_admin">>]}]}},{<<"admins">>,{[{<<"roles">>,[<<"_admin">>]}]}}]},replicated_changes,{user_ctx,{user_ctx,<<"groot">>,[<<"_admin">>],<<"cookie">>}},{w,"1"},{props,[{partitioned,true},{hash,[couch_partition,hash,[]]}]}],undefined}
extra: []
[notice] 2024-09-30T16:14:19.574938Z nonode@nohost <0.19120.101> d5dfe20e02 localhost:5984 127.0.0.1 groot POST /hzd/_bulk_docs 500 ok 21
[error] 2024-09-30T16:14:19.575102Z nonode@nohost <0.18010.101> -------- gen_server <0.18010.101> terminated with reason: no match of right hand value {error,enospc} at couch_bt_engine:write_doc_body/2(line:439) <= couch_db_updater:'-flush_trees/3-fun-0-'/6(line:384) <= couch_key_tree:mapfold_simple/4(line:464) <= couch_key_tree:mapfold_simple/4(line:473) <= couch_key_tree:mapfold/3(line:457) <= couch_db_updater:flush_trees/3(line:373) <= couch_db_updater:update_docs_int/4(line:718) <= couch_db_updater:handle_info/2(line:183)
last msg: redacted
state: {db,1,<<"shards/e0000000-ffffffff/hzd.1727710380">>,"./data/shards/e0000000-ffffffff/hzd.1727710380.couch",{couch_bt_engine,{st,"./data/shards/e0000000-ffffffff/hzd.1727710380.couch",<0.19406.101>,#Ref<0.3603940510.502005771.203208>,undefined,{db_header,8,30406,0,{9450247660,{29670,687,{size_info,9279630171,9278136634}},12600491},{9450249167,30357,11927090},{9448039553,[],2388},nil,nil,4251,1000,<<"2719778795232e78e860e5e8ab70c794">>,[{nonode@nohost,0}],0,1000,0},false,{btree,<0.19406.101>,{9450247660,{29670,687,{size_info,9279630171,9278136634}},12600491},fun couch_bt_engine:id_tree_split/1,fun couch_bt_engine:id_tree_join/2,undefined,fun couch_bt_engine:id_tree_reduce/2,snappy},{btree,<0.19406.101>,{9450249167,30357,11927090},fun couch_bt_engine:seq_tree_split/1,fun couch_bt_engine:seq_tree_join/2,undefined,fun couch_bt_engine:seq_tree_reduce/2,snappy},{btree,<0.19406.101>,{9448039553,[],2388},fun couch_bt_engine:local_tree_split/1,fun couch_bt_engine:local_tree_join/2,undefined,nil,snappy},snappy,{btree,<0.19406.101>,nil,fun couch_bt_engine:purge_tree_split/1,fun couch_bt_engine:purge_tree_join/2,undefined,fun couch_bt_engine:purge_tree_reduce/2,snappy},{btree,<0.19406.101>,nil,fun couch_bt_engine:purge_seq_tree_split/1,fun couch_bt_engine:purge_seq_tree_join/2,undefined,fun couch_bt_engine:purge_tree_reduce/2,snappy}}},<0.18010.101>,nil,30406,<<"1727712856444764">>,{user_ctx,null,[],undefined},[{<<"members">>,{[{<<"roles">>,[<<"_admin">>]}]}},{<<"admins">>,{[{<<"roles">>,[<<"_admin">>]}]}}],[#Fun<couch_doc.7.91987333>],nil,nil,undefined,[{default_security_object,[{<<"members">>,{[{<<"roles">>,[<<"_admin">>]}]}},{<<"admins">>,{[{<<"roles">>,[<<"_admin">>]}]}}]},replicated_changes,{user_ctx,{user_ctx,<<"groot">>,[<<"_admin">>],<<"cookie">>}},{w,"1"},{props,[{partitioned,true},{hash,[couch_partition,hash,[]]}]}],undefined}
extra: []
[error] 2024-09-30T16:14:19.575128Z nonode@nohost <0.14636.101> -------- Replicator, request POST to "http://localhost:5984/hzd/_bulk_docs" failed due to error {code,500}
[error] 2024-09-30T16:14:19.575198Z nonode@nohost <0.18010.101> -------- CRASH REPORT Process (<0.18010.101>) with 0 neighbors crashed with reason: no match of right hand value {error,enospc} at couch_bt_engine:write_doc_body/2(line:439) <= couch_db_updater:'-flush_trees/3-fun-0-'/6(line:384) <= couch_key_tree:mapfold_simple/4(line:464) <= couch_key_tree:mapfold_simple/4(line:473)
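Until the log message itself becomes friendlier, error tuples like the ones above can be translated after the fact. This is a minimal sketch, assuming the atom inside `{error,...}` is a POSIX error name that Python's `errno` module also knows; the function name `explain_posix_errors` is hypothetical, and the description text comes from the operating system, so it varies by platform.

```python
import errno
import os
import re

# Matches Erlang-style error tuples such as {error,enospc}.
ERROR_TUPLE = re.compile(r"\{error,\s*(e[a-z0-9]+)\}")


def explain_posix_errors(log_line):
    """Return sorted 'atom: description' strings for POSIX atoms in a line."""
    found = set()
    for atom in ERROR_TUPLE.findall(log_line):
        # errno exposes POSIX codes as upper-case names, e.g. errno.ENOSPC.
        code = getattr(errno, atom.upper(), None)
        if isinstance(code, int):
            found.add(f"{atom}: {os.strerror(code)}")
    return sorted(found)


if __name__ == "__main__":
    sample = "terminated with reason: no match of right hand value {error,enospc}"
    for hint in explain_posix_errors(sample):
        print(hint)
```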
My fault: I'm using Docker Desktop, where the maximum storage capacity was globally set to 100GB, and the source (CouchDB 3.3.3) is running in parallel so that I can replicate from it... My assumption was that I was running Docker without limits.
So nickva spotted it right in his first comment:
enospc from no match of right hand value {error,enospc} indicates we're probably running out of disk space [1]
It should be a more friendly message in the log, but at least at first sight that's what's jumping out.
There are perhaps two ideas for improvement that I can offer from my mistake:
No worries at all, thanks for reaching out.
Yeah, agree a more friendly error would be nice in the logs.
And it turns out we do have a disk monitor now in 3.4 (the work of @rnewson)!
See https://docs.couchdb.org/en/stable/config/disk-monitor.html. If you configure it, it will stop indexing when approaching the limit and return a meaningful API error.
See https://github.com/apache/couchdb/pull/4681 for the PR comments and the implementation.
Description
I've set up a fresh CouchDB 3.4.1 instance (as a Docker image, built from https://github.com/apache/couchdb-docker/tree/main/3.4.1). Then I started a replication from the production server and saw endless messages of "no match of right hand value {error,enospc}".
Here is a (truncated) copy of the docker log: couchdb.tar.gz
Your Environment
CouchDB version used: 3.4.1
Operating system and version: Fedora Linux 40 (KDE Plasma) and Ubuntu Server 24 (both run the same Docker image and report the same error)
Additional Context
I've talked a bit with Jan on Slack; his first thoughts: https://app.slack.com/client/T49P1AZRT/C49LEE7NW