apache / couchdb

Seamless multi-master syncing database with an intuitive HTTP/JSON API, designed for reliability
https://couchdb.apache.org/
Apache License 2.0

Smoosh never triggered with conflicted documents #3410

Closed: wknd closed this issue 11 months ago

wknd commented 3 years ago

Auto-compaction is never triggered in certain situations, causing unusually large disk usage.

Description

When updating documents, smoosh triggers at predictable intervals and cleans up the old files. However, when new data comes in via replication (PouchDB), compaction does not seem to be triggered, or at least not in all cases. Manually triggering compaction with curl does work. This happened on a production server, so the exact cause is a bit unclear to me, but I was able to reproduce it on a fresh Docker container using curl.

Steps to Reproduce

Set up a fresh install of CouchDB, set an admin password, and create a user database; whatever it is you usually do.
Create a database and add a document to it; 'testtest/8029e17efe934779a44c54c7050006ec' will be used in my examples. In the document I set a random property to an extremely large string so the threshold is reached faster.

I use this curl config (the curl-proxyauth.conf referenced below) to make things a bit easier:

compressed
cookie = "cookie_couch.txt"
cookie-jar = "cookie_couch.txt"
header = "Content-Type: application/json"
header = "Accept: application/json"
write-out = "\n"

Log in to your CouchDB server:

curl -X POST http://localhost:5984/_session -K curl-proxyauth.conf \
-d '{"name": "admin", "password": "adminPassword"}'

Proof smoosh triggers when expected:

This script GETs the document and PUTs it straight back. On each iteration it prints sizes.file, sizes.active, and their ratio, plus two booleans representing when the default settings should trigger compaction: the file/active ratio exceeding 2, or the file size exceeding the active size by at least 16 MiB.

db="testtest";
doc="8029e17efe934779a44c54c7050006ec";
while true; do
  curl -s -X GET http://localhost:5984/$db/$doc -K curl-proxyauth.conf | curl -s -X PUT http://localhost:5984/$db/$doc -K curl-proxyauth.conf -d @- > /dev/null;
  str=`curl -s -X GET http://localhost:5984/testtest -K curl-proxyauth.conf | jq '[.sizes.file, .sizes.active, .sizes.file/.sizes.active, .sizes.file/.sizes.active > 2, .sizes.file - .sizes.active >= 16777216]'`
  echo $str
  if [[ $str =~ "true" ]]; then
    break
  fi
done;
# ask and print once more
curl -s -X GET http://localhost:5984/$db -K curl-proxyauth.conf | jq '.sizes';

You should see sizes.file increase steadily until the script stops, after which auto-compaction will trigger (or, because of rounding, it could be exactly one write later).
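
(For reference, the thresholds smoosh compares against can be read back through the config API. This assumes the single-node nonode@nohost setup used here and the stock smoosh.ratio_dbs config section; on defaults I'd expect min_priority = 2.0:)

# read the ratio_dbs channel settings used by the auto-compaction daemon
curl -s -X GET http://localhost:5984/_node/nonode@nohost/_config/smoosh.ratio_dbs -K curl-proxyauth.conf | jq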

Attempting to create document conflicts, not triggering smoosh

I wasn't exactly sure how to intentionally create conflicts, but this seemed to at least demonstrate the problem. It is similar to the approach above, except we're not going to stop after reaching the threshold.

db="testtest";
doc="8029e17efe934779a44c54c7050006ec";
originalrev=`curl -s -X GET http://localhost:5984/$db/$doc -K curl-proxyauth.conf | jq -r '._rev'`;
echo "creating branch on $originalrev";
# create update on correct branch
curl -s -X GET http://localhost:5984/$db/$doc -K curl-proxyauth.conf | curl -s -X PUT http://localhost:5984/$db/$doc -K curl-proxyauth.conf -d @- > /dev/null;
while true; do

  curl -s -X GET http://localhost:5984/$db/$doc -K curl-proxyauth.conf | jq -r --arg REV $originalrev '{ _id: ._id, extraTestData: .extraTestData, _rev: $REV}' | curl -s -X PUT http://localhost:5984/$db/$doc?new_edits=false -K curl-proxyauth.conf -d @- > /dev/null
  str=`curl -s -X GET http://localhost:5984/testtest -K curl-proxyauth.conf | jq '[.sizes.file, .sizes.active, .sizes.file/.sizes.active, .sizes.file/.sizes.active > 2, .sizes.file - .sizes.active >= 16777216]'`
  echo $str
  #if [[ $str =~ "true" ]]; then
  #  break
  #fi
done;
# ask and print once more
curl -s -X GET http://localhost:5984/$db -K curl-proxyauth.conf | jq '.sizes';

(if you do want to automatically stop at the right time, uncomment the if statement)

Expected Behaviour

I expect compaction to start after the default ratio of 2 is reached. It doesn't. It will trigger if you run the first script again for a while, or you can compact the data manually.
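
For comparison, manually compacting the database (which does reclaim the space) looks like this; compaction runs in the background and its progress shows up in _active_tasks:

# kick off compaction for the testtest database; returns {"ok":true}
curl -s -X POST http://localhost:5984/testtest/_compact -K curl-proxyauth.conf
# watch compaction progress
curl -s -X GET http://localhost:5984/_active_tasks -K curl-proxyauth.conf | jq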

Your Environment

Additional Context

In the real world, this was happening with just a few clients (which may be running an old version of our software and aren't updating) constantly spamming the server with invalid or conflicted documents.

We'd get an alert about excessive disk usage. I'd identify the evil user and compact their database.

This temporarily resolves the server issue, but it would be better if auto-compaction did it for me.
(And even better if those clients answered their emails and followed the provided steps to stop sending the evil data in the first place, but that's out of my control.)

wohali commented 3 years ago

Can you share, for the database you believe isn't being compacted correctly, the per-shard info? You can retrieve this by running, for a given node, GET /_node/<node-name>/_dbs to find the shard names, then GET /_node/<node-name>/<shard-name>.

What we want is the sizes object: "sizes":{"file":49361,"external":2379,"active":3796}

It would also help to turn logging up on your CouchDB instance to at least notice level, and show the debug output from Smoosh, as logged here. This will tell us if smoosh is trying to compact your databases or not.
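
As a sketch, assuming the single-node Docker setup from this report (the node and container names below come from the reporter's output further down), the log level can be raised through the config API:

# raise the log level at runtime
curl -s -X PUT http://localhost:5984/_node/nonode@nohost/_config/log/level \
  -K curl-proxyauth.conf -d '"notice"'
# then follow the container logs and filter for smoosh
docker logs -f touch2-couchdb 2>&1 | grep -i smoosh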

wknd commented 3 years ago

I can't seem to find the per-shard info you describe, or the documentation for the right syntax.

$ curl -s -X GET http://localhost:5984/testtest/_shards -K curl-proxyauth.conf  | jq
{  "shards": {    "00000000-7fffffff": [
      "nonode@nohost"
    ],
    "80000000-ffffffff": [
      "nonode@nohost"
    ]
  }
}
$ curl -s -X GET http://localhost:5984/_node/nonode@nohost/_dbs -K curl-proxyauth.conf  | jq
{
  "db_name": "_dbs",
  "engine": "couch_bt_engine",
  "doc_count": 8,
  "doc_del_count": 0,
  "update_seq": 8,
  "purge_seq": 0,
  "compact_running": false,
  "sizes": {
    "active": 2660,
    "external": 2584,
    "file": 37069
  },
  "instance_start_time": "1615397397994629",
  "disk_format_version": 8,
  "committed_update_seq": 8,
  "compacted_seq": 0,
  "props": {},
  "uuid": "09db1ca01262f5ee27d72711c63f182c"
}
$ curl -s -X GET http://localhost:5984/_node/nonode@nohost/00000000-7fffffff -K curl-proxyauth.conf  | jq
{
  "error": "illegal_database_name",
  "reason": "Name: '00000000-7fffffff'. Only lowercase characters (a-z), digits (0-9), and any of the characters _, $, (, ), +, -, and / are allowed. Must begin with a letter."
}
$ curl -s -X GET 'http://localhost:5984/_node/nonode@nohost/testtest.1615396353' -K curl-proxyauth.conf  | jq
{
  "error": "not_found",
  "reason": "no_db_file"
}
$ curl -s -X GET 'http://localhost:5984/_node/nonode@nohost/testtest.1615396353.couch' -K curl-proxyauth.conf  | jq
{
  "error": "not_found",
  "reason": "no_db_file"
}

what file does it expect there?

The smoosh logs you're looking for only appear with test script 1:

touch2-couchdb | [notice] 2021-03-10T17:02:26.300395Z nonode@nohost <0.430.0> -------- ratio_dbs: adding <<"shards/00000000-7fffffff/testtest.1615393017">> to internal compactor queue with priority 16.02983105341381
touch2-couchdb | [notice] 2021-03-10T17:02:26.300465Z nonode@nohost <0.430.0> -------- ratio_dbs: Starting compaction for shards/00000000-7fffffff/testtest.1615393017 (priority 16.02983105341381)
touch2-couchdb | [notice] 2021-03-10T17:02:26.300706Z nonode@nohost <0.430.0> -------- ratio_dbs: Started compaction for shards/00000000-7fffffff/testtest.1615393017

(though today's test seems to reach a higher priority than yesterday, even though I started over from a clean image for further testing)

With test script 2 the smoosh log never appears, and the db sizes when I gave up were:

$ curl -s -X GET http://localhost:5984/testtest -K curl-proxyauth.conf | jq
{  "db_name": "testtest",  "purge_seq": "0-g1AAAABPeJzLYWBgYMpgTmHgzcvPy09JdcjLz8gvLskBCeexAEmGBiD1HwiyEhlwqEtkSKqHKMgCAIT2GV4",  "update_seq": "767-g1AAAABSeJzLYWBgYMpgTmHgzcvPy09JdcjLz8gvLskBCeexAEmGBiD1HwiyEhlwqEtkSKoHKUgCsv9nAQDR3Bpg",
  "sizes": {
    "file": 3277164,
    "external": 16637,
    "active": 40239
  },
  "props": {},
  "doc_del_count": 0,
  "doc_count": 1,
  "disk_format_version": 8,
  "compact_running": false,
  "cluster": {
    "q": 2,
    "n": 1,
    "w": 1,
    "r": 1
  },
  "instance_start_time": "0"
}

Note that there is almost nothing on this server for testing: a few empty databases and one 'testtest' database with just one document in it. I'm creating a new Docker container just for these tests.

wohali commented 3 years ago

Sorry, I rushed my advice; here's an example on a local DB:

$ curl http://admin:password@localhost:5984/_membership
{"all_nodes":["couchdb@localhost"],"cluster_nodes":["couchdb@localhost"]}

$ curl http://admin:password@localhost:5984/_node/couchdb@localhost/_dbs/test1
{"_id":"test1","_rev":"1-24ad69872d6cdaa9a85708b327ba887d","shard_suffix":[46,49,54,48,50,49,55,48,50,54,48],"changelog":[["add","00000000-7fffffff","couchdb@localhost"],["add","80000000-ffffffff","couchdb@localhost"]],"by_node":{"couchdb@localhost":["00000000-7fffffff","80000000-ffffffff"]},"by_range":{"00000000-7fffffff":["couchdb@localhost"],"80000000-ffffffff":["couchdb@localhost"]},"props":{}}

$ curl http://admin:password@localhost:5984/_node/couchdb@localhost/_all_dbs
... cut for privacy...
"shards/00000000-7fffffff/test1.1602170260",
"shards/80000000-ffffffff/test1.1602170260"

$ curl http://admin:password@localhost:5984/_node/couchdb@localhost/shards%2f00000000-7fffffff%2ftest1.1602170260
{"db_name":"shards/00000000-7fffffff/test1.1602170260","engine":"couch_bt_engine","doc_count":0,"doc_del_count":0,"update_seq":0,"purge_seq":0,"compact_running":false,"sizes":{"active":333,"external":0,"file":16549},"instance_start_time":"1615404245595000","disk_format_version":8,"committed_update_seq":0,"compacted_seq":0,"props":{},"uuid":"eefcf4e13cf56f1f5bd063afce077672"}

And repeat for the 80-ff range. So for the 00-7f range of this DB, the info is {"active":333,"external":0,"file":16549}.

wknd commented 3 years ago

No problem, got the data for you:

$ curl -s -X GET http://localhost:5984/_node/nonode@nohost/_all_dbs -K curl-proxyauth.conf | jq
[
  "_dbs",
  "_nodes",
  "_users",
  "shards/00000000-7fffffff/_users.1615396319",
  "shards/00000000-7fffffff/testtest.1615396353",
  ... snip snip...
  "shards/80000000-ffffffff/_users.1615396319",
  "shards/80000000-ffffffff/testtest.1615396353",
  ... snip snip ...
]

$ curl -s -X GET http://localhost:5984/_node/nonode@nohost/shards%2f00000000-7fffffff%2ftesttest.1615396353 -K curl-proxyauth.conf | jq
{
  "db_name": "shards/00000000-7fffffff/testtest.1615396353",
  "engine": "couch_bt_engine",
  "doc_count": 1,
  "doc_del_count": 0,
  "update_seq": 853,
  "purge_seq": 0,
  "compact_running": false,
  "sizes": {
    "active": 34963,
    "external": 96,
    "file": 2887884
  },
  "instance_start_time": "1615408498456965",
  "disk_format_version": 8,
  "committed_update_seq": 853,
  "compacted_seq": 837,
  "props": {},
  "uuid": "d16cdf0c8f8c65db7902499ee4857dc9"
}
$ curl -s -X GET http://localhost:5984/_node/nonode@nohost/shards%2f80000000-ffffffff%2ftesttest.1615396353 -K curl-proxyauth.conf | jq
{
  "db_name": "shards/80000000-ffffffff/testtest.1615396353",
  "engine": "couch_bt_engine",
  "doc_count": 0,
  "doc_del_count": 0,
  "update_seq": 0,
  "purge_seq": 0,
  "compact_running": false,
  "sizes": {
    "active": 0,
    "external": 0,
    "file": 8346
  },
  "instance_start_time": "1615408546533771",
  "disk_format_version": 8,
  "committed_update_seq": 0,
  "compacted_seq": 0,
  "props": {},
  "uuid": "b5d43e1cb89a57b4234877546b7bcd16"
}

wohali commented 3 years ago

@wknd is this for your test scenario 1 or 2?

@rnewson thoughts on the above? The ratio seems to trigger, but I'm not seeing the compaction.

wknd commented 3 years ago

Those were from after running scenario 2. With scenario 1 it does seem to trigger as expected.

nickva commented 11 months ago

This should be fixed with https://github.com/apache/couchdb/pull/4264. There we fixed a bug where the active db size was mistakenly calculated using non-leaf nodes plus leaf nodes, while it should use leaf nodes only.
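
For anyone verifying the fix, a quick way to check the ratio smoosh evaluates per shard (my own sketch, reusing the shard names and curl config from earlier in this thread; 2.0 is the default min_priority for the ratio_dbs channel):

# compute file/active per shard of testtest; with the fix, a ratio above 2.0
# should get the shard queued by the ratio_dbs channel
for shard in "shards%2f00000000-7fffffff%2ftesttest.1615396353" \
             "shards%2f80000000-ffffffff%2ftesttest.1615396353"; do
  curl -s -X GET "http://localhost:5984/_node/nonode@nohost/$shard" -K curl-proxyauth.conf \
    | jq '{db: .db_name, ratio: (if .sizes.active > 0 then (.sizes.file / .sizes.active) else null end)}'
done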