apache / couchdb

Seamless multi-master syncing database with an intuitive HTTP/JSON API, designed for reliability
https://couchdb.apache.org/
Apache License 2.0

Slow couchdb after many queries #5044

Open luca-simonetti opened 6 months ago

luca-simonetti commented 6 months ago

Description

CouchDB becomes tremendously slow after many queries are made. The situation improves after a pause in the process, but as soon as the queries resume the system slows down again very quickly.

Steps to Reproduce

  1. Make loads of queries
  2. See the time go up really fast

note: we are using couchdb as part of our Hyperledger Fabric Cluster. The queries are then made through blockchain requests.

Expected Behaviour

We expect the same query to take roughly the same amount of time, regardless of how many queries were made in the preceding period (say, the previous 5 minutes).

Your Environment

Additional Context

We're using Hyperledger Fabric. After the couchdb service is restarted with /etc/init.d/couchdb restart the situation goes back to normal, but it only takes a couple of minutes (which is something like 5 x 5 x 60 = 1500 queries) before the situation starts to degrade again very quickly.

rnewson commented 6 months ago

Your expectation is well-founded, and couchdb does behave that way. So there must be something wrong somewhere. Could it be that the number of in-progress requests keeps climbing? That is, making more and more queries even though previous ones have not yet completed? If so, increased latency on later requests is expected due to internal queues.

I suggest using a tool like ApacheBench to confirm or refute this: set it with some modest concurrency and point it at a view with some appropriate query parameters (i.e., don't read the entire view on each request).
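For example, something along these lines (the database, design doc, view name, credentials and port below are placeholders to adapt to your setup):

```
# modest concurrency, bounded result set per request
ab -n 10000 -c 10 -A admin:password \
  "http://127.0.0.1:5984/mydb/_design/mydesign/_view/myview?limit=100"
```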

Note that a view query operates in two phases. The first phase ensures the view is up to date with respect to any database updates (by running any new updates through the map and reduce functions). The second phase is the actual query. It will help us to know if the database in question is being written to (and, if so, how intensely) during these queries. E.g., we would expect a view query to take longer if a large number of updates were made to the database just beforehand.

luca-simonetti commented 6 months ago

Thanks for the reply! I don't think the issue is with queries not being completed, since

a. we do them in series, i.e. we wait for one response before issuing the next query
b. we even have a sleep(.5) after each request before issuing another one
c. resources, namely CPU and memory, look fine (around 10% CPU and 10% memory usage)

From the outside it looks like there's some kind of short-term memory cache that keeps growing, and each query traverses this ever-growing object.

For the record: the requests right after the service restart take something like a few milliseconds, whereas requests after a few thousand reach up to 120,000 ms (which is the maximum amount of time for us before calling it a day and raising an exception).

we'll try benchmarking with ab

rnewson commented 6 months ago

there is no cache of query results inside couchdb, for sure.

Is the database growing during this? We'd expect longer query times for larger views, even if the view is up to date when queried (it's ultimately B+Trees, so the more data the taller the tree, and the larger the database the more likely we have to read more nodes from disk).

At this point it would help to see your specific map/reduce functions and an example document.

rnewson commented 6 months ago

also check /_active_tasks as view compaction tasks may be running. you can still query while those run, but they'll obviously be using more i/o and perhaps this is a factor in performance (less of an issue on SSD which, hopefully, you are already using).
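For reference, a quick way to check from the command line (admin credentials and the default port are placeholders):

```
curl -s -u admin:password http://127.0.0.1:5984/_active_tasks
# an empty array [] means no indexing or compaction tasks are currently running
```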

luca-simonetti commented 6 months ago

No, the database is not growing during the read operations (the platform is a QA environment where the only operation is the "mega read"). We're speaking of 100,000 reads. Also: right after the restart the whole thing goes back to normal (200 ms) and starts to degrade again within a couple of minutes.

_active_tasks is empty ([]). I'm not sure what you mean by our map/reduce, since we're using the Hyperledger API to access CouchDB.

I can show you the Go code for the chaincode that does the reads, though:

func (c *xxxxx) Token(ctx contractapi.TransactionContextInterface, tokenId string) (string, error) {

    token, err := GetTokenData(ctx, tokenId)
    if err != nil {
        return "", err
    }

    tokenxxxxx := xxxxxx
    iterator, err := ctx.GetStub().GetStateByPartialCompositeKey(PrefixTokenxxxxxx, []string{tokenId})
    if err != nil {
        return "", fmt.Errorf("error creating asset chaincode: %v", err)
    }
    for iterator.HasNext() {
        queryResponse, err := iterator.Next()
        if err != nil {
            return "", err
        }
        // Retrieve the tokenId
        _, compositeKeyParts, err := ctx.GetStub().SplitCompositeKey(queryResponse.Key)
        if err != nil {
            return "", err
        }

        /// OTHER CODE REDACTED BUT OF NO CONCERN
    }
    _ = iterator.Close()
    resultByte, err := json.Marshal(tokenWithxxxxxxx)
    if err != nil {
        return "", err
    }
    return string(resultByte), nil
}
rnewson commented 6 months ago

Can you express that as an http request?

luca-simonetti commented 6 months ago

I think that gets translated to two kinds of queries:

  1. /xxxxxx_yyyyyy/_all_docs?attachments=true&endkey="\\u0000xxxxx\\u000014302\\uxxxxx􏿿"&include_docs=true&inclusive_end=false&limit=1001&startkey="\\u0000xxxxx\\u000014302\\u0000"
  2. /xxxxxx_yyyyyy/\x00xxxxx\x0014520\x009892299\x00?attachments=true

The first is the one I'm benchmarking now (-c=10 -n=100000) and is sometimes really slow; the second seems to take around 10-20 ms and is run over the 100 results of the first query.

But I still don't understand why, if I run the requests one after another and not in parallel, they influence each other.

luca-simonetti commented 6 months ago

here's the ab result

Document Length:        47691 bytes

Concurrency Level:      10
Time taken for tests:   2219.996 seconds
Complete requests:      100000
Failed requests:        0
Total transferred:      4792500000 bytes
HTML transferred:       4769100000 bytes
Requests per second:    45.05 [#/sec] (mean)
Time per request:       222.000 [ms] (mean)
Time per request:       22.200 [ms] (mean, across all concurrent requests)
Transfer rate:          2108.19 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       45   54  41.0     52    1089
Processing:   147  168  28.6    163     854
Waiting:       47   57   7.1     56     361
Total:        193  222  51.2    216    1650

Percentage of the requests served within a certain time (ms)
  50%    216
  66%    220
  75%    223
  80%    227
  90%    238
  95%    250
  98%    291
  99%    349
 100%   1650 (longest request)
luca-simonetti commented 6 months ago

Browsing through old issues I found this one, which sounds a lot like our problem: https://github.com/apache/couchdb/issues/1319#issuecomment-388409052. At this point: could this be related to how Hyperledger manages connections?

rnewson commented 6 months ago

Hi, yes, it could be. A typical http client would reuse connections (indeed, would manage a pool of them), check that you're doing that first. Otherwise you could possibly use up all possible ports and have to wait for one to become available.

Try your ab command again, but change it to run for longer (either more requests or switch to -t) and compare a run with and without the -k parameter, which enables http keepalive. The results with -k should have much more consistent durations.
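For example (database name, credentials and port are placeholders):

```
# without keep-alive: a new TCP connection per request
ab -n 20000 -c 10 -A admin:password \
  "http://127.0.0.1:5984/mydb/_all_docs?include_docs=true&limit=100"

# with keep-alive (-k): connections are reused across requests
ab -k -n 20000 -c 10 -A admin:password \
  "http://127.0.0.1:5984/mydb/_all_docs?include_docs=true&limit=100"
```

If only the run with -k stays consistently fast, the client in front of CouchDB is probably not reusing connections and may be running out of ephemeral ports.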

nickva commented 6 months ago

With include_docs=true tweaking [fabric] all_docs_concurrency = 10 might have an effect.

https://github.com/apache/couchdb/blob/main/src/fabric/src/fabric_view_all_docs.erl#L297

That spawns 10 background workers at the fabric (cluster) level to open 10 documents in parallel, but that puts more work on the cluster. You can try decreasing it or increasing it a bit and see if it changes anything.
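For reference, the setting lives in the [fabric] section of the config and can also be changed at runtime through the config HTTP API; the node name, credentials and the example value 5 below are placeholders:

```
# equivalent to setting [fabric] all_docs_concurrency = 5 in local.ini
curl -u admin:password -X PUT \
  "http://127.0.0.1:5984/_node/_local/_config/fabric/all_docs_concurrency" \
  -d '"5"'
```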

luca-simonetti commented 6 months ago

> Hi, yes, it could be. A typical http client would reuse connections (indeed, would manage a pool of them), check that you're doing that first. Otherwise you could possibly use up all possible ports and have to wait for one to become available.
>
> Try your ab command again, but change it to run for longer (either more requests or switch to -t) and compare a run with and without the -k parameter, which enables http keepalive. The results with -k should have much more consistent durations.

I tried with -k, which yielded similar results to yesterday's test. There isn't much we can do about the connections, since we're using CouchDB indirectly, through Hyperledger Fabric.

> With include_docs=true tweaking [fabric] all_docs_concurrency = 10 might have an effect.
>
> https://github.com/apache/couchdb/blob/main/src/fabric/src/fabric_view_all_docs.erl#L297
>
> That spawns 10 background workers at the fabric (cluster) level to open 10 documents in parallel, but that puts more work on the cluster. You can try decreasing it or increasing it a bit and see if it changes anything.

we'll definitely try this!! thanks

luca-simonetti commented 6 months ago

> With include_docs=true tweaking [fabric] all_docs_concurrency = 10 might have an effect.
>
> https://github.com/apache/couchdb/blob/main/src/fabric/src/fabric_view_all_docs.erl#L297
>
> That spawns 10 background workers at the fabric (cluster) level to open 10 documents in parallel, but that puts more work on the cluster. You can try decreasing it or increasing it a bit and see if it changes anything.

We tried this, but with no luck. One thing that might be of help: we see memory increasing linearly as the requests are made.

This is the memory in the last seven days:

[Screenshot 2024-05-06 at 16:08: memory usage graph over the last seven days]

The points where the memory rapidly decreases are the restarts of the couchdb service, which make the system fast again (for a couple of minutes). I'm really wondering if there's some kind of memory leak somewhere.

nickva commented 6 months ago

@luca-simonetti thanks for trying the concurrency setting.

Yeah, a memory leak is plausible; we have debugged a number of those in the last few years, some of them in OTP 25 and 24.

Some memory usage increase is perhaps expected if it's the page cache that's using the memory, though that wouldn't explain the slowdown...

Another thing to tweak might be max_dbs_open. That helped in another issue related to memory usage: https://github.com/apache/couchdb/discussions/4992. Another interesting one might be https://github.com/apache/couchdb/issues/4988#issuecomment-1955460577. Though, again, in those cases performance didn't seem to be much of an issue.

Are there any exceptions or errors showing in the logs?

To get more details about the internals, you can try getting the output from /_node/_local/_system and see whether the "memory" object and "message_queues" values increase. https://github.com/apache/couchdb/discussions/4992#discussioncomment-8596562 has an example of calling that and what part of the response might look like. _local can be replaced with an Erlang (distribution) node name as well.
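For example, something like this, sampled a few times while the load is running (credentials and port are placeholders; jq is optional and only used here to trim the output):

```
curl -s -u admin:password http://127.0.0.1:5984/_node/_local/_system \
  | jq '{memory, message_queues}'
```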

rnewson commented 6 months ago

which version of erlang are you using?

luca-simonetti commented 6 months ago

> which version of erlang are you using?

{
    "javascript_engine": {
        "version": "78",
        "name": "spidermonkey"
    },
    "erlang": {
        "version": "24.3.4.15",
        "supported_hashes": [
            "blake2s",
            "blake2b",
            "sha3_512",
            "sha3_384",
            "sha3_256",
            "sha3_224",
            "sha512",
            "sha384",
            "sha256",
            "sha224",
            "sha",
            "ripemd160",
            "md5",
            "md4"
        ]
    },
    "collation_driver": {
        "name": "libicu",
        "library_version": "70.1",
        "collator_version": "153.112",
        "collation_algorithm_version": "14"
    }
}

This is the full /_node/_local/_versions output.

As for the memory leaks mentioned: we tried to look into those, but they are on the order of a 50 GB leak, whereas in our case the memory used is around 200 MB, which is only 15% of the total available. For this very reason I don't really think it's related to those problems.

Also: we noticed that CPU usage is around 25%, which is really low, and tweaking all_docs_concurrency to 100 didn't change much: the time taken still increases over time, while CPU and RAM usage stay the same as before.

luca-simonetti commented 6 months ago

And this is the current state of /_node/_local/_system; it doesn't look too bad, I guess...

{
    "uptime": 77078,
    "memory": {
        "other": 26982505,
        "atom": 631001,
        "atom_used": 605053,
        "processes": 170112432,
        "processes_used": 170076040,
        "binary": 474168,
        "code": 13917182,
        "ets": 2267344
    },
    "run_queue": 0,
    "run_queue_dirty_cpu": 0,
    "ets_table_count": 157,
    "context_switches": 851304076,
    "reductions": 3036158123783,
    "garbage_collection_count": 20109323,
    "words_reclaimed": 35778913031,
    "io_input": 144808316,
    "io_output": 1204370753,
    "os_proc_count": 0,
    "stale_proc_count": 0,
    "process_count": 444,
    "process_limit": 262144,
    "message_queues": {
        "couch_file": {
            "50": 0,
            "90": 0,
            "99": 0,
            "count": 38,
            "min": 0,
            "max": 0
        },
        "couch_db_updater": {
            "50": 0,
            "90": 0,
            "99": 0,
            "count": 38,
            "min": 0,
            "max": 0
        },
// a bunch of keys with value:0
    }
}
luca-simonetti commented 6 months ago

For an extreme test we tried to set [fabric] all_docs_concurrency = 10000 (ten thousand) and it suddenly worked perfectly for the first 8,000 requests. Then it immediately started to slow down to the previous point. Since we need to get to 200,000 requests, we need this behaviour to be constant. With this new information, what is the actual problem?

UPDATE: I actually restarted the process to verify that this value fixes the problem, and it does not. When it was fast it was basically requesting the "container" of the subtokens and somehow, somewhere, the values were returned super fast. After restarting couchdb it went back to the previous state where it takes the token and then the 100 subdocuments. It would help if you could explain what all_docs_concurrency does, since it's not documented.

UPDATE2: I don't know if this is the case, but it could just be related to some Hyperledger internal caching. I'm really struggling to understand how the whole system works.

nickva commented 6 months ago

> For an extreme test we tried to set [fabric] all_docs_concurrency = 10000 (ten thousand) and it suddenly worked perfectly for the first 8,000 requests.

> After restarting couchdb it went back to the previous state where it takes the token and then the 100 subdocuments. It would help if you could explain what all_docs_concurrency does, since it's not documented.

all_docs_concurrency limits how many parallel doc reads are done at the cluster level. That speeds up include_docs=true _all_docs requests, but at the expense of consuming more resources per request.

Another factor to play with is Q. What is your Q value for the dbs? By default it's 2. You can try experimenting with higher values like 8, 16, 64.

> UPDATE2: I don't know if this is the case, but it could just be related to some Hyperledger internal caching. I'm really struggling to understand how the whole system works.

If CPU and memory usage on your containers are not very high, it does look like a slow-disk problem, but with a page cache or other limited, faster cache in front of the disk. Try using a faster local disk or another setup as an experiment with the same container and see what numbers you get.

luca-simonetti commented 6 months ago

I don't think the disk is the issue here:

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 312512 478476 2741660    0    0     2    19    0    0  3  0 97  0  0

as you can see the system doesn't spend any time waiting for IO ops.

What I see though is a memory cache increasing. After restart that very same cache decreases a lot and starts growing back again.

One more note: the requests that are slow are not the "include_docs=true" ones, but the "attachments=true" ones. I reiterate just so we're on the same page: we do 10-20 million requests to download ALL documents with a key prefix in the database. For example we might have piece0000001, piece0000002, ..., piece1000000, ..., piece9999999

and we make a request for each of those. So if something is cached, it's of no use, or in this case even an issue, since each subsequent request accesses a different document from the one before.

Also: we cannot change the Q factor since the database is already created and we don't want to risk resharding and losing the whole thing...

rnewson commented 6 months ago

a memory leak around attachments is feasible, as they are handled as large binaries in erlang, and there have been some leaks in BEAM around that.

What version of erlang is couchdb using here?

rnewson commented 6 months ago

As for resharding, our advice for increasing q would be to make a new, empty database with the new q and replicate from the current db to the new one with admin credentials and no filter. This ensures everything gets across. You could leave a continuous:true replication running while switching clients over from the old url to the new one, and then cancel it, and delete the old database, once everything is switched (i.e., no more reads or writes arriving at the old db).
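Roughly (database names, credentials and the q value below are placeholders):

```
# 1. create the new database with a higher q
curl -u admin:password -X PUT "http://127.0.0.1:5984/mydb_q16?q=16"

# 2. start a continuous replication from the old database to the new one
curl -u admin:password -X POST "http://127.0.0.1:5984/_replicator" \
  -H "Content-Type: application/json" \
  -d '{"source": "http://admin:password@127.0.0.1:5984/mydb",
       "target": "http://admin:password@127.0.0.1:5984/mydb_q16",
       "continuous": true}'
```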

luca-simonetti commented 6 months ago

> a memory leak around attachments is feasible, as they are handled as large binaries in erlang, and there have been some leaks in BEAM around that.
>
> What version of erlang is couchdb using here?

the Erlang version inside couchdb is: "version": "24.3.4.15"

rnewson commented 6 months ago

ok, let's get you onto 24.3.4.17, as we know there are bugs in earlier point releases of that series. Let's rule those out first.

luca-simonetti commented 6 months ago

ok, cool! How do I do that?

rnewson commented 6 months ago

you'll need to download or build 24.3.4.17 (https://www.erlang.org/patches/otp-24.3.4.17) and build couchdb against it.

How did you install couchdb initially? We should update our binary packages with 24.3.4.17 for the same reason.

luca-simonetti commented 6 months ago

We installed the current couchdb version (3.3.3) using the official repository version for Ubuntu. I think Erlang came bundled with it.

nickva commented 6 months ago

I just rebuilt all the 3.3.3 packages to use the latest Erlang patch version 24.3.4.17. If you update from the deb repo you should get the latest version from there: 3.3.3-2

luca-simonetti commented 6 months ago

thank you for the help! We updated couchdb as requested:

{
    "javascript_engine": {
        "version": "78",
        "name": "spidermonkey"
    },
    "erlang": {
        "version": "24.3.4.17",
        "supported_hashes": [
            "blake2s",
            "blake2b",
            "sha3_512",
            "sha3_384",
            "sha3_256",
            "sha3_224",
            "sha512",
            "sha384",
            "sha256",
            "sha224",
            "sha",
            "ripemd160",
            "md5",
            "md4"
        ]
    },
    "collation_driver": {
        "name": "libicu",
        "library_version": "70.1",
        "collator_version": "153.112",
        "collation_algorithm_version": "14"
    }
}

but unfortunately it didn't help. The problem is still the same 😢

tiju86 commented 3 months ago

Hello @luca-simonetti, have you been able to make progress on this problem? We have exactly the same thing on CouchDB 3.3.3: no particular load on the server, and only restarting the couchdb service solves the problem.

thanks

luca-simonetti commented 3 months ago

No, we changed which CouchDB API we call. Instead of fetching each single attachment, we fetch the doc with attachments=true and that does the trick. One possible issue is that the http client is not using the right configuration (see above). Our problem is that we're using Hyperledger, which cannot be changed for now (we need to stick to a specific version), and as such we won't investigate further. But I guess it's important to continue on this issue anyway. Glad to help if I can.
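Concretely, the difference is roughly the following (document id taken from the example above; attachment name, credentials and port are placeholders):

```
# before: one request per attachment
curl -u admin:password "http://127.0.0.1:5984/mydb/piece0000001/some-attachment"

# after: one request per document, with attachment bodies inlined (base64) in the JSON
curl -u admin:password "http://127.0.0.1:5984/mydb/piece0000001?attachments=true"
```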

rnewson commented 3 months ago

One note about an earlier comment ("What I see though is a memory cache increasing. After restart that very same cache decreases a lot and starts growing back again."): this is referring to the kernel's disk page cache, and it is a good thing (critical for performance), definitely not the sign of a problem. That it goes away when you restart makes sense, as this invalidates the cache.

luca-simonetti commented 3 months ago

> One note about an earlier comment ("What I see though is a memory cache increasing. After restart that very same cache decreases a lot and starts growing back again."): this is referring to the kernel's disk page cache, and it is a good thing (critical for performance), definitely not the sign of a problem. That it goes away when you restart makes sense, as this invalidates the cache.

The thing is that when we restart the couch service, performance goes back to normal. The first thing that comes to mind is some cache somewhere, not necessarily part of couch itself but possibly part of the underlying OS or something else.

tiju86 commented 3 months ago

> No, we changed which CouchDB API we call. Instead of fetching each single attachment, we fetch the doc with attachments=true and that does the trick. One possible issue is that the http client is not using the right configuration (see above). Our problem is that we're using Hyperledger, which cannot be changed for now (we need to stick to a specific version), and as such we won't investigate further. But I guess it's important to continue on this issue anyway. Glad to help if I can.

Thanks for your answer. I will study this new idea, because we also use attachments; there may be a problem there.

tiju86 commented 3 months ago

I checked the management of attachments and they are retrieved in batches, not one by one. On our side couchdb is slow for all requests (even displaying the list of databases); after restarting the service everything is OK.

How can we see our version of Erlang?

luca-simonetti commented 3 months ago

Visit /_node/_local/_versions after you log in to Fauxton.

tiju86 commented 3 months ago

Thanks, so we have version 24.3.4.14:

{
  "javascript_engine": {
    "version": "78",
    "name": "spidermonkey"
  },
  "erlang": {
    "version": "24.3.4.14",
    "supported_hashes": [
      "sha",
      "sha224",
      "sha256",
      "sha384",
      "sha512",
      "sha3_224",
      "sha3_256",
      "sha3_384",
      "sha3_512",
      "blake2b",
      "blake2s",
      "md4",
      "md5",
      "ripemd160"
    ]
  },
  "collation_driver": {
    "name": "libicu",
    "library_version": "67.1",
    "collator_version": "153.104",
    "collation_algorithm_version": "13"
  }
}

The OS is Debian; here are the package details:

 dpkg -s couchdb
Package: couchdb
Status: install ok installed
Priority: optional
Section: misc
Installed-Size: 52954
Maintainer: CouchDB Developers <dev@couchdb.apache.org>
Architecture: amd64
Version: 3.3.3~bullseye
Replaces: couchdb-bin, couchdb-common
Depends: adduser, curl, debconf, init-system-helpers, libmozjs-78-0 (>= 78.4.0), lsb-base, procps, debconf (>= 0.5) | debconf-2.0, libc6 (>= 2.29), libgcc-s1 (>= 3.4), libicu67 (>= 67.1-1~), libssl1.1 (>= 1.1.1), libstdc++6 (>= 9), libtinfo6 (>= 6)
Breaks: couchdb-bin, couchdb-common
Conffiles:
 /opt/couchdb/etc/local.ini 5fd760551ab5f2f448107a253b0a7cfd
 /opt/couchdb/etc/vm.args c6e59e9c95f3323136267d8fc44fe0ac
 /etc/couchdb/NO_LONGER_USED_MOVED_TO_opt_couchdb_etc d41d8cd98f00b204e9800998ecf8427e
 /etc/default/couchdb cbed9de0f2dfa9ebd8dfa5f0d6c92ed1
 /etc/init.d/couchdb 00879038de10ac9d49cc5cd73e603355
 /etc/logrotate.d/couchdb 7387f7c2b461031ec9f1858326e485ec
tiju86 commented 3 months ago

I see there are two new package versions, 3.3.3-1 and 3.3.3-2. How can I tell whether there is an Erlang update in them?


couchdb:
  Installed: 3.3.3~bullseye
  Candidate: 3.3.3~bullseye
 Version table:
     3.3.3-2~bullseye 990
        990 https://apache.jfrog.io/artifactory/couchdb-deb bullseye/main amd64 Packages
     3.3.3-1~bullseye 990
        990 https://apache.jfrog.io/artifactory/couchdb-deb bullseye/main amd64 Packages
 *** 3.3.3~bullseye 1200
        990 https://apache.jfrog.io/artifactory/couchdb-deb bullseye/main amd64 Packages
        100 /var/lib/dpkg/status
tiju86 commented 2 months ago

hello,

We have installed the 3.3.3-2~bullseye package, which contains the latest version of Erlang (24.3.4.17), but unfortunately we still have the same problems.

For example, the graph from this morning:

[image: graph from this morning]

jungalwala commented 3 weeks ago

I am having the same issue, but I don't do anything with attachments.