apache / couchdb-nano

Nano: The official Apache CouchDB library for Node.js
https://www.npmjs.com/package/nano
Apache License 2.0

Support over 1k changes feed listeners #267

Closed BrodaUa closed 3 years ago

BrodaUa commented 3 years ago

Hello. I have a question regarding Nano performance: is it possible to use the changes feed reader (aka nano.changesReader) to track changes in parallel for a large number (1k–10k) of databases on a single CouchDB instance? Currently I see a big performance slowdown: it takes minutes (10–20) to pick up the latest change in any db. Thanks.

Expected Behavior

No performance issues in case of 1k parallel changes feed listeners.

Current Behavior

It takes minutes to pick up the latest change in any db.

Steps to Reproduce (for bugs)

  1. create 1k dbs
  2. create 1k EventEmitters via
    nano.use(dbName).changesReader
                .start({ includeDocs: true })
                .on('change', changes => {
                    console.log(`change in ${dbName}: `, changes);
                })
                .on('error', error => {
                    console.log(`error in ${dbName}: `, error);
                });
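The two steps above can be sketched as a single script. The URL, credentials, database naming, and the surrounding loop are assumptions, not part of the original report; `nano.db.create` and `changesReader.start` are the documented nano calls.

```javascript
// Repro sketch for the steps above: create N databases and attach a
// changes-feed listener to each. URL/credentials and names are assumptions.
const N = 1000
const dbNames = Array.from({ length: N }, (_, i) => `user_${i}`)

async function main () {
  const Nano = require('nano') // lazy require so the sketch loads without the dep
  const nano = Nano('http://admin:password@localhost:5984')
  for (const dbName of dbNames) {
    await nano.db.create(dbName).catch(() => {}) // ignore "already exists"
    nano.use(dbName).changesReader
      .start({ includeDocs: true })
      .on('change', changes => console.log(`change in ${dbName}:`, changes))
      .on('error', error => console.log(`error in ${dbName}:`, error))
  }
}
// main()  // run against a live CouchDB instance
```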

Context

Related use-case - messaging app. Each user has a message db, with outgoing and incoming messages. I want to track the changes when I send a message to somebody (here the message stored in db via PouchDB - CouchDB sync process), lookup the destination and write this message in destination user db.

Your Environment

glynnbird commented 3 years ago

There are a number of things going on here:

In short, if you're relying on this mechanism to scale up your application (e.g. one changes feed per database, one database per user) then eventually you will run out of CouchDB capacity. What that number is (100, 1k, 10k, 100k?) is difficult to estimate theoretically, but you should be able to run experiments to see where performance tails off. If you have a fixed number of users (e.g. employees in a shop) then you might be ok - if you expect numbers to increase over time (as in a social network), then you might have to go back to the drawing board.
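One way to avoid one listener per database is to multiplex on CouchDB's global `_db_updates` feed: listen once for "some database changed" events, then fetch changes only from the databases that actually changed. The following is a sketch, not code from this thread; `nano.updates()` and `nano.db.changes()` are documented nano calls, but the polling loop, helper names, and parameters are assumptions (and `_db_updates` requires server admin rights).

```javascript
// Sketch: one global listener instead of 1k+ per-database listeners.
// CouchDB's /_db_updates feed reports which database changed; we then
// fetch changes only from that database.

// Pure helper: deduplicated names of databases reported as updated.
function updatedDbNames (results) {
  return [...new Set((results || [])
    .filter(r => r.type === 'updated')
    .map(r => r.db_name))]
}

const lastSeq = new Map() // last-seen changes seq per database

async function pollOnce (nano, onChange) {
  // longpoll blocks until some database changes (or the timeout elapses)
  const updates = await nano.updates({ feed: 'longpoll', timeout: 30000 })
  for (const name of updatedDbNames(updates.results)) {
    const since = lastSeq.get(name) || 0
    const changes = await nano.db.changes(name, { since, include_docs: true })
    lastSeq.set(name, changes.last_seq)
    for (const c of changes.results) onChange(name, c)
  }
}
```

Calling `pollOnce` in a loop gives a single long-lived request to the server at a time, instead of thousands of concurrent feeds.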

BrodaUa commented 3 years ago

@glynnbird Thank you for the extensive answer. As of now, up to 10k users are anticipated. Indeed, maxSockets can be a bottleneck. I missed that because the official nano docs say: "By default, the Node.js HTTP global agent has an infinite number of active connections that can run simultaneously." I tried to modify the HTTP agent with maxSockets=32k, but now I get a CouchDB error: "No DB shards could be opened." I have the following custom CouchDB config:

[cluster]
q=4
[httpd]
enable_cors=true
[cors]
origins=*
[chttpd]
server_options=[{backlog, 512}, {acceptor_pool_size, 16}, {max, 32768}]
[couchdb]
max_dbs_open=32768

I run the db as the official Docker image (https://hub.docker.com/_/couchdb/) via docker-compose, and I also tried to raise the system file descriptor limit

services:
  couchdb:
    image: couchdb:3.1
    ulimits: 
      nofile: 128000

but it has no effect; I still get the same shards error. However, when I decrease maxSockets to 128, there is no shards error. Do you have any idea why maxSockets causes it?
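A hedged back-of-envelope that may explain the correlation: each database is stored as q shard files, and concurrently open shards count against `max_dbs_open` and the process file descriptor limit. With q=4, 10k databases mean 40k shard files, so a client allowed 32k concurrent sockets can ask CouchDB to hold more shards open at once than `max_dbs_open=32768` permits, while a 128-socket client cannot. The values below are illustrative, not a recommendation:

```ini
[couchdb]
; each database = q shard files; 10k dbs * q=4 = 40k shards,
; which can exceed 32768 when many dbs are open concurrently
max_dbs_open = 65536

[cluster]
; fewer shards per database reduces open-file pressure
; for workloads with many small databases
q = 2
```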