medic / cht-core

The CHT Core Framework makes it faster to build responsive, offline-first digital health apps that equip health workers to provide better care in their communities. It is a central resource of the Community Health Toolkit.
https://communityhealthtoolkit.org
GNU Affero General Public License v3.0

Bump changes_doc_ids_optimization_threshold for Couch 3.4 #9642

Open dianabarsan opened 1 week ago

dianabarsan commented 1 week ago

Describe the performance issue

CouchDB 3.4 introduces an "optimization" where the changes feed with doc_ids retrieves only the targeted docs when the payload is under 1000 doc_ids, and scans the whole changes feed when it's over 1000. Previously, there was no limit. This makes purging, and other mechanisms that rely on querying changes with doc ids, very slow.
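For reference, the kind of request affected here can be sketched as follows. This is a minimal illustration in Python, not the CHT purging code: the helper name `build_changes_request`, the `medic` database name, the local URL and the doc ids are all assumptions made for the example. It only builds the request (a POST to `_changes` with `filter=_doc_ids`) and does not send it.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_changes_request(base_url, db_name, doc_ids):
    """Build (but do not send) a _changes request filtered by doc ids.

    Hypothetical helper for illustration only.
    """
    url = f"{base_url.rstrip('/')}/{quote(db_name, safe='')}/_changes?filter=_doc_ids"
    # The ids travel in the JSON body, so the payload grows with the batch size.
    body = json.dumps({"doc_ids": list(doc_ids)}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Assumed values: a local CouchDB and a database named "medic".
req = build_changes_request("http://localhost:5984", "medic", ["doc-1", "doc-2"])
```

The number of entries in `doc_ids` is what gets compared against the threshold described above.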

Describe the improvement you'd like

Update purging so it hits other endpoints, or work out a way to optimize it while still using the changes feed.

Measurements

We should get similar purging times on Couch 3.3 and Couch 3.4.

Additional context

https://github.com/medic/cht-core/issues/9303#issuecomment-2473165284

dianabarsan commented 6 days ago

I've tried this over a local database with 100k docs, and these are the purge times I measured:

| CouchDB version | Method | Time |
| --- | --- | --- |
| 3.3.3 | _changes | 5.3 minutes |
| 3.4.2 | _changes | 11 minutes |
| 3.4.2 | _all_docs | 18 minutes |
| 3.4.2 | _changes with increased changes_doc_ids_optimization_threshold | 5.5 minutes |

So it turns out that using _all_docs instead of changes requests is even worse than using the changes feed with the performance hit (18 minutes versus 11). The times depend on the dataset and on how many doc ids get passed as payload to these requests, but I'm afraid the increased time when using _all_docs is serious enough to disqualify it as a viable option.

So our only alternative is to set the changes_doc_ids_optimization_threshold config to a sufficiently large value. We limit the maximum number of docs we handle in a single purge request to ~20,000, so for safety I bumped the threshold to 30,000, which keeps purge times close to those on 3.3.3 (5.5 vs. 5.3 minutes). This means that no code changes are required, except for adding changes_doc_ids_optimization_threshold as a Couch config value.
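As a rough sketch of what adding that config value could look like through CouchDB's HTTP config API: the snippet below builds, without sending, a PUT to `/_node/_local/_config/{section}/{key}`. It assumes the setting lives in the `couchdb` config section, which should be checked against the CouchDB docs for the version in use. The helper name, constant names and local URL are hypothetical, and authentication is left out.

```python
import json
from urllib.request import Request

# Numbers quoted in this thread.
MAX_DOCS_PER_PURGE_REQUEST = 20_000  # approximate cap on doc ids per purge request
THRESHOLD = 30_000                   # bumped value, with a safety margin

# Assumption: the setting is read from the [couchdb] section.
CONFIG_SECTION = "couchdb"
CONFIG_KEY = "changes_doc_ids_optimization_threshold"


def build_threshold_request(base_url, threshold, node="_local"):
    """Build (but do not send) a config update request.

    Hypothetical helper for illustration only.
    """
    url = f"{base_url.rstrip('/')}/_node/{node}/_config/{CONFIG_SECTION}/{CONFIG_KEY}"
    # The config API takes the value as a JSON string, e.g. "30000".
    body = json.dumps(str(threshold)).encode("utf-8")
    return Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# The threshold only helps if it stays above the largest purge batch.
assert THRESHOLD > MAX_DOCS_PER_PURGE_REQUEST

config_req = build_threshold_request("http://localhost:5984", THRESHOLD)
```

Keeping the threshold above the largest purge batch is the point of the margin: a batch of ~20,000 doc ids stays under 30,000.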