medic / cht-core

The CHT Core Framework makes it faster to build responsive, offline-first digital health apps that equip health workers to provide better care in their communities. It is a central resource of the Community Health Toolkit.
https://communityhealthtoolkit.org
GNU Affero General Public License v3.0
Possibly improve freetext search by using Mango Index instead of view #8998

Open jkuester opened 7 months ago

jkuester commented 7 months ago

What feature do you want to improve?

This blog post notes that Mango queries (aka _find requests) can be leveraged with pre-built indexes to provide performant full-text search.

It might be worth investigating if this is true and if it can provide better performance/functionality than our current "freetext" search approach (which is very brute-force TBH).

Describe the improvement you'd like

Currently, doing freetext searches (e.g. searching for a contact by name) requires some nasty views that do not index words shorter than 3 characters. See https://github.com/medic/cht-core/issues/8832 for the UX challenges this poses to users.
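To make the limitation concrete, here is a simplified sketch of the kind of tokenizing such a view does. It is not the actual cht-core view code: the function name `freetextKeys`, the document shape, and the whitespace-only tokenizing are all assumptions made for illustration.

```typescript
// Hypothetical, simplified stand-in for the map step of a freetext view.
// NOT the real cht-core view: field handling and tokenizing are assumptions.
type Doc = Record<string, unknown>;

const MIN_WORD_LENGTH = 3; // words shorter than this are never indexed

function freetextKeys(doc: Doc): string[] {
  const keys: string[] = [];
  for (const field of Object.keys(doc)) {
    const value = doc[field];
    if (typeof value !== 'string') {
      continue; // only string fields are tokenized in this sketch
    }
    for (const word of value.toLowerCase().split(/\s+/)) {
      if (word.length >= MIN_WORD_LENGTH && keys.indexOf(word) === -1) {
        keys.push(word);
      }
    }
  }
  return keys;
}

// "li" and "na" are dropped, so searching for them finds nothing.
const indexed = freetextKeys({ type: 'person', name: 'Li Na Okafor' });
// indexed is ['person', 'okafor']
```

With a cutoff like this, a contact whose name consists only of short words produces no name keys at all, which is the kind of problem described in the linked issue.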

A more functional full-text search would be useful to users and perhaps Mango indexes could get us that without sacrificing much in terms of performance, complexity, or DB index/view size. The _find docs note:

"Mango wraps several index types, starting with the Primary Index out-of-the-box. Mango indexes, with index type json, are built using MapReduce Views."

I expect this means that Mango indexes on the Couch server leverage Erlang views, which could be more performant. But we would need to measure the performance impacts for both the server and the offline Pouch implementation (since that will not be built on Erlang).
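For reference, here is a rough sketch of what the request bodies for a Mango-based lookup might look like. It assumes a prefix search on a single `name` field, which is only one possible approach and not necessarily what the blog post does. The index name `contacts-by-name`, the design doc name `freetext-mango`, and the helper functions are hypothetical. The body shapes follow CouchDB's `POST /{db}/_index` and `POST /{db}/_find` endpoints.

```typescript
// Body for POST /{db}/_index (creates a json-type Mango index).
interface MangoIndexRequest {
  index: { fields: string[] };
  ddoc: string;
  name: string;
  type: 'json';
}

// Body for POST /{db}/_find.
interface MangoFindRequest {
  selector: Record<string, unknown>;
  use_index?: string | [string, string];
  limit?: number;
}

// Hypothetical names, chosen only for this example.
const DDOC = 'freetext-mango';
const INDEX_NAME = 'contacts-by-name';

function buildIndexRequest(field: string): MangoIndexRequest {
  return { index: { fields: [field] }, ddoc: DDOC, name: INDEX_NAME, type: 'json' };
}

// Prefix match expressed as a range so the index can serve it.
// '\ufff0' is the usual "high" character for closing a CouchDB key range.
function buildPrefixQuery(field: string, prefix: string, limit = 25): MangoFindRequest {
  return {
    selector: { [field]: { $gte: prefix, $lt: prefix + '\ufff0' } },
    use_index: [DDOC, INDEX_NAME],
    limit,
  };
}

const indexBody = buildIndexRequest('name');
const findBody = buildPrefixQuery('name', 'oka');
```

Note that a range like this matches on the start of the whole field value, not on individual words inside it, so it is not a drop-in replacement for the current word-based view. Whether it performs better on the server and in Pouch is exactly what would need measuring.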

Additional context

@ChinHairSaintClair this might be of interest to you since it touches on things we discussed in this thread!

jkuester commented 7 months ago

Another thing that just struck me is that a Mango index that is implemented with Erlang views on the server and JS views on Pouch is basically the (already implemented) abstraction layer that @nydr and I were dreaming about a while back (where you could define your query once and have it automatically translated into different view structures on the client vs server).
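A minimal sketch of that "define once, run on both sides" idea, under the assumption that both sides expose something shaped like a Mango `find` call. The `FindBackend` interface, `searchContacts`, and `RecordingBackend` are hypothetical names. The recording class is only an in-memory stand-in used to show that both sides would receive the identical query.

```typescript
interface FindRequest {
  selector: Record<string, unknown>;
  limit?: number;
}

interface FindResult {
  docs: unknown[];
}

// Anything that can answer a Mango-style query (hypothetical interface).
interface FindBackend {
  find(request: FindRequest): Promise<FindResult>;
}

// The query is defined exactly once, independent of where it runs.
function nameStartsWith(prefix: string): FindRequest {
  return { selector: { name: { $gte: prefix, $lt: prefix + '\ufff0' } }, limit: 25 };
}

function searchContacts(backend: FindBackend, prefix: string): Promise<FindResult> {
  return backend.find(nameStartsWith(prefix));
}

// In-memory stand-in that just records what it was asked.
class RecordingBackend implements FindBackend {
  public requests: FindRequest[] = [];

  find(request: FindRequest): Promise<FindResult> {
    this.requests.push(request);
    return Promise.resolve({ docs: [] });
  }
}

const offline = new RecordingBackend(); // stands in for a local Pouch database
const online = new RecordingBackend(); // stands in for the Couch server
searchContacts(offline, 'oka');
searchContacts(online, 'oka');
```

In a real setup, a PouchDB database with the pouchdb-find plugin, or a thin wrapper around `POST /{db}/_find`, could plausibly fill the `FindBackend` role, with each side building its own index behind the same query.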