medic / cht-core

The CHT Core Framework makes it faster to build responsive, offline-first digital health apps that equip health workers to provide better care in their communities. It is a central resource of the Community Health Toolkit.
https://communityhealthtoolkit.org
GNU Affero General Public License v3.0

Make the Search service return more relevant results #2257

Open SCdF opened 8 years ago

SCdF commented 8 years ago

Right now the free-text portion of the search service works by querying a view that has an emit for every "token" of every document (that we care about, i.e. all Contacts). If a value is deemed to be a string, it is split on spaces and emitted once for each of the resulting tokens.

It is done this way because we do not impose a particular schema, so there is no "white list" of fields that we should be searching.
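A minimal sketch of that kind of schema-less map function, for illustration only. This is not the actual cht-core view: the skipped keys, the lowercasing, and the `emit` parameter (passed in so the function can run outside CouchDB) are all assumptions.

```javascript
// Hypothetical sketch of a schema-less freetext map function.
// In CouchDB, `emit` is a global; here it is a parameter so the sketch runs in plain Node.
const SKIPPED_KEYS = ['_id', '_rev', 'type']; // assumed list, not taken from cht-core

function mapFreetext(doc, emit) {
  Object.keys(doc).forEach((key) => {
    if (SKIPPED_KEYS.includes(key)) {
      return;
    }
    const value = doc[key];
    if (typeof value === 'string') {
      // Strings are split on whitespace: one emit per resulting token.
      value.toLowerCase().split(/\s+/).filter(Boolean).forEach((token) => emit([token], null));
    } else if (typeof value === 'number') {
      emit([String(value)], null);
    }
  });
}

// Collect the emitted rows for one example contact.
const freetextRows = [];
const exampleContact = { _id: 'c1', type: 'person', name: 'Victor Hugo', notes: 'lives near Victor Road' };
mapFreetext(exampleContact, (key, value) => freetextRows.push({ id: exampleContact._id, key, value }));
```

Because every string field is indexed, the contact above is emitted under "victor" twice: once from its name and once from its notes. Nothing in the row says which field the token came from.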

The downside of this is that it often isn't amazing at relevance. It returns everything associated with your particular query.

This is particularly obvious now that we use the Search service in dropdowns:

image

He's there, just buried:

image

We need to work out how to make Search more relevant.

Options

Weighted tokens

One idea is to make the view "weighted". For every token, the view would emit([doc._id, token], 1). We would then group by the first key element (the document id) and reduce by summing the values (a 1 for each result).
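A rough in-memory sketch of that grouping, assuming the rows have already been emitted in the `[doc._id, token]` shape described above. The function name, the example ids, and the step that filters rows by the query token are assumptions; in CouchDB the summing would be done by a reduce function rather than a `Map`.

```javascript
// Hypothetical rows, as if produced by emit([doc._id, token], 1).
const weightedRows = [
  { key: ['p-victor', 'victor'], value: 1 },
  { key: ['p-victor', 'mbeki'], value: 1 },
  { key: ['p-child', 'anna'], value: 1 },
  { key: ['p-child', 'victor'], value: 1 }, // from the parent's name
  { key: ['p-note', 'victor'], value: 1 },
  { key: ['p-note', 'victor'], value: 1 }, // token appears twice in this document
];

// Group by document id and sum the 1s for rows matching the query token.
function countMatches(rows, query) {
  const counts = new Map();
  for (const { key: [docId, token], value } of rows) {
    if (token !== query) {
      continue;
    }
    counts.set(docId, (counts.get(docId) || 0) + value);
  }
  // Highest count first.
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

const rankedByCount = countMatches(weightedRows, 'victor');
```

In this sketch `p-note` ranks first with a count of 2, while `p-victor` and `p-child` tie on 1, so counting alone cannot separate a person named Victor from a person whose parent is named Victor.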

Or something else?

However, I'm not convinced that weighted tokens will necessarily work for us in this case. That algorithm gives preference to documents that contain the token more often, but that doesn't necessarily make them more relevant. In the above example all of those results probably contain the same number of "victor" tokens.

There is currently undocumented sneaky hax in there, so you can do this:

image

I wonder if we need to think more in this direction?

SCdF commented 8 years ago

cc. @garethbowen

estellecomment commented 8 years ago

How important is it to get super-search? Maybe just indexing contact names would meet user expectations. Search for "Victor", get people and places named “XXXvictorYYY”, done.
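A minimal sketch of what such a name-only search could look like, assuming contacts are plain objects with a `name` field. The function name and the case-insensitive substring match are assumptions made for illustration, not existing cht-core behaviour.

```javascript
// Hypothetical name-only search: match the query anywhere inside the name, ignoring case.
function searchByName(contacts, query) {
  const needle = query.trim().toLowerCase();
  if (!needle) {
    return [];
  }
  return contacts.filter(
    (contact) => typeof contact.name === 'string' && contact.name.toLowerCase().includes(needle)
  );
}

const sampleContacts = [
  { _id: 'a', name: 'Victor Mbeki' },
  { _id: 'b', name: 'Anna Mbeki', notes: 'daughter of Victor' },
  { _id: 'c', name: 'Victoria Clinic' },
];

const nameMatches = searchByName(sampleContacts, 'victor').map((contact) => contact._id);
```

Here the notes field is never consulted, so contact `b` is not returned even though its notes mention Victor.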

What are the user stories for search queries that are not contact names? Searching in the notes to find a person by place, because you described where they lived in the notes? Searching by phone number? Searching by parent place is covered by the filters (select your branch/clinic and you get a subset). @diannakane, @abbyad, thoughts?

I vote for implementing a fast simple search, and waiting to see if anyone complains (and/or running user studies in a couple months to discover if there's a pain point, because people might not realize they have pain points there)

And/or, add a separate search page for advanced search, that's super slow and crashes your browser if you're not on macbook pro, but hey, it's advanced!

abbyad commented 8 years ago

From what I remember, the current behaviour is by design because we wanted all related people to show up. This is also the case for the search box in the Contact tab filter bar. It would be a good time for the design team to reevaluate this.

cc @diannakane

estellecomment commented 8 years ago

We could also search for all names within your parent places and contact persons, but only name field.

SCdF commented 8 years ago

> And/or, add a separate search page for advanced search, that's super slow and crashes your browser if you're not on macbook pro, but hey, it's advanced!

We're talking about the search service here, not any particular search page. Searches can be a whole page affair like the Contacts page or the Reports page, or they can be inside select boxes, or <insert thing we haven't used it for yet>.

estellecomment commented 8 years ago

> We're talking about the search service here, not any particular search page

Well, make two different search services then, or different indexes for same service, or different args for same service (don't know that much about how it works sorry)

garethbowen commented 8 years ago

@SCdF and I have been talking about ways to make this awesomer. I quite like the deep indexing and advanced search feature, and it's something we supported in 0.4 via Lucene. I think it's great that right now you can configure contacts to have some arbitrary field that's specific to your deployment and search against it. There are two problems with this currently: performance (which is being covered by other issues), and relevance (this issue).

I vote for some investigation into weighted tokens. If we can make the freetext view order results based on where the token is found (e.g. name is a heavy weighting, parent's name is a light weighting), then in this example Victor will show up at the top.
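A minimal sketch of field-based weighting, done in memory rather than in a view. The weight values, the dotted field paths, and the function names are all assumptions chosen for illustration; the thread does not specify them.

```javascript
// Hypothetical weights: a match in the contact's own name counts far more than one in the parent's name.
const FIELD_WEIGHTS = { 'name': 10, 'parent.name': 2 };
const DEFAULT_WEIGHT = 1; // any other string field

function tokenize(text) {
  return text.toLowerCase().split(/\s+/).filter(Boolean);
}

// Walk the document and add the field's weight for every string field containing the query token.
function scoreDoc(doc, query) {
  const needle = query.toLowerCase();
  let score = 0;
  const visit = (value, path) => {
    if (typeof value === 'string') {
      if (tokenize(value).includes(needle)) {
        score += FIELD_WEIGHTS[path] || DEFAULT_WEIGHT;
      }
    } else if (value && typeof value === 'object') {
      Object.keys(value)
        .filter((key) => !key.startsWith('_')) // skip _id, _rev
        .forEach((key) => visit(value[key], path ? `${path}.${key}` : key));
    }
  };
  visit(doc, '');
  return score;
}

function rankByFieldWeight(docs, query) {
  return docs
    .map((doc) => ({ id: doc._id, score: scoreDoc(doc, query) }))
    .filter((result) => result.score > 0)
    .sort((a, b) => b.score - a.score);
}

const weightedDocs = [
  { _id: 'child', name: 'Anna Mbeki', parent: { name: 'Victor Mbeki' } },
  { _id: 'victor', name: 'Victor Mbeki', parent: { name: 'Mbeki Household' } },
];

const rankedByField = rankByFieldWeight(weightedDocs, 'victor');
```

With these assumed weights the contact actually named Victor scores 10 and sorts ahead of the child, who scores 2 through the parent's name.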

diannakane commented 8 years ago

We could even look at delivering the search results within labeled sections: "names with...", "parent's name with...". What other types of tokens could be returned?
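A small sketch of how results might be bucketed into such sections, assuming each section is defined by a label and a function that picks one field from a contact. The section labels, field layout, and function name are hypothetical.

```javascript
// Hypothetical section definitions: label -> function that extracts the field to search.
const SECTION_FIELDS = {
  'names with': (doc) => doc.name,
  "parent's name with": (doc) => doc.parent && doc.parent.name,
};

// Return, for each section label, the ids of documents whose field contains the query.
function groupMatchesByField(docs, query, sections) {
  const needle = query.toLowerCase();
  const grouped = {};
  for (const [label, getField] of Object.entries(sections)) {
    grouped[label] = docs
      .filter((doc) => {
        const value = getField(doc);
        return typeof value === 'string' && value.toLowerCase().includes(needle);
      })
      .map((doc) => doc._id);
  }
  return grouped;
}

const sectionDocs = [
  { _id: 'victor', name: 'Victor Mbeki', parent: { name: 'Mbeki Household' } },
  { _id: 'child', name: 'Anna Mbeki', parent: { name: 'Victor Mbeki' } },
  { _id: 'other', name: 'Grace Otieno' },
];

const sectionedResults = groupMatchesByField(sectionDocs, 'victor', SECTION_FIELDS);
```

A document can appear in more than one section if several of its fields match; whether that is desirable is a design question this sketch does not settle.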

We definitely need to understand more when search is used and what users are looking for when they use it. Not immediately low-hanging fruit, in my opinion, and should be looked at in conjunction with work on filters, which is our other means of finding information you are looking for. Adding to design investigations.

abbyad commented 1 year ago

@garethbowen is this issue still relevant, or can we close it?

garethbowen commented 1 year ago

Yes it's still relevant as the behaviour is unchanged. There doesn't seem to be much demand to improve it though...

michaelkohn commented 1 year ago

With the search bar redesign we did, we have been monitoring usage of the Search feature on the contacts and reports tabs and paying extra attention to it during shadowing sessions... and in general usage is not high, but it's an area that is on our radar as we see users scrolling a lot (instead of using search) to find households on the contacts tab. I'm not convinced that its lack of usage is due to it not working as expected or desired... I'd agree that it's often less effort (at least on a phone) to scroll through a few pages (50-120 households) of alphabetical content than to tap into a search bar, have the keyboard cover a bunch of content, type out a few characters of what I'm looking for, hit a button to execute the search and then look through the result set... or go through all that again to search with updated terms. We haven't looked closely into usage of search from other areas of the system though (admin pages, messaging, etc...).

All that to say... at least from a health worker perspective, Search isn't a heavily used feature yet and we (Care Teams) don't have any immediate plans to address the way it works.