api-collab / api-collab-server


Search - Word Autocomplete #36

Closed kand617 closed 6 years ago

kand617 commented 6 years ago

As a user, when I type a partial word into the search box, I would like the system to provide suggestions. This frees me from having to remember exact names. Although I could already use search operators such as valu*, having the system suggest the word would be a nicer user experience.

Note that this is not a prediction system like Google search, but rather a simple word autocomplete. For example, startD would suggest startDate, startDay, creditStartDate, etc.

kand617 commented 6 years ago

Possible Naive Implementation:

1) Process each document to form a global dictionary of words, and store it in a table.
2) Each word becomes a row in Hibernate Search.
3) As the user types, we take the partial word and return the top 5 matching complete words.
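
As a rough illustration of the three steps above, here is a minimal in-memory sketch. It assumes a case-insensitive "contains" match (so that startD also finds creditStartDate) and ranks prefix matches first; the class and method names are hypothetical, and an in-memory set stands in for the table and Hibernate Search.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

// Hypothetical sketch: an in-memory stand-in for the word table.
class WordAutocomplete {
    // Global dictionary of distinct words collected from all documents.
    private final TreeSet<String> dictionary = new TreeSet<>();

    // Step 1: split a document into identifier-like words and store them.
    void addDocument(String text) {
        for (String word : text.split("[^A-Za-z0-9_]+")) {
            if (!word.isEmpty()) {
                dictionary.add(word);
            }
        }
    }

    // Step 3: return up to `limit` words that contain the partial word,
    // ignoring case. Prefix matches come first, then alphabetical order.
    List<String> suggest(String partial, int limit) {
        String needle = partial.toLowerCase(Locale.ROOT);
        List<String> matches = new ArrayList<>();
        for (String word : dictionary) {
            if (word.toLowerCase(Locale.ROOT).contains(needle)) {
                matches.add(word);
            }
        }
        matches.sort(Comparator
                .comparing((String w) -> !w.toLowerCase(Locale.ROOT).startsWith(needle))
                .thenComparing(Comparator.<String>naturalOrder()));
        return new ArrayList<>(matches.subList(0, Math.min(limit, matches.size())));
    }
}
```

With a dictionary built from text containing startDate, startDay and creditStartDate, calling suggest("startD", 5) would return those three words with the two prefix matches first.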

The downside of this is that there are a lot of rows, which may be overkill. Alternatively, we could do the following.

1) Create a global list of words, and maybe store it on the client side.
2) Use a JavaScript library to perform a fuzzy search on the list of strings, e.g. https://github.com/mattyork/fuzzy or http://fusejs.io/

Downside:

Plus side:

sudhirtumati commented 6 years ago

Storing the dictionary in the database and querying it every time would be overkill, I agree. At the same time, letting the client manage it presents a different set of issues.

Does Hibernate Search offer any features to address this use case? In my view, this should be explored first, before any alternative solutions.

kand617 commented 6 years ago

I could be wrong, but Hibernate Search works on entities: a search on one or more attributes can only return whole entities. That is useful when autocomplete is based on the title, which is not our case.

With the client side approach,

kand617 commented 6 years ago

Actually, I have another solution that might be a lot cleaner.

Pros:

Cons:

This library looks ideal: https://github.com/xdrop/fuzzywuzzy

FuzzySearch.extractTop("goolge", ["google", "bing", "facebook", "linkedin", "twitter", "googleplus", "bingnews", "plexoogl"], 3)
[(string: google, score: 83, index: 0), (string: googleplus, score: 63, index:5), (string: plexoogl, score: 43, index: 7)]
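
To show the general idea behind this kind of ranking without pulling in the library, here is a small self-contained sketch. It is not fuzzywuzzy's scoring: it assumes a plain Levenshtein edit distance scaled to 0..100, so its scores will differ from the ones quoted above. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: rank candidate words by edit-distance similarity.
class FuzzyRanker {
    // Classic Levenshtein distance, computed with two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Similarity from 0 (nothing in common) to 100 (identical strings).
    static int score(String query, String candidate) {
        int longest = Math.max(query.length(), candidate.length());
        if (longest == 0) {
            return 100;
        }
        return (int) Math.round(100.0 * (longest - editDistance(query, candidate)) / longest);
    }

    // Return the `limit` best-scoring choices, highest score first.
    static List<String> top(String query, List<String> choices, int limit) {
        List<String> ranked = new ArrayList<>(choices);
        ranked.sort(Comparator.comparingInt((String c) -> score(query, c)).reversed());
        return new ArrayList<>(ranked.subList(0, Math.min(limit, ranked.size())));
    }
}
```

For the misspelled query "goolge" and the same list of choices, this sketch also ranks "google" first, although the remaining order and the scores are specific to this simplified measure.
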
kand617 commented 6 years ago

@sudhirtumati you mentioned you know of a way to work with the Lucene index too. Care to suggest the alternative?

sudhirtumati commented 6 years ago

@kand617 Here you go. These might get you started

sudhirtumati commented 6 years ago

https://wiki.apache.org/lucene-java/TheBasics

kand617 commented 6 years ago

Issue resolved with PR #37. Enhancements to the quality will be done in subsequent PRs.