olivernn / lunr.js

A bit like Solr, but much smaller and not as bright
http://lunrjs.com
MIT License

how to do auto suggest? #471

Open hoogw opened 4 years ago

hoogw commented 4 years ago

Issue #211 has a solution for version 1 (old) that does auto suggest with tokenStore.expand.

However, I use the newer version 2, which has tokenSet and no expand method.

Can you suggest how to do auto suggest with tokenSet?

Originally posted by @olivernn in https://github.com/olivernn/lunr.js/issues/211#issuecomment-199231163:

So, to some extent, this is possible right now. An index has a tokenStore, which is a trie of all the tokens that are in the index. It has a method, expand, that will find all tokens with a given prefix, e.g.

idx.tokenStore.expand("foo") // returns an array of all known tokens with "foo" as a prefix

That said, there are some caveats. Specifically, the tokens in the tokenStore are the result of running the original tokens through the index's pipeline. With a default pipeline this means that the tokens are stemmed, stripped of stop words, and trimmed of what lunr thinks are redundant characters. You can see the result of this in the example. Open up the console and try the following:

idx.tokenStore.expand("check")

The data in the example index isn't the cleanest, but you can see the impact that stemming has on the tokens.
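To make the prefix lookup concrete, here is a minimal sketch of a trie with an expand method. It is a stand-in written for illustration, not lunr's TokenStore: the PrefixTrie class and the sample tokens are hypothetical, and the tokens are written the way stemmed output might look.

```javascript
// Minimal prefix trie for illustration only (NOT lunr's TokenStore implementation).
class PrefixTrie {
  constructor() {
    this.root = { children: new Map(), isToken: false };
  }

  // Insert one token, creating child nodes character by character.
  add(token) {
    let node = this.root;
    for (const ch of token) {
      if (!node.children.has(ch)) {
        node.children.set(ch, { children: new Map(), isToken: false });
      }
      node = node.children.get(ch);
    }
    node.isToken = true;
  }

  // Return every stored token that starts with `prefix`.
  expand(prefix) {
    let node = this.root;
    for (const ch of prefix) {
      node = node.children.get(ch);
      if (!node) return []; // no token has this prefix
    }
    const results = [];
    const walk = (current, path) => {
      if (current.isToken) results.push(path);
      for (const [ch, child] of current.children) walk(child, path + ch);
    };
    walk(node, prefix);
    return results;
  }
}

// Hypothetical tokens, shaped like the output of a stemming pipeline.
const trie = new PrefixTrie();
["check", "checkout", "chees", "token"].forEach((t) => trie.add(t));

const expanded = trie.expand("check");
console.log(expanded); // [ 'check', 'checkout' ]
```

Because the trie only ever sees pipeline output, a suggestion list built this way contains stems such as "chees" rather than the words a user typed into the documents.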

Alternatively you could create an index specifically for autocomplete, perhaps something like the following:

var autoCompleteIdx = lunr(function () {
  this.field('text')
})

With the documents structured like this:

{
  id: "word",
  text: "word",
}

Having the document ID also be the word means that the results of the search will contain the original word you inserted, but the lookup will make use of any stemming. So, in this example, a lookup for "words" would suggest the document with ref "word":

autoCompleteIdx.search("words") 
[{
  ref: "word",
  score: 0.123456789
}]
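The word-as-ref idea can be sketched without lunr as a map from stem to original word. This is only an illustration under stated assumptions: toyStem is a hypothetical stand-in that strips a trailing "s" (lunr's real stemmer does much more), and buildSuggestionMap and lookup are names invented here.

```javascript
// Toy stemmer for illustration only: lowercases and strips one trailing "s".
// This is an assumption standing in for lunr's real stemmer.
function toyStem(word) {
  const lower = word.toLowerCase();
  return lower.length > 3 && lower.endsWith("s") ? lower.slice(0, -1) : lower;
}

// Map each stem to the original words that produced it,
// mirroring documents shaped like { id: word, text: word }.
function buildSuggestionMap(words) {
  const byStem = new Map();
  for (const word of words) {
    const stem = toyStem(word);
    if (!byStem.has(stem)) byStem.set(stem, []);
    byStem.get(stem).push(word);
  }
  return byStem;
}

// Look up by the stem of the query, but return the original (unstemmed) words.
function lookup(byStem, query) {
  return byStem.get(toyStem(query)) || [];
}

const suggestions = buildSuggestionMap(["word", "search", "index"]);
console.log(lookup(suggestions, "words")); // [ 'word' ]
```

The point of the design is that the stem is used only as the lookup key, while the value handed back to the user is the word exactly as it was inserted.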

I think the key difficulty lunr has with this is that it doesn't really store the words in a document, it converts the document to a form that is easy to index and search, without any regard for actually getting the same words out again.

@missinglink I'd be interested in hearing some ideas that you have. I'm actually fairly far along now with some changes to the implementation of lunr, so hold off on making large changes to the current implementation. I'm hoping to get something out for people to preview in the next couple of weeks though.


hoogw commented 4 years ago

I kind of found a solution to my own question.

Here are 2 of my working examples 👍. You can turn auto suggest on/off by clicking the switch at the top middle.

1000+ records https://transparentgov.net/json2tree/esri/arcgisServerList.html

8000+ records https://transparentgov.net/json2tree/esri/hub.site.static.html?filter_by=

First, suggest is different from search.

Search looks for matching records:

    _idx_results = idx.search(your_key_word)  // returns all records that match

Suggest looks for matching tokens (stemmed words):

    // idx.tokenSet holds every token (stemmed keyword) in the index; intersect it with the suggest keyword
    _suggest_results_exact_match = idx.tokenSet.intersect(lunr.TokenSet.fromString(_suggest_keyword)).toArray();

The suggest result does NOT give you records; it gives you an array of tokens (stemmed words).

Note: do not forget to append a wildcard to the keyword:

    _idx_results = idx.search(your_key_word + "*")

    _suggest_results = idx.tokenSet.intersect(lunr.TokenSet.fromString(_suggest_keyword + "*")).toArray();
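The difference the wildcard makes can be sketched as follows. This is a rough illustration, not lunr code: toSuggestPattern and matchTokens are hypothetical helpers, matchTokens imitates the intersect step over a plain array and understands only a single trailing "*", and the token list is made up. It also assumes the indexed tokens are lowercase, as they are with lunr's default tokenizer.

```javascript
// Turn raw user input (e.g. the text typed into a search box) into a suggest pattern.
// A trailing "*" asks for every token that starts with the input.
function toSuggestPattern(rawInput) {
  const cleaned = rawInput.trim().toLowerCase();
  return cleaned === "" ? "" : cleaned + "*";
}

// Hypothetical stand-in for idx.tokenSet.intersect(...).toArray() over a plain array.
// Only a single trailing "*" is understood here.
function matchTokens(tokens, pattern) {
  if (pattern === "") return []; // nothing typed, nothing to suggest
  if (pattern.endsWith("*")) {
    const prefix = pattern.slice(0, -1);
    return tokens.filter((t) => t.startsWith(prefix));
  }
  return tokens.filter((t) => t === pattern); // exact match only
}

// Made-up tokens, shaped like stemmed index tokens.
const indexedTokens = ["map", "mapl", "market", "park"];

console.log(matchTokens(indexedTokens, "ma"));                     // []
console.log(matchTokens(indexedTokens, toSuggestPattern(" Ma "))); // [ 'map', 'mapl', 'market' ]

// With a real lunr 2.x index, the matching step would be the call from this post:
// idx.tokenSet.intersect(lunr.TokenSet.fromString(toSuggestPattern(input))).toArray()
```

Without the wildcard, a partially typed keyword matches only a token that is identical to it, which is why the exact-match variant usually returns nothing while the user is still typing.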

This is my complete code.

For search: [image]

For suggest: [image]

KulkiratSingh commented 3 years ago

Can you post the entire code that implements the autocomplete feature? Where does _suggest_keyword come from? I mean, what is _suggest_keyword?

chrisbartley commented 3 years ago

Maybe useful: https://github.com/olivernn/lunr.js/issues/287#issuecomment-454923675