Reusing previously generated index file

tesla-cat commented 4 years ago

Description

I intend to use Fuse with Google Firebase.

in the doc:

const myIndex = Fuse.createIndex(options.keys, books)
const myFuse = new Fuse(books, options, myIndex)

With this approach, we need to re-collect all books and regenerate the index whenever a new book is added.

Describe the solution you'd like

Is it possible to:

load the old index file,
add one new book to it,
and then regenerate the new index file?

Also, in the line const myFuse = new Fuse(books, options, myIndex), why do we need all the books as an argument? Why not use some information, like an ID, that is already accessible from myIndex, as Lunr.js does?

Thank you !

krisk commented 4 years ago

Feature already exists:

Whenever you add/remove an item via the functions add or removeAt, it will automatically also update the index. You can always get the newly generated index file via fuse.getIndex().
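As a rough sketch of the workflow described above (load a saved index, add one item, read back the updated index): `add` and `getIndex` are the methods named in this reply, while `Fuse.parseIndex` and `index.toJSON()` are assumed from the Fuse.js indexing docs and should be checked against your installed version. The helper name `addBookAndSnapshot` and its parameters are hypothetical.

```javascript
// Sketch: reuse a saved Fuse index, add one book, and snapshot the updated index.
// `Fuse` is passed in (e.g. the result of require('fuse.js')) so the sketch makes
// no assumption about how the library is loaded in your project.
function addBookAndSnapshot(Fuse, books, options, savedIndexJson, newBook) {
  // Assumption: Fuse.parseIndex() rebuilds an index from the plain object
  // previously produced by index.toJSON().
  const index = Fuse.parseIndex(savedIndexJson);

  // The list of books is still passed in alongside the prebuilt index.
  const fuse = new Fuse(books, options, index);

  // Per the reply above, add() also updates the index.
  fuse.add(newBook);

  // getIndex() returns the updated index; toJSON() is assumed to make it
  // serializable (e.g. for JSON.stringify and storage).
  return { fuse, indexJson: fuse.getIndex().toJSON() };
}

// Possible usage (not executed here):
//   const Fuse = require('fuse.js')
//   const saved = Fuse.createIndex(options.keys, books).toJSON()
//   const { fuse, indexJson } = addBookAndSnapshot(Fuse, books, options, saved, newBook)
```

Note that this still hands the full list to the constructor; it only avoids rebuilding the index from scratch.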

tesla-cat commented 4 years ago

> Feature already exists:
>
> Whenever you add/remove an item via the functions add or removeAt, it will automatically also update the index. You can always get the newly generated index file via fuse.getIndex().

Interesting! You do realize there is no add or removeAt in the documentation, right? 😂

tesla-cat commented 4 years ago

Also, flexsearch.js published this benchmark table. Any rebuttal from Fuse.js?

krisk commented 4 years ago

Would need to dig in on how the comparison is made. They’re also using an older version of fuse, which has since gone through several performance improvements.

krisk commented 4 years ago

@tesla-cat, I took a look at the performance comparison they're running.

I'm not seeing a fuzzy-search (with actual typos) performance check in there (does the library support it? See https://github.com/nextapps-de/flexsearch/issues/118). The performance test always runs exact-match searches for the following queries:

var text_queries = "gulliver;great;country;time;people;little;master;took;feet;houyhnhnms".split(";");

Notably, it seems like flexsearch.js pre-generates a dictionary of all the words in the list, with the word as the key and the locations where it appears as the value. For exact-string search, this is always an O(1) lookup (i.e., map[<exact_word>]), and thus always faster than what Fuse.js does, which is fuzzy matching.

In the test, as soon as I introduce a typo, for example:

var text_queries = "guliver" // one L

flexsearch.js returns 0 results, and still shows 500k+ op/s, while Fuse.js returns actual results. So, from the looks of it, I'm not sure whether this is an adequate comparison to make.
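To illustrate the difference, here is a minimal sketch of a word-to-positions dictionary next to a fuzzy scan. It is not the code of either library: the sample text, function names, and the Levenshtein-based matcher are assumptions chosen for illustration, and Fuse.js uses its own matching algorithm.

```javascript
// Exact-match lookup via a prebuilt word -> positions map, versus a fuzzy scan.
const words = "gulliver took a little time in the great country".split(" ");

// Build the dictionary once: each word maps to the positions where it occurs.
const positions = new Map();
words.forEach((word, i) => {
  if (!positions.has(word)) positions.set(word, []);
  positions.get(word).push(i);
});

// Exact lookup: a single hash probe, independent of how many words are indexed.
function exactSearch(query) {
  return positions.get(query) || [];
}

// Levenshtein edit distance (illustrative stand-in, not Fuse.js's matcher).
function editDistance(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Fuzzy lookup: has to compare the query against every distinct word.
function fuzzySearch(query, maxDistance = 1) {
  return [...positions.keys()].filter(
    (word) => editDistance(query, word) <= maxDistance
  );
}
```

With this sketch, `exactSearch("guliver")` finds nothing because the misspelled key is simply not in the map, while `fuzzySearch("guliver")` still matches "gulliver" at the cost of scanning every word.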

tesla-cat commented 4 years ago

@krisk

Hey, thank you for teaching me this kind of fun stuff. That table looked absurd to me from the very beginning: 1000 times better than everyone else? That must be either Turing's work or an ignorant joke.

great work from you !

tesla-cat commented 4 years ago

@krisk

and also by the way, you have the coolest github profile photo i have seen so far !

exogenesys commented 4 years ago

@krisk @tesla-cat

Can't I search with just the index I've previously created? Is it necessary to get all the books every time I'm trying to use an old index? It seems odd and wasteful.

krisk / Fuse

Reusing previously generated index file #436