olivernn / lunr.js

A bit like Solr, but much smaller and not as bright
http://lunrjs.com
MIT License

make indexing faster (not an issue) #99

Closed palominoz closed 9 years ago

palominoz commented 10 years ago

Hello there. I am trying to use lunr to index ~40,000 documents. Indexing gets about 10x slower by the time the cursor is at about 25,000. Any suggestions on how to speed up this process?

jure commented 10 years ago

Could you try profiling (e.g. with Chrome's Dev Tools) to see what actually causes the slowdown?

A short introduction: https://developer.chrome.com/devtools/docs/cpu-profiling

olivernn commented 10 years ago

As mentioned by @jure some profiling would be really helpful here. Also, how large are the documents that you are trying to index?

palominoz commented 10 years ago

Unfortunately it's quite difficult, since the environment is Node and node-inspector has some issues with memory profiling. But I would say the token store is getting quite big: I saw idx.tokenStore.length > 300k. One thing I didn't mention is that the process gets steadily slower, up to 10x.

Documents have 4 string attributes: 3 of them are < 20 chars, the last is > 20 chars.
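
When a full profiler is awkward to attach in Node, a rough alternative is to time fixed-size batches of documents and log heap usage after each one, which at least shows where the slowdown starts. The following is a minimal sketch, not part of lunr: `timeBatches` and the `addDoc` callback are hypothetical names, and the stand-in callback would presumably be replaced by something like `doc => idx.add(doc)` against a real index.

```javascript
const { performance } = require('perf_hooks');

// Hypothetical helper (not part of lunr): feeds docs to `addDoc` in
// fixed-size batches and records the elapsed time and heap size per batch.
function timeBatches(docs, addDoc, batchSize) {
  const timings = [];
  for (let start = 0; start < docs.length; start += batchSize) {
    const end = Math.min(start + batchSize, docs.length);
    const t0 = performance.now();
    for (let i = start; i < end; i++) {
      addDoc(docs[i]);
    }
    timings.push({
      start: start,
      count: end - start,
      ms: performance.now() - t0,
      heapUsed: process.memoryUsage().heapUsed // bytes, sampled after the batch
    });
  }
  return timings;
}

// Stand-in for the real indexing call (assumption: with lunr this would be
// a callback that adds the document to the index).
const seen = [];
const docs = Array.from({ length: 2500 }, (_, i) => ({ id: i, title: 'doc ' + i }));
const timings = timeBatches(docs, doc => seen.push(doc.id), 1000);

for (const t of timings) {
  const mb = (t.heapUsed / 1048576).toFixed(1);
  console.log(`docs ${t.start}..${t.start + t.count - 1}: ${t.ms.toFixed(2)} ms, heap ${mb} MB`);
}
```

If the per-batch time climbs steadily while the batch size stays fixed, that points at a cost that grows with index size rather than with individual documents.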

olivernn commented 9 years ago

0.5.7 includes the changes from #124 which should provide a reasonable increase in performance when indexing.

olivernn commented 9 years ago

Closing. A number of changes have been introduced that should make indexing faster. There is always scope to make it faster still, but I would need some benchmarking/profiling to help identify the specific issues in this case. Please do re-open if you are still having problems though.