do-me / SemanticFinder

SemanticFinder - frontend-only live semantic search with transformers.js
https://do-me.github.io/SemanticFinder/
MIT License

Performance Improvements #43

Closed do-me closed 8 months ago

do-me commented 10 months ago

Orama

I just found Orama, a dependency-free TS-based vector DB which could be used instead of a plain JSON object.

I haven't found anything about its performance yet, so I guess we should run our own tests and see whether the performance gains or convenience features like data import/export make the switch worthwhile. @VarunNSrivastava if you already have any opinions here, let me know!

Other

Besides, I noticed that we could almost double the speed of the cosine similarity function we use atm: we currently recalculate the magnitude of the user query embedding on every iteration/comparison, instead of calculating it once, persisting it, and reusing it in the function.
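A minimal sketch of the idea (function names are illustrative, not SemanticFinder's actual code): since the query embedding never changes during a search, its magnitude can be computed once up front and passed into every comparison, leaving only the per-document dot product and magnitude to compute in the loop.

```javascript
// Compute the Euclidean magnitude (L2 norm) of a vector.
function magnitude(vec) {
  let sum = 0;
  for (let i = 0; i < vec.length; i++) sum += vec[i] * vec[i];
  return Math.sqrt(sum);
}

// Cosine similarity where the query magnitude is precomputed and reused,
// so it is not recalculated for every document comparison.
function cosineSimilarity(query, doc, queryMagnitude) {
  let dot = 0;
  let docSumSq = 0;
  for (let i = 0; i < query.length; i++) {
    dot += query[i] * doc[i];
    docSumSq += doc[i] * doc[i];
  }
  return dot / (queryMagnitude * Math.sqrt(docSumSq));
}

// Compute the query magnitude once, then reuse it for every embedding:
const query = [1, 2, 3];
const docs = [[1, 2, 3], [3, 2, 1]];
const qMag = magnitude(query);
const scores = docs.map((d) => cosineSimilarity(query, d, qMag));
```

Since half of the magnitude work in each comparison disappears, this is where the "almost double the speed" estimate comes from.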

varunneal commented 10 months ago

Makes sense! That seems like a reasonable performance increase, though most of the compute time is of course spent calculating the vector embeddings.

do-me commented 8 months ago

Speed improvements

While developing https://github.com/do-me/cordis-semantic-search I noticed that search is almost instant even for 134k embeddings. On my i7 laptop, it takes like 1.5 seconds on average.

After indexing, when dealing with large document bodies, the current constant HTML updating logic is the main bottleneck. I think it'd be alright to simply update the HTML cards on the right only once, when finished. Alternatively, we could hardcode a fixed interval and update, say, every 50k iterations or so.
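The batching idea above can be sketched as follows (the interval value, names, and the stand-in render function are all illustrative assumptions, not SemanticFinder's actual code): the similarity loop only touches the DOM every `UPDATE_INTERVAL` iterations, plus once at the end, instead of on every iteration.

```javascript
// Illustrative interval; in practice this would be a tuned or user-set parameter.
const UPDATE_INTERVAL = 50000;

// Stand-in for the expensive DOM update (re-rendering the result cards).
let renderCalls = 0;
function renderResults(results) {
  renderCalls++;
}

// Score all items, but only re-render the UI every UPDATE_INTERVAL iterations
// and once more when the loop finishes.
function search(scores) {
  const results = [];
  for (let i = 0; i < scores.length; i++) {
    results.push(scores[i]); // cheap per-item work (similarity score)
    if ((i + 1) % UPDATE_INTERVAL === 0) {
      renderResults(results); // rare, expensive DOM update
    }
  }
  renderResults(results); // final update when finished
  return results;
}

// With 134k items, renderResults runs only 3 times instead of 134k times.
search(Array.from({ length: 134000 }, (_, i) => i));
```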

I'll go ahead and test a little.

Update:

I just saw that the parameter already exists :smile:

varunneal commented 8 months ago

Makes sense to turn that parameter up

do-me commented 8 months ago

Just corrected the settings in https://github.com/do-me/SemanticFinder/commit/52989b65737f4e0bd06bdc51209df44b7237695f:

This makes a huge (!) difference for large, already-indexed files!

Referring to this tweet: instead of 60s it's now around 3 seconds 🚀!

I'll run a few benchmarks and write something about it. Closing this issue as it's pretty much as fast as it gets atm.

Maybe a WebAssembly/Rust version could accelerate things further, but with runtimes of around a second there is not much left to gain.