FgForrest / evitaDB

evitaDB is a specialized database with an easy-to-use API for e-commerce systems. It is a low-latency NoSQL in-memory engine that handles all the complex tasks that e-commerce systems have to deal with on a daily basis. evitaDB is expected to act as a fast secondary lookup/search index used by front stores.
https://evitadb.io

Fulltext support #258

Open novoj opened 1 year ago

novoj commented 1 year ago

Full-text search is one of the key requirements for e-commerce catalogs, but it is also very hard to implement. Currently our string-based search is sub-optimal and doesn't use any specialized index. We have some spike tests using a radix tree, which give quite good results that we could start with. But the path from there to a full-text engine is long and demanding.
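To make the index idea concrete, here is a minimal sketch of prefix lookup over a plain character trie, which is the uncompressed cousin of a radix tree (a radix tree additionally merges single-child chains into one edge). This is not the spike code mentioned above and not evitaDB API; the class name `PrefixTrie` and its methods are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration only; not part of evitaDB. */
final class PrefixTrie {

    /** One node per character; TreeMap keeps children sorted so results come back in order. */
    private static final class Node {
        final Map<Character, Node> children = new TreeMap<>();
        boolean terminal; // true when an inserted value ends at this node
    }

    private final Node root = new Node();

    /** Indexes a value, case-insensitively. */
    void insert(String value) {
        Node node = root;
        for (char c : value.toLowerCase(Locale.ROOT).toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.terminal = true;
    }

    /** Returns all indexed values that start with the given prefix, in sorted order. */
    List<String> startingWith(String prefix) {
        List<String> result = new ArrayList<>();
        String normalized = prefix.toLowerCase(Locale.ROOT);
        Node node = root;
        // Walk down the trie along the prefix; the cost depends on the prefix length,
        // not on the number of indexed values.
        for (char c : normalized.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return result; // no value has this prefix
            }
        }
        collect(node, new StringBuilder(normalized), result);
        return result;
    }

    /** Depth-first traversal collecting every terminal node below the given one. */
    private static void collect(Node node, StringBuilder path, List<String> out) {
        if (node.terminal) {
            out.add(path.toString());
        }
        for (Map.Entry<Character, Node> entry : node.children.entrySet()) {
            path.append(entry.getKey());
            collect(entry.getValue(), path, out);
            path.setLength(path.length() - 1); // backtrack
        }
    }
}
```

A structure like this only answers "starts with" queries. Tokenization, ranking, typo tolerance, and stemming are where the real full-text effort lies.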

We should also investigate embedding the Lucene engine, although this would complicate our transaction handling and also the planned replication across the cluster.

Another path worth investigating is semantic search using LLM embeddings, which might cover gaps in our keyword full-text search capabilities. But this is also very hard to implement (all current implementations use an HNSW index for this kind of search).
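For orientation, the query an HNSW index speeds up is "give me the k vectors most similar to this one". The sketch below answers it with an exact brute-force scan using cosine similarity. It is explicitly not HNSW, only the baseline that such an index approximates with far fewer comparisons. The class and method names are hypothetical, and it assumes embeddings are already available as `float[]` arrays of equal length.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration only: exact top-k search by cosine similarity. */
final class BruteForceVectorSearch {

    /** Cosine similarity of two equally sized vectors; 0 when either vector has zero length. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the positions of the k vectors most similar to the query, best first. */
    static List<Integer> topK(float[] query, List<float[]> vectors, int k) {
        // Score every vector: O(n * d) work per query, which is what an ANN index avoids.
        double[] scores = new double[vectors.size()];
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < vectors.size(); i++) {
            scores[i] = cosine(query, vectors.get(i));
            ids.add(i);
        }
        // Sort by descending similarity (stable, so ties keep insertion order).
        ids.sort(Comparator.comparingDouble((Integer i) -> -scores[i]));
        int limit = Math.max(0, Math.min(k, ids.size()));
        return new ArrayList<>(ids.subList(0, limit));
    }
}
```

A linear scan like this is fine for small catalogs and useful as a correctness check when evaluating an approximate index, but its cost grows with every stored vector.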

novoj commented 1 year ago

Tip for exploration: https://github.com/benldr/JPruningRadixTrie

novoj commented 1 year ago

To evaluate: https://foojay.io/today/jvector-1-0/#VectorSearch

novoj commented 11 months ago

To read: https://db.in.tum.de/~leis/papers/ART.pdf

novoj commented 5 months ago

Again: https://github.com/jbellis/jvector