feature: add embedding vector database

tiddly-gittly / tiddlywiki-plugins

Plugins developed for a better TiddlyGit.

https://tiddly-gittly.github.io/tiddlywiki-plugins/

MIT License


feature: add embedding vector database #98

Open linonetwo opened 1 year ago

linonetwo commented 1 year ago

Description

So that we can use it as LLM memory or to speed up searching.
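To illustrate what "LLM memory" or faster search would boil down to, here is a minimal sketch of nearest-neighbour lookup over embedding vectors using cosine similarity. The `MemoryEntry` type, the `searchMemory` function, and the toy three-dimensional vectors are all assumptions made for illustration; a real vector database would replace this brute-force scan with an index.

```typescript
// Hypothetical types and names; not part of any existing plugin.
interface MemoryEntry {
  title: string;
  vector: number[];
}

// Cosine similarity between two vectors of equal dimension.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force search: score every entry, then keep the topK most similar.
function searchMemory(entries: MemoryEntry[], query: number[], topK: number): MemoryEntry[] {
  return entries
    .map((entry) => ({ entry, score: cosineSimilarity(entry.vector, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((scored) => scored.entry);
}

// Toy data: in practice the vectors would come from an embedding model.
const memory: MemoryEntry[] = [
  { title: "Cooking notes", vector: [1, 0, 0] },
  { title: "Travel plan", vector: [0, 1, 0] },
  { title: "Recipe ideas", vector: [0.9, 0.1, 0] },
];
const nearest = searchMemory(memory, [1, 0, 0], 2);
```

Here `nearest` holds the two entries whose vectors point in the direction closest to the query, which is the same operation a vector database performs at scale.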

Additional Context

No response

linonetwo commented 1 year ago

Weaviate is not an option: its embedded version only supports Linux. https://weaviate.io/developers/weaviate/installation/embedded

linonetwo commented 1 year ago

Milvus Lite can be used from Python: https://milvus.io/docs/milvus_lite.md#Configure-Milvus-Lite

linonetwo commented 1 year ago

Maybe https://github.com/asg017/sqlite-vss is the best choice for an embedding vector database.

linonetwo commented 1 year ago

https://github.com/belladoreai/llama-tokenizer-js/issues/5#issuecomment-1592971359 We need to use something like Word2Vec.

linonetwo commented 1 year ago

Description

So that searching and auto-generation work better. The model can be downloaded on demand and reused after TidGi is updated. So could the model file live in the user/document folder?
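A minimal sketch of the "download on demand, reuse after update" decision, assuming the model directory lives outside the app's install folder so that an app update does not delete it. The `ModelDescriptor` shape, the `.bin` file naming, and the `example.com` URL are hypothetical placeholders, not anything TidGi defines.

```typescript
// Hypothetical descriptor of an embedding model that is fetched on demand.
interface ModelDescriptor {
  name: string;
  version: string;
  url: string;
}

type ModelPlan =
  | { action: "reuse"; file: string }
  | { action: "download"; file: string; url: string };

// Decide whether a model file already present in the user's model folder can
// be reused, or whether it still has to be downloaded.
// `existingFiles` is the listing of that folder (assumed to be kept outside
// the application bundle, e.g. somewhere under the user's documents).
function planModelSetup(model: ModelDescriptor, existingFiles: ReadonlySet<string>): ModelPlan {
  // Including the version in the file name lets a newer model coexist with an old one.
  const file = `${model.name}-${model.version}.bin`;
  if (existingFiles.has(file)) {
    return { action: "reuse", file };
  }
  return { action: "download", file, url: model.url };
}

// Placeholder model; the URL is illustrative only.
const wordVectors: ModelDescriptor = {
  name: "word-vectors",
  version: "1.0",
  url: "https://example.com/models/word-vectors-1.0.bin",
};

const firstRun = planModelSetup(wordVectors, new Set<string>());
const afterAppUpdate = planModelSetup(wordVectors, new Set(["word-vectors-1.0.bin"]));
```

On the first run the plan asks for a download; after an app update the file is still in the user folder, so the plan is to reuse it.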

Additional Context

While chatting about SRS in the QQ group, I came to think it is better not to rely on the user's brain for memorizing.

https://github.com/tiddly-gittly/TidGi-Desktop/issues/334

linonetwo commented 1 year ago

https://github.com/Embedding/Chinese-Word-Vectors

linonetwo commented 1 year ago

Use llama-node embedding https://llama-node.vercel.app/docs/backends/llama.cpp/embeddings

linonetwo commented 7 months ago

This should be a plugin that provides an indexer.
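A rough sketch of what such an indexer might do: keep one embedding vector per tiddler and refresh it whenever the tiddler changes. The `EmbeddingIndexer` class and its `rebuild`/`update` methods are hypothetical and only loosely inspired by the idea of an indexer; they are not TiddlyWiki's actual indexer module API, and the toy `embed` function stands in for a real embedding model.

```typescript
// Hypothetical function type: turns text into an embedding vector.
type EmbedFunction = (text: string) => number[];

// Hypothetical change record: `text` is undefined when the tiddler was deleted.
interface TiddlerChange {
  title: string;
  text?: string;
}

class EmbeddingIndexer {
  private readonly vectors = new Map<string, number[]>();
  private readonly embed: EmbedFunction;

  constructor(embed: EmbedFunction) {
    this.embed = embed;
  }

  // Recompute the whole index from scratch, e.g. on wiki startup.
  rebuild(tiddlers: Iterable<{ title: string; text: string }>): void {
    this.vectors.clear();
    for (const tiddler of tiddlers) {
      this.vectors.set(tiddler.title, this.embed(tiddler.text));
    }
  }

  // Incrementally apply a single change instead of rebuilding everything.
  update(change: TiddlerChange): void {
    if (change.text === undefined) {
      this.vectors.delete(change.title);
    } else {
      this.vectors.set(change.title, this.embed(change.text));
    }
  }

  lookup(title: string): number[] | undefined {
    return this.vectors.get(title);
  }

  get size(): number {
    return this.vectors.size;
  }
}

// Toy embedding for demonstration only: [text length, number of spaces].
const toyEmbed: EmbedFunction = (text) => [text.length, text.split(" ").length - 1];

const indexer = new EmbeddingIndexer(toyEmbed);
indexer.rebuild([
  { title: "HelloThere", text: "hello world" },
  { title: "GettingStarted", text: "start here" },
]);
indexer.update({ title: "HelloThere", text: "hi" }); // edited tiddler
indexer.update({ title: "GettingStarted" }); // deleted tiddler
```

Keeping `update` incremental is the main point of an indexer: only the changed tiddler is re-embedded, which matters when computing an embedding is expensive.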

linonetwo commented 7 months ago

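Semantic image search: once the sqlite-vss vector database is embedded, we can follow https://github.com/EdVince/CLIP-ImageSearch-NCNN and generate a vector for every image and cache it, so that image content can be searched directly.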
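The caching part of this idea can be sketched as follows, assuming each image is identified by a hash of its content so that unchanged images are never re-embedded. `ImageVectorCache`, the hash string, and the fake embedding function are hypothetical; a real setup would call a CLIP-style model and persist the vectors in the vector database instead of an in-memory `Map`.

```typescript
// Hypothetical in-memory cache of image embeddings, keyed by a content hash.
class ImageVectorCache {
  private readonly cache = new Map<string, number[]>();

  // Return the cached vector for this image, computing it only on a cache miss.
  getOrCompute(contentHash: string, compute: () => number[]): number[] {
    const cached = this.cache.get(contentHash);
    if (cached !== undefined) {
      return cached;
    }
    const vector = compute();
    this.cache.set(contentHash, vector);
    return vector;
  }
}

// Stand-in for an expensive image embedding call; counts how often it runs.
let embedCalls = 0;
const fakeEmbedImage = (): number[] => {
  embedCalls += 1;
  return [0.1, 0.2, 0.3];
};

const imageCache = new ImageVectorCache();
const firstLookup = imageCache.getOrCompute("hash-of-cat-photo", fakeEmbedImage);
const secondLookup = imageCache.getOrCompute("hash-of-cat-photo", fakeEmbedImage);
```

The second lookup returns the stored vector without calling the embedding function again, which is what makes searching image content cheap after the first pass.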