jon-chuang opened 1 year ago
Interesting, like a Rust-specific version of this?
It would also be interesting if we could use this to store prompts and their outputs in a database, and press the up key to re-use previous prompts or their outputs. But we wouldn't even need a vector database for that specifically; we could do it with a typical SQL database.
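A minimal sketch of that idea in std-only Rust (the `PromptHistory` type and its methods are hypothetical illustrations, not existing llama-rs API; a SQL table like `(prompt TEXT PRIMARY KEY, output TEXT)` could back the same interface):

```rust
use std::collections::HashMap;

/// Hypothetical prompt/output store: exact-match cache lookup plus
/// up-arrow-style recall of previous prompts.
struct PromptHistory {
    entries: Vec<(String, String)>,    // insertion order, for history recall
    by_prompt: HashMap<String, usize>, // exact-match lookup
}

impl PromptHistory {
    fn new() -> Self {
        Self { entries: Vec::new(), by_prompt: HashMap::new() }
    }

    fn record(&mut self, prompt: &str, output: &str) {
        self.by_prompt.insert(prompt.to_string(), self.entries.len());
        self.entries.push((prompt.to_string(), output.to_string()));
    }

    /// Exact-match cache hit, as `SELECT output WHERE prompt = ?` would give.
    fn lookup(&self, prompt: &str) -> Option<&str> {
        self.by_prompt.get(prompt).map(|&i| self.entries[i].1.as_str())
    }

    /// The prompt `n` steps back in history, like pressing the up key n times.
    fn previous(&self, n: usize) -> Option<&str> {
        self.entries
            .len()
            .checked_sub(n)
            .and_then(|i| self.entries.get(i))
            .map(|(p, _)| p.as_str())
    }
}

fn main() {
    let mut h = PromptHistory::new();
    h.record("what is rust?", "a language");
    h.record("what is llama?", "a model");
    println!("{:?}", h.lookup("what is rust?"));
    println!("{:?}", h.previous(1));
}
```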
We would also need to revisit https://github.com/rustformers/llama-rs/issues/56
> interesting, like a rust specific version of this?
Yes, there are many options available, but they mainly offer the same types of indexes.
> re-use previous prompts or their outputs
The problem with a hash table or KV store is that natural-language queries are rarely exactly the same, especially if you are not averaging over the human population but just running locally.
Milvus has already promoted similarity-search-based "caching" as one of its applications (repo).
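A std-only Rust sketch of that similarity-search caching idea, with brute-force cosine similarity standing in for a real vector index (function names and the threshold value are illustrative):

```rust
/// Cosine similarity between two embeddings of equal length.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Return a cached answer if any stored query embedding is close enough,
/// sidestepping the exact-match problem of a hash table or KV store.
fn cache_hit<'a>(
    cache: &'a [(Vec<f32>, String)],
    query: &[f32],
    threshold: f32,
) -> Option<&'a str> {
    cache
        .iter()
        .map(|(emb, out)| (cosine(emb, query), out))
        .filter(|(sim, _)| *sim >= threshold)
        // take the most similar entry above the threshold
        .max_by(|(a, _), (b, _)| a.partial_cmp(b).unwrap())
        .map(|(_, out)| out.as_str())
}

fn main() {
    let cache = vec![(vec![1.0, 0.0], "cached answer".to_string())];
    // A near-duplicate query hits; an unrelated one misses.
    println!("{:?}", cache_hit(&cache, &[0.9, 0.1], 0.9));
    println!("{:?}", cache_hit(&cache, &[0.0, 1.0], 0.9));
}
```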
I think this is out of scope for this repository specifically. I could see a batteries-included implementation being built atop llama-rs, but it's unlikely to feature an implementation of a vector database itself, because our focus is specifically on robust, fast inference of LLMs.
> our focus is specifically on robust, fast inference of LLMs.
Yes, but I think the broader focus is "low-resource, low-dependency embedded LLM toolchain".
I can definitely see the sliced-out k-ANN code existing in a separate repo (perhaps under this org), compiled in as an optional dependency to llama-rs, and available in the CLI (on crates.io it would be `cargo install llama-rs --features "knowledge-base"`).
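One way that optional-dependency layout could look in llama-rs's `Cargo.toml` (the `llama-rs-knn` crate name is purely illustrative; no such crate is implied to exist):

```toml
[dependencies]
# hypothetical sliced-out k-ANN crate, pulled in only when requested
llama-rs-knn = { version = "0.1", optional = true }

[features]
knowledge-base = ["dep:llama-rs-knn"]
```

Users would then opt in with `cargo install llama-rs --features "knowledge-base"`, and the default build would carry no extra weight.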
> "low-resource, low-dependency embedded LLM toolchain"
Literally what I was thinking of yesterday.
I've made an issue here to sound out the idea: https://github.com/ggerganov/llama.cpp/issues/930
> I can definitely see the sliced-out k-ANN code existing in a separate repo (perhaps under this org), compiled in as an optional dependency to llama-rs, and available in the CLI (on crates.io it would be `cargo install llama-rs --features "knowledge-base"`)
Sure, but I don't see why it would have to be part of llama-rs specifically. The CLI is really just a demo application for the library; it doesn't aspire to higher functionality than that.
I'm not opposed to having this kind of functionality - having a full-stack solution for using an LLM to do knowledge-base inference would be great - but I think it's a hard sell to make it part of this crate specifically. By analogy, we're like hyper, not reqwest: we're not trying to solve all the problems, just the core problem that enables other people to solve their problems.
> but I think it's a hard sell to make it part of this crate specifically.
I'm in agreement here. But do you think the rustformers org more generally could be expanded to this broader scope of a low-resource LLM toolchain, and host the broader-scoped llama-rs-toolchain?
Sorry - meant to get back to you earlier. Yeah, I think having this as part of a larger solution would be great. I've created this repository to track issues that aren't directly related to llama-rs, but are for the ecosystem around it.
Has anyone experimented with this? Are there any estimates on how much work it would be?
I’ve not experimented, but it’s on my (currently very long) todo list. I estimate it could be a week of work to get the code in place, but it may take some additional experimentation with prompting (e.g. to emit a sequence of tokens indicating a search action) to get the models to work well with the knowledge base.
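As a sketch of the prompting side, one assumed convention (entirely hypothetical - this is not something llama-rs implements) is that the model emits a line like `SEARCH: <query>` when it wants a knowledge-base lookup, which the driver loop then intercepts:

```rust
/// Hypothetical convention: scan a model's output for a search directive
/// of the form "SEARCH: <query>" and extract the query, if any.
fn extract_search_query(model_output: &str) -> Option<&str> {
    model_output
        .lines()
        .find_map(|line| line.strip_prefix("SEARCH: "))
        .map(str::trim)
}

fn main() {
    let output = "I don't know that offhand.\nSEARCH: llama context length";
    // The driver would run the query against the vector index and feed
    // the retrieved documents back into the prompt.
    println!("{:?}", extract_search_query(output));
    println!("{:?}", extract_search_query("A plain answer, no lookup."));
}
```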
I’ll hopefully get to it once I’m back from holiday.
Any updates on this? @jon-chuang
@jon-chuang @hhamud & @philpax I've taken a dig at porting chroma to rust: https://gist.github.com/itsbalamurali/118e7ce18f1519f26780b9845dee4e87 has the basic structure to it.
Nice, do you have an actual full repo to share rather than just a gist?
I am interested in implementing a rust knowledge base for llms
Cozo might be useful. I'm totally out-of-the-loop, so it might not work for what you're looking for. I figured I'd share just in case.
I implemented an in-memory version of this as part of Floneum. Here is the relevant code: https://github.com/floneum/floneum/blob/master/plugin/src/vector_db.rs
instant-distance is fairly easy to work with and actively maintained.
Cozo might be useful. I'm totally out-of-the-loop, so it might not work for what you're looking for. I figured I'd share just in case.
Thanks, Cozo is very interesting, and it might solve the use case I was thinking of.
I saw https://github.com/tensorchord/pgvecto.rs today - it fits the bill of “rust only”. (Admittedly I am too new to this field to even fully understand if this is relevant or not, but in case someone might find it useful)
I think the next step in the project is a lightweight ANN (approximate k-nearest-neighbours search) vector database. Applications:
Details:
Problem definition:
Options:
Plan:
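As a baseline for the plan above, exact k-NN by brute force in std-only Rust (O(n·d) per query); an ANN index such as HNSW approximates this, trading a little recall for sublinear query time:

```rust
/// Return the indices of the k nearest vectors in `db` to `query`,
/// by exhaustive squared-Euclidean distance.
fn knn(db: &[Vec<f32>], query: &[f32], k: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = db
        .iter()
        .enumerate()
        .map(|(i, v)| {
            let d = v
                .iter()
                .zip(query)
                .map(|(a, b)| (a - b) * (a - b))
                .sum::<f32>();
            (i, d)
        })
        .collect();
    // sort ascending by distance, keep the k closest
    scored.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}

fn main() {
    let db = vec![vec![0.0, 0.0], vec![1.0, 1.0], vec![5.0, 5.0]];
    println!("{:?}", knn(&db, &[0.9, 0.9], 2));
}
```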