exoascension / vault-chat

A ChatGPT bot trained on your vault notes. Ask your AI questions about your own thoughts and ideas!
GNU General Public License v3.0
113 stars, 7 forks

Block vectors #19

Closed by kristenbrann 1 year ago

kristenbrann commented 1 year ago

Embeddings now at the file AND block level

This updates the database to store embeddings on files AND embeddings on chunks of text within the files. Each database entry now looks like this:

{
    md5hash: string;
    embedding: Vector;
    chunks: [{
        contents: string;
        embedding: Vector;
    }]
}
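
As a rough illustration of why both levels are useful, here is a minimal sketch of a two-stage lookup: rank files by their file-level embedding, then rank the chunks inside the best files. Only the entry shape comes from the description above. `Vector` as `number[]`, the path-keyed `Database` record, `cosineSimilarity`, and `findBestChunks` are all assumptions made for the example, not the plugin's actual code.

```typescript
// Assumption: Vector is a plain number array; the PR does not show its definition.
type Vector = number[];

interface Chunk {
  contents: string;
  embedding: Vector;
}

// Mirrors the entry shape described above.
interface FileEntry {
  md5hash: string;
  embedding: Vector; // embedding of the whole file
  chunks: Chunk[];   // embeddings of blocks of text within the file
}

// Hypothetical storage layout: entries keyed by file path.
type Database = Record<string, FileEntry>;

interface ChunkHit {
  path: string;
  contents: string;
  score: number;
}

// Cosine similarity of two equal-length vectors; returns 0 for a zero vector.
function cosineSimilarity(a: Vector, b: Vector): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Stage 1: keep the `topFiles` files closest to the query.
// Stage 2: score every chunk in those files and return them best-first.
function findBestChunks(db: Database, query: Vector, topFiles = 3): ChunkHit[] {
  const rankedFiles = Object.keys(db)
    .map((path) => ({ path, entry: db[path], score: cosineSimilarity(query, db[path].embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topFiles);

  const hits: ChunkHit[] = [];
  for (const { path, entry } of rankedFiles) {
    for (const chunk of entry.chunks) {
      hits.push({ path, contents: chunk.contents, score: cosineSimilarity(query, chunk.embedding) });
    }
  }
  return hits.sort((x, y) => y.score - x.score);
}
```

The file-level pass keeps the chunk comparison cheap on large vaults. Whether the plugin searches this way is not stated in this PR.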

Check if files have changed

Stores an MD5 hash of each file's contents on its entry and compares it on open to decide whether an existing file needs to be reindexed.
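
The check can be sketched roughly as follows, assuming Node's built-in `crypto` module is available and the hash covers the file's text contents. The helper names `md5` and `needsReindex` are hypothetical, not taken from the plugin.

```typescript
import { createHash } from "crypto";

// Hex-encoded MD5 digest of a file's text contents.
function md5(text: string): string {
  return createHash("md5").update(text, "utf8").digest("hex");
}

// A file needs (re)indexing when it has no stored entry yet,
// or when its current hash differs from the one stored on the entry.
function needsReindex(
  stored: { md5hash: string } | undefined,
  currentContents: string
): boolean {
  return stored === undefined || stored.md5hash !== md5(currentContents);
}
```

MD5 works here as a cheap change detector. It is not being used for anything security-sensitive.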

Batch embeddings

The indexing function now calls the embeddings API with multiple texts at once instead of once per text. This saves a lot of time!
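
A minimal sketch of the batching idea, assuming the embeddings API accepts an array of texts and returns one vector per text in the same order. The actual API call is passed in as `embedBatch` so that no client details are invented here. `embedInBatches`, `EmbedBatch`, and the default `batchSize` of 100 are hypothetical.

```typescript
// Assumption: Vector is a plain number array.
type Vector = number[];

// Hypothetical wrapper around the embeddings API: many texts in, one vector per text out.
type EmbedBatch = (texts: string[]) => Promise<Vector[]>;

// Embeds all texts using one API call per batch instead of one per text.
// Output order matches input order.
async function embedInBatches(
  texts: string[],
  embedBatch: EmbedBatch,
  batchSize = 100
): Promise<Vector[]> {
  if (batchSize < 1) throw new Error("batchSize must be at least 1");
  const vectors: Vector[] = [];
  for (let start = 0; start < texts.length; start += batchSize) {
    const batch = texts.slice(start, start + batchSize);
    const result = await embedBatch(batch);
    if (result.length !== batch.length) {
      throw new Error("embedBatch returned a different number of vectors than texts");
    }
    for (const vector of result) vectors.push(vector);
  }
  return vectors;
}
```

With five texts and a `batchSize` of 2, this makes three calls rather than five. The saving grows with the number of chunks in the vault.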

Screenshot 🤭

[Image: Screen Shot 2023-03-16 at 5 21 48 PM]

Misc