To keep subsequent operations tractable on light hardware, embeddings should be cached. The goal is to maintain a dictionary that maps each file to its precomputed embedding, so that later operations can load the embeddings instead of computing them again.
This cache manager should:
- remove the embeddings of files that have been deleted since the last run
- add new dictionary entries for files that have been created since the last run
- update the dictionary entries of files that have been modified since the last run
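
The three responsibilities above can be sketched as a single synchronization pass. This is a minimal sketch, not a definitive implementation: the function name `sync_cache`, the use of modification times to detect changes, the `*.md` file pattern, and the `embed` callable are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Callable, Dict, Tuple

# Hypothetical cache layout: file path -> (modification time, embedding).
Cache = Dict[str, Tuple[float, object]]


def sync_cache(root: str, cache: Cache, embed: Callable[[str], object]) -> Cache:
    """Bring `cache` in line with the Markdown files currently under `root`."""
    # Snapshot of the files that exist right now, with their modification times.
    current = {str(p): p.stat().st_mtime for p in Path(root).rglob("*.md")}

    # Remove entries for files that have been deleted.
    for path in list(cache):
        if path not in current:
            del cache[path]

    # Add entries for new files and refresh entries whose file has changed.
    for path, mtime in current.items():
        entry = cache.get(path)
        if entry is None or entry[0] != mtime:
            text = Path(path).read_text(encoding="utf-8")
            cache[path] = (mtime, embed(text))

    return cache
```

Comparing modification times is cheap but can miss edits that preserve the timestamp; hashing file contents would be a stricter alternative at the cost of reading every file on each pass.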
Embeddings should be computed with the sentence-bert module, as in MemNav. The dictionary can be stored as a pickle file in a hidden local folder.
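
A minimal sketch of the storage and embedding side follows. It assumes the `sentence-transformers` package as the sentence-bert implementation. The folder name `.embedding_cache`, the file name `embeddings.pkl`, and the model name `all-MiniLM-L6-v2` are placeholders chosen for illustration, not choices stated in this note or taken from MemNav.

```python
import pickle
from functools import lru_cache
from pathlib import Path

# Hypothetical location of the cache: a hidden folder in the working directory.
CACHE_FILE = Path(".embedding_cache") / "embeddings.pkl"


def load_cache(cache_file: Path = CACHE_FILE) -> dict:
    """Return the stored dictionary, or an empty one if nothing is stored yet."""
    if cache_file.exists():
        with cache_file.open("rb") as f:
            return pickle.load(f)
    return {}


def save_cache(cache: dict, cache_file: Path = CACHE_FILE) -> None:
    """Write the dictionary to disk, creating the hidden folder if needed."""
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    with cache_file.open("wb") as f:
        pickle.dump(cache, f)


@lru_cache(maxsize=1)
def _load_model(model_name: str):
    # Imported lazily so the cache can be read without loading the model.
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(model_name)


def embed_text(text: str, model_name: str = "all-MiniLM-L6-v2"):
    """Embed one document; the model name is an assumed placeholder."""
    return _load_model(model_name).encode(text)
```

Pickle files should only be loaded from a trusted location, since unpickling can execute arbitrary code. A cache folder written by the tool itself fits that constraint.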
Front matter should be ignored, and perhaps headings as well. Some Markdown-specific module might already be able to handle this.
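
Until a suitable Markdown module is chosen, the filtering could be approximated with regular expressions. The sketch below assumes YAML-style front matter delimited by `---` lines at the very top of the file and ATX-style headings (`#` to `######`). The function name `strip_markdown_metadata` is hypothetical.

```python
import re

# Front matter: a block delimited by '---' lines at the very start of the file.
FRONT_MATTER = re.compile(r"\A---[ \t]*\n(.*?\n)?---[ \t]*(?:\n|\Z)", re.DOTALL)

# ATX headings: one to six '#' characters followed by whitespace.
HEADING = re.compile(r"^#{1,6}\s.*$\n?", re.MULTILINE)


def strip_markdown_metadata(text: str, drop_headings: bool = True) -> str:
    """Remove leading front matter and, optionally, heading lines."""
    text = FRONT_MATTER.sub("", text, count=1)
    if drop_headings:
        text = HEADING.sub("", text)
    return text.strip()
```

This is deliberately naive: a line starting with `# ` inside a fenced code block would also be dropped, which a real Markdown parser would avoid.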