truefoundry/cognita
RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry
https://cognita.truefoundry.com
Apache License 2.0
3.32k stars
274 forks
Feat/optimize model gateway #398
Status: Closed (mnvsk97 closed 2 weeks ago)
mnvsk97 commented 2 weeks ago:
1. Avoid creating a model instance for every request; instead, cache instances by model name, config, and other metadata.
2. Add a simple local `dict`-based cache for embedding, LLM, reranker, and audio models.
3. Always check the cache before creating a model instance, to support item 1.
4. Add the `cachetools` library to support simple caching mechanisms now and more complex cases in the future.
5. Add documentation for each method in the file.
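The caching approach described above can be illustrated with a minimal sketch. This is not the code from the PR: `ModelCache`, `get_or_create`, `FakeEmbedder`, and the key layout (model type, model name, serialized config) are hypothetical names and choices made for illustration only. The sketch assumes that a model's config is JSON-serializable (values that are not are converted with `str`) and that two requests with the same type, name, and config can safely share one instance.

```python
import json
import threading
from typing import Any, Callable, Dict, Tuple

# Hypothetical key layout: (model type, model name, serialized config).
CacheKey = Tuple[str, str, str]


class ModelCache:
    """Dict-backed cache of model instances (illustrative, not the PR's code)."""

    def __init__(self) -> None:
        self._instances: Dict[CacheKey, Any] = {}
        self._lock = threading.Lock()

    @staticmethod
    def make_key(model_type: str, model_name: str, config: Dict[str, Any]) -> CacheKey:
        # Sort keys so that equal configs produce the same key regardless of
        # the order in which their entries were inserted.
        serialized = json.dumps(config, sort_keys=True, default=str)
        return (model_type, model_name, serialized)

    def get_or_create(
        self,
        model_type: str,
        model_name: str,
        config: Dict[str, Any],
        factory: Callable[[], Any],
    ) -> Any:
        """Return the cached instance, calling factory() only on a cache miss."""
        key = self.make_key(model_type, model_name, config)
        # The lock is held while the factory runs, so concurrent requests for
        # the same model cannot build it twice (at the cost of serializing
        # all model creation).
        with self._lock:
            if key not in self._instances:
                self._instances[key] = factory()
            return self._instances[key]


class FakeEmbedder:
    """Stand-in for a real embedding model client (hypothetical)."""

    def __init__(self, name: str, config: Dict[str, Any]) -> None:
        self.name = name
        self.config = config


cache = ModelCache()
created = []  # records every time the factory actually builds a model


def build_embedder(name: str, config: Dict[str, Any]) -> Callable[[], FakeEmbedder]:
    def factory() -> FakeEmbedder:
        created.append(name)
        return FakeEmbedder(name, config)

    return factory


cfg_a = {"dim": 8, "batch_size": 2}
cfg_b = {"batch_size": 2, "dim": 8}  # same content, different key order

first = cache.get_or_create("embedding", "example-embedder", cfg_a, build_embedder("example-embedder", cfg_a))
second = cache.get_or_create("embedding", "example-embedder", cfg_b, build_embedder("example-embedder", cfg_b))
```

Here `first` and `second` refer to the same object, because the second call hits the cache instead of invoking the factory. An unbounded `dict` never evicts entries; if eviction is needed, a bounded mapping such as `cachetools.LRUCache(maxsize=...)` or `cachetools.TTLCache(maxsize=..., ttl=...)` could plausibly replace the inner `dict`, since both behave as mutable mappings. Whether the PR uses `cachetools` this way is not stated in the description.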