Awesome project! I was really excited to see that you have support for the llama Python package.
Small suggestion: consider using LlamaCache, which adds an in-memory cache that significantly reduces token re-processing when prompts share a common prefix across calls. The change is minimal but requires setting an appropriate cache size in bytes (probably worth exposing as a user option; the default is 2 GB).
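For reference, here's a minimal sketch of what the change might look like, assuming llama-cpp-python's `Llama.set_cache()` and a `LlamaCache` that takes a `capacity_bytes` argument; the model path is just a placeholder:

```python
from llama_cpp import Llama, LlamaCache

# Placeholder path; substitute whatever model the project already loads.
llm = Llama(model_path="models/model.gguf")

# Attach an in-memory cache so repeated prompt prefixes reuse previously
# computed state instead of re-processing those tokens from scratch.
# capacity_bytes could be surfaced as a user option; 2 GiB shown here.
llm.set_cache(LlamaCache(capacity_bytes=2 << 30))
```

With the cache attached, subsequent calls to the same `llm` instance that start with an already-seen prompt prefix should skip straight past the cached tokens.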