Closed: chunhualiao closed this issue 10 months ago
I have a fix:
diff --git a/talk_codebase/llm.py b/talk_codebase/llm.py
index 9a26c4a..cb3b462 100644
--- a/talk_codebase/llm.py
+++ b/talk_codebase/llm.py
@@ -94,6 +94,7 @@ class LocalLLM(BaseLLM):
         model_n_batch = int(self.config.get("n_batch"))
         callbacks = CallbackManager([StreamStdOut()])
         llm = LlamaCpp(model_path=model_path, n_ctx=model_n_ctx, n_batch=model_n_batch, callbacks=callbacks, verbose=False)
+        llm.client.verbose = False
         return llm
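The idea behind the one-line fix above can be sketched with stand-in classes: `verbose=False` on the LangChain-style wrapper does not automatically propagate to the inner llama-cpp client, whose timing lines then interleave with the streamed answer. The classes below are illustrative mocks (not the real `langchain` or `llama_cpp` APIs), assuming only that the wrapper exposes the inner client as `.client`:

```python
class FakeLlamaClient:
    """Stands in for the inner llama-cpp client; emits timing lines when verbose."""
    def __init__(self, verbose=True):
        self.verbose = verbose

    def __call__(self, prompt):
        answer = "tokens in the given range"
        if self.verbose:
            # Mimics llama.cpp timing output that gets mixed into the answer.
            answer += "\nllama_print_timings: total time = 1234 ms"
        return answer


class FakeLlamaCppWrapper:
    """Stands in for the LlamaCpp wrapper; its verbose flag is independent
    of the inner client's flag, which defaults to True."""
    def __init__(self, verbose=False):
        self.verbose = verbose
        self.client = FakeLlamaClient(verbose=True)

    def ask(self, prompt):
        return self.client(prompt)


llm = FakeLlamaCppWrapper(verbose=False)
llm.client.verbose = False  # the fix: silence the inner client explicitly
print(llm.ask("What tokens are covered?"))
```

With the extra assignment, the returned answer contains no timing lines; without it, the timings appear even though the wrapper itself was constructed with `verbose=False`.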
Please create a PR: https://github.com/rsaryev/talk-codebase/pulls
Please update talk-codebase: pip install --upgrade talk-codebase==0.1.46
Model selected is Mini Orca (Small) | orca-mini-3b.ggmlv3.q4_0.bin | 1928446208 | 3 billion | q4_0 | OpenLLaMa
As you can see, the timing information is injected into the output just before the last word of the answer, "range."