Closed. ericcurtin closed this pull request 1 week ago.
This PR switches the chat implementation from llama-cli to llama-simple-chat, a simpler and more focused chat program from llama.cpp. The change involves updating the command-line arguments and upgrading the llama.cpp version in the container images.
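For a sense of what the argument change amounts to, here is a minimal sketch in Python of the before/after argument vectors. The flag values are illustrative assumptions, not copied from the diff: llama-simple-chat accepts `-m`, `-c`, and `-ngl`, while the old llama-cli invocation needed extra flags (such as `--in-prefix`, `--in-suffix`, and `--no-display-prompt`) to tame its interactive output.

```python
# Hypothetical sketch of the exec_args change in ramalama/model.py.
# Flag values are illustrative assumptions, not copied from the actual diff.


def chat_args_old(model_path: str) -> list[str]:
    # llama-cli needed extra flags to suppress prompt echoing and
    # prefix/suffix decorations in interactive mode.
    return [
        "llama-cli",
        "-m", model_path,
        "--in-prefix", "",
        "--in-suffix", "",
        "--no-display-prompt",
        "-c", "2048",
    ]


def chat_args_new(model_path: str) -> list[str]:
    # llama-simple-chat takes just the essentials: the model, the context
    # size, and how many layers to offload to the GPU.
    return [
        "llama-simple-chat",
        "-m", model_path,
        "-c", "2048",
        "-ngl", "999",
    ]


if __name__ == "__main__":
    print(" ".join(chat_args_new("/path/to/model.gguf")))
```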
```mermaid
sequenceDiagram
    actor User
    participant Model as Model
    participant LlamaSimpleChat as llama-simple-chat
    User->>Model: Run chat
    Model->>LlamaSimpleChat: Execute with common_params
    LlamaSimpleChat-->>Model: Processed chat response
    Model-->>User: Return chat response
```
```mermaid
classDiagram
    class Model {
        -exec_model_path
        -exec_args
        +run(args)
    }
    note for Model "Updated to use llama-simple-chat with common_params"
```
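To make the diagrams concrete, here is a minimal sketch of how Model.run could wire exec_model_path and exec_args together and hand control to llama-simple-chat. The method body and the use of os.execvp are assumptions for illustration; the real launch path in ramalama may differ.

```python
import os


class Model:
    """Illustrative sketch of the Model class shown in the diagrams above."""

    def __init__(self, exec_model_path: str):
        self.exec_model_path = exec_model_path
        self.exec_args: list[str] = []

    def run(self, args) -> None:
        # Build the llama-simple-chat invocation; the context size and GPU
        # layer count here are placeholder values.
        ctx = str(getattr(args, "context", 2048))
        self.exec_args = [
            "llama-simple-chat",
            "-m", self.exec_model_path,
            "-c", ctx,
            "-ngl", "999",
        ]
        # Replace the current process with the chat program so the user's
        # terminal talks to llama-simple-chat directly; this is the
        # "Execute with common_params" step in the sequence diagram.
        os.execvp(self.exec_args[0], self.exec_args)
```

Using os.execvp rather than a subprocess hands the terminal over to the chat program entirely, which matches the simple request/response flow in the sequence diagram.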
| Change | Details | Files |
| --- | --- | --- |
| Switch chat implementation from llama-cli to llama-simple-chat | | ramalama/model.py |
| Update llama.cpp version in container images | | container-images/cuda/Containerfile, container-images/ramalama/Containerfile |
This recently added llama-simple-chat program is better suited to our needs. Maybe we can start actively contributing to it; I already have:
@cooktheryan @lsm5 @mrunalp @slp @rhatdan @tarilabs @umohnani8 @ygalblum PTAL
LGTM
This is a new chat program in llama.cpp which is much simpler than the one we were using. It doesn't have the debug/verbose output problem and just seems higher quality in general for a simple chatbot; it's only a few hundred lines of code.
Summary by Sourcery
Switch to using llama-simple-chat for a more streamlined and higher quality chat experience, removing verbose options and updating container configurations.
New Features:
- Switch the chat implementation from llama-cli to the new llama-simple-chat program from llama.cpp.

Enhancements:
- Remove the verbose/debug output options that the previous llama-cli invocation required.

Build:
- Update the llama.cpp version in the CUDA and ramalama container images.