Closed. ericcurtin closed this pull request 1 week ago.
This PR switches the chat implementation from llama-cli to llama-simple-chat, a simpler and more focused chat program from llama.cpp. The change involves updating the command-line arguments and upgrading the llama.cpp version in the container images.
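For a sense of what the argument change amounts to, here is a minimal sketch in Python of the before/after argument vectors. The flag values are illustrative assumptions, not copied from the diff: llama-simple-chat accepts `-m`, `-c`, and `-ngl`, while the old llama-cli invocation needed extra flags (such as `--in-prefix`, `--in-suffix`, and `--no-display-prompt`) to tame its interactive output.

```python
# Hypothetical sketch of the exec_args change in ramalama/model.py.
# Flag values are illustrative assumptions, not copied from the actual diff.


def chat_args_old(model_path: str) -> list[str]:
    # llama-cli needed extra flags to suppress prompt echoing and
    # prefix/suffix decorations in interactive mode.
    return [
        "llama-cli",
        "-m", model_path,
        "--in-prefix", "",
        "--in-suffix", "",
        "--no-display-prompt",
        "-c", "2048",
    ]


def chat_args_new(model_path: str) -> list[str]:
    # llama-simple-chat takes just the essentials: the model, the context
    # size, and how many layers to offload to the GPU.
    return [
        "llama-simple-chat",
        "-m", model_path,
        "-c", "2048",
        "-ngl", "999",
    ]


if __name__ == "__main__":
    print(" ".join(chat_args_new("/path/to/model.gguf")))
```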
```mermaid
sequenceDiagram
    actor User
    participant Model as Model
    participant LlamaSimpleChat as llama-simple-chat
    User->>Model: Run chat
    Model->>LlamaSimpleChat: Execute with common_params
    LlamaSimpleChat-->>Model: Processed chat response
    Model-->>User: Return chat response
```
```mermaid
classDiagram
    class Model {
        -exec_model_path
        -exec_args
        +run(args)
    }
    note for Model "Updated to use llama-simple-chat with common_params"
```
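To make the diagrams concrete, here is a minimal sketch of how Model.run could wire exec_model_path and exec_args together and hand control to llama-simple-chat. The method body and the use of os.execvp are assumptions for illustration; the real launch path in ramalama may differ.

```python
import os


class Model:
    """Illustrative sketch of the Model class shown in the diagrams above."""

    def __init__(self, exec_model_path: str):
        self.exec_model_path = exec_model_path
        self.exec_args: list[str] = []

    def run(self, args) -> None:
        # Build the llama-simple-chat invocation; the context size and GPU
        # layer count here are placeholder values.
        ctx = str(getattr(args, "context", 2048))
        self.exec_args = [
            "llama-simple-chat",
            "-m", self.exec_model_path,
            "-c", ctx,
            "-ngl", "999",
        ]
        # Replace the current process with the chat program so the user's
        # terminal talks to llama-simple-chat directly; this is the
        # "Execute with common_params" step in the sequence diagram.
        os.execvp(self.exec_args[0], self.exec_args)
```

Using os.execvp rather than a subprocess hands the terminal over to the chat program entirely, which matches the simple request/response flow in the sequence diagram.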
| Change | Details | Files |
| --- | --- | --- |
| Switch chat implementation from llama-cli to llama-simple-chat | | ramalama/model.py |
| Update llama.cpp version in container images | | container-images/cuda/Containerfile, container-images/ramalama/Containerfile |
This recently added llama-simple-chat program is better suited to our needs. Maybe we can start actively contributing to it; I already have:
@cooktheryan @lsm5 @mrunalp @slp @rhatdan @tarilabs @umohnani8 @ygalblum PTAL
LGTM
This is a new chat program in llama.cpp which is much simpler than the one we were using. It doesn't have the debug/verbose output problem and just seems higher quality in general for a simple chatbot; it's only a few hundred lines of code.
Summary by Sourcery
Switch to using llama-simple-chat for a more streamlined and higher quality chat experience, removing verbose options and updating container configurations.
New Features:
- Switch the chat implementation from llama-cli to the new llama-simple-chat program from llama.cpp.

Enhancements:
- Remove the verbose/debug output options that the previous llama-cli invocation required.

Build:
- Update the llama.cpp version in the CUDA and ramalama container images.