HSV-AI / presentations

This repository is used to manage the presentations given at Huntsville AI meetups. It provides a collection of Issues, Cards, and Files to plan and create the content needed for a presentation.

5/01/2024 Llama-cpp-python #97

Closed jperiodlangley closed 1 week ago

jperiodlangley commented 1 month ago

Description

Working on this now. Looking at how to incorporate llama-cpp-python into the RAG application for SpaceApps. It provides a docker image with a server conforming to the OpenAI API, so it should be easy to connect to and use.

https://github.com/abetlen/llama-cpp-python
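Assuming the docker image's server is running locally on its default port (the `localhost:8000` base URL here is an assumption; check the image's docs), connecting via the OpenAI-compatible endpoint might look roughly like this standard-library-only sketch:

```python
import json
import urllib.request

def build_chat_request(prompt, base_url="http://localhost:8000/v1"):
    """Build an OpenAI-style chat-completion request for the local server."""
    payload = {
        "model": "local",  # local servers typically map or ignore this name
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

# With the server running, send the request and read the reply:
# with urllib.request.urlopen(build_chat_request("Hello")) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```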

Complete the following items to get a presentation ready for Huntsville AI

Adding material to the presentations repository

Add the file to present (prefer Jupyter Notebooks or Markdown-formatted files) to the folder structure. For multiple files, create a directory following the naming convention and add the files to it.

Naming convention

We use a convention of starting the filenames with a date (year/month/day) so that the files remain sorted by date even when listed alphabetically.

YYMMDD_Session_Description.extension
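As a sketch, the convention above can be checked with a small script (the regex is my reading of the `YYMMDD_Session_Description.extension` pattern, not an official repo rule):

```python
import re
from datetime import datetime

# Six date digits, an underscore-separated description, then an extension.
PATTERN = re.compile(r"^(\d{6})_[A-Za-z0-9_]+\.\w+$")

def follows_convention(filename):
    """Return True if filename matches YYMMDD_Session_Description.extension."""
    m = PATTERN.match(filename)
    if not m:
        return False
    try:
        # Verify the leading six digits are a real YYMMDD date.
        datetime.strptime(m.group(1), "%y%m%d")
    except ValueError:
        return False
    return True
```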

jperiodlangley commented 3 weeks ago

Outline provided by perplexity.ai

I'm giving a talk this week about llama-cpp-python. What outline should I use?

Here is a suggested outline for your talk on llama-cpp-python:

Introduction

Installation

Loading the Model

Using the API

Performance Considerations

Examples and Use Cases

Conclusion

The outline covers the key aspects of llama-cpp-python - installation, model loading, API usage, performance tuning, and real-world examples. Feel free to adjust the depth based on your target audience and available time. The search results provide good supporting information for each section.
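For the "Loading the Model" and "Using the API" sections of the outline, a minimal sketch of the high-level Python API follows. The model path and parameter values are placeholders, and the commented-out calls require llama-cpp-python and a GGUF model on disk:

```python
def llama_kwargs(model_path, n_ctx=2048, n_gpu_layers=0):
    """Collect constructor arguments for llama_cpp.Llama.

    n_ctx sets the context window; n_gpu_layers controls how many
    layers are offloaded to the GPU (-1 offloads all of them).
    """
    return {
        "model_path": model_path,
        "n_ctx": n_ctx,
        "n_gpu_layers": n_gpu_layers,
    }

# Actual usage (model filename below is a placeholder):
# from llama_cpp import Llama
# llm = Llama(**llama_kwargs("models/llama-2-7b.Q4_K_M.gguf", n_gpu_layers=-1))
# out = llm("Q: Name the planets in the solar system. A:",
#           max_tokens=64, stop=["Q:"])
# print(out["choices"][0]["text"])
```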

Citations:
[1] Llama.cpp - Outlines 〰️ https://outlines-dev.github.io/outlines/reference/models/llamacpp/
[2] Python bindings for llama.cpp - GitHub https://github.com/abetlen/llama-cpp-python
[3] How to Run LLaMA Locally on CPU or GPU - YouTube https://www.youtube.com/watch?v=SvjWDX2NqiM
[4] Poor performance - How do I use LlamCpp in python correctly ... https://github.com/ggerganov/llama.cpp/discussions/6322
[5] Poll: Why do you choose llama.cpp over vLLM or vice-versa? - Reddit https://www.reddit.com/r/LocalLLaMA/comments/1bub23v/poll_why_do_you_choose_llamacpp_over_vllm_or/

jperiodlangley commented 3 weeks ago

llama-cpp-python

jperiodlangley commented 3 weeks ago

Finally got llama-cpp-python to use my GPU after re-installing with:

CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir
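One way to confirm the GPU is actually being used: run the model with `n_gpu_layers=-1` and `verbose=True`, then look for the "offloaded N/M layers to GPU" line llama.cpp prints at startup. A hypothetical helper for scanning captured log output (the log-line wording is my recollection of llama.cpp's output and may vary by version):

```python
import re

def offloaded_layers(log_text):
    """Return (offloaded, total) layer counts from llama.cpp startup logs,
    or None if no offload line is present (i.e., CPU-only inference)."""
    m = re.search(r"offloaded (\d+)/(\d+) layers to GPU", log_text)
    return (int(m.group(1)), int(m.group(2))) if m else None

# usage: construct Llama(..., n_gpu_layers=-1, verbose=True), capture its
# stderr, and check that offloaded_layers(captured) is not None
```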


jperiodlangley commented 2 weeks ago

https://llama-cpp-python.readthedocs.io/en/stable/