
Amazon bedrock plugin for llm cli #785

Open irthomasthomas opened 1 month ago

irthomasthomas commented 1 month ago

Amazon bedrock plugin for llm cli

Description: A plugin for LLM adding support for Anthropic's Claude models on Amazon Bedrock.

Installation

Install this plugin in the same environment as LLM:

llm install llm-bedrock-anthropic
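
Once installed, you can check that the plugin has been picked up by listing installed plugins; llm-bedrock-anthropic should appear in the output:

llm plugins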

Configuration

You will need to configure AWS credentials and a region using the standard boto3 mechanisms: configuration files or environment variables.

For example, to use the region us-west-2 and AWS credentials under the personal profile, set the environment variables:

export AWS_DEFAULT_REGION=us-west-2
export AWS_PROFILE=personal
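
If you prefer a named profile over environment variables, one way to create the personal profile is with the AWS CLI (assuming it is installed); it will prompt for your access key, secret key, and default region:

aws configure --profile personal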

Usage

This plugin adds models called bedrock-claude and bedrock-claude-instant.

You can query them like this:

llm -m bedrock-claude-instant "Ten great names for a new space station"
llm -m bedrock-claude "Compare and contrast the leadership styles of Abraham Lincoln and Boris Johnson."

Options

Options are passed with -o. For example, max_tokens_to_sample limits the length of the completion:

llm -m bedrock-claude -o max_tokens_to_sample 20 "Sing me the alphabet"

Here is the alphabet song:

A B C D E F G
H I J

The response stops mid-verse because max_tokens_to_sample caps the completion at roughly 20 tokens.
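
If you plan to use one of these models regularly, you can make it the default so the -m flag can be omitted:

llm models default bedrock-claude
llm "Three facts about pelicans"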

URL: https://github.com/sblakey/llm-bedrock-anthropic

Suggested labels

irthomasthomas commented 1 month ago

Related content

361: gorilla-llm/gorilla-cli: LLMs for your CLI

### Details

Similarity score: 0.87

- [ ] [gorilla-llm/gorilla-cli: LLMs for your CLI](https://github.com/gorilla-llm/gorilla-cli)

Gorilla CLI

Gorilla CLI powers your command-line interactions with a user-centric tool. Simply state your objective, and Gorilla CLI will generate potential commands for execution. Gorilla today supports ~1500 APIs, including Kubernetes, AWS, GCP, Azure, GitHub, Conda, Curl, Sed, and many more. No more recalling intricate CLI arguments! 🦍

Developed by UC Berkeley as a research prototype, Gorilla-CLI prioritizes user control and confidentiality: commands are executed solely with your explicit approval. While we utilize queries and error logs (stderr) for model enhancement, we NEVER collect output data (stdout).

#### Suggested labels

{ "key": "llm-evaluation", "value": "Evaluating the performance and behavior of Large Language Models through human-written evaluation sets" }

{ "key": "llm-serving-optimisations", "value": "Tips, tricks and tools to speed up the inference of Large Language Models" }
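
As a rough sketch of the interaction model (the objective below is just an invented example), you state what you want and Gorilla CLI proposes candidate commands for you to approve before anything runs:

```bash
gorilla "list all files in this directory larger than 100MB"
# Gorilla CLI responds with one or more candidate commands (e.g. a find invocation)
# and only executes the one you explicitly confirm.
```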

396: astra-assistants-api: A backend implementation of the OpenAI beta Assistants API

### Details

Similarity score: 0.87

- [ ] [datastax/astra-assistants-api: A backend implementation of the OpenAI beta Assistants API](https://github.com/datastax/astra-assistants-api)

Astra Assistant API Service
=============================

A drop-in compatible service for the OpenAI beta Assistants API with support for persistent threads, files, assistants, messages, retrieval, function calling and more using AstraDB (DataStax's db as a service offering powered by Apache Cassandra and jvector). Compatible with existing OpenAI apps via the OpenAI SDKs by changing a single line of code.

Getting Started
---------------

1. **Create an Astra DB Vector database**
2. Replace the following code:

```python
client = OpenAI(
    api_key=OPENAI_API_KEY,
)
```

with:

```python
client = OpenAI(
    base_url="https://open-assistant-ai.astra.datastax.com/v1",
    api_key=OPENAI_API_KEY,
    default_headers={
        "astra-api-token": ASTRA_DB_APPLICATION_TOKEN,
    }
)
```

Or, if you have an existing astra db, you can pass your db_id in a second header:

```python
client = OpenAI(
    base_url="https://open-assistant-ai.astra.datastax.com/v1",
    api_key=OPENAI_API_KEY,
    default_headers={
        "astra-api-token": ASTRA_DB_APPLICATION_TOKEN,
        "astra-db-id": ASTRA_DB_ID
    }
)
```

3. **Create an assistant**

```python
assistant = client.beta.assistants.create(
    instructions="You are a personal math tutor. When asked a math question, write and run code to answer the question.",
    model="gpt-4-1106-preview",
    tools=[{"type": "retrieval"}]
)
```

By default, the service uses AstraDB as the database/vector store and OpenAI for embeddings and chat completion.

Third party LLM Support
-----------------------

We now support many third party models for both embeddings and completion thanks to litellm. Pass the api key of your service using `api-key` and `embedding-model` headers.

For AWS Bedrock, you can pass additional custom headers:

```python
client = OpenAI(
    base_url="https://open-assistant-ai.astra.datastax.com/v1",
    api_key="NONE",
    default_headers={
        "astra-api-token": ASTRA_DB_APPLICATION_TOKEN,
        "embedding-model": "amazon.titan-embed-text-v1",
        "LLM-PARAM-aws-access-key-id": BEDROCK_AWS_ACCESS_KEY_ID,
        "LLM-PARAM-aws-secret-access-key": BEDROCK_AWS_SECRET_ACCESS_KEY,
        "LLM-PARAM-aws-region-name": BEDROCK_AWS_REGION,
    }
)
```

and again, specify the custom model for the assistant.

```python
assistant = client.beta.assistants.create(
    name="Math Tutor",
    instructions="You are a personal math tutor. Answer questions briefly, in a sentence or less.",
    model="meta.llama2-13b-chat-v1",
)
```

Additional examples including third party LLMs (bedrock, cohere, perplexity, etc.) can be found under `examples`.

To run the examples using poetry:

1. Create a `.env` file in this directory with your secrets.
2. Run:

```shell
poetry install
poetry run python examples/completion/basic.py
poetry run python examples/retreival/basic.py
poetry run python examples/function-calling/basic.py
```

### Coverage

See our coverage report [here](your-coverage-report-link).

### Roadmap

- Support for other embedding models and LLMs
- Function calling
- Pluggable RAG strategies
- Streaming support

#### Suggested labels

{ "key": "llm-function-calling", "value": "Integration of function calling with Large Language Models (LLMs)" }

183: litellm: Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)

### Details

Similarity score: 0.87

- [ ] [BerriAI/litellm: Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)](https://github.com/BerriAI/litellm)

Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs).
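
As an illustrative sketch, litellm can also sit in front of a Bedrock Claude model as an OpenAI-format proxy; the model identifier, port, and proxy extra below are assumptions that may vary by litellm version, and AWS credentials are expected to be configured as for the plugin above:

```bash
pip install 'litellm[proxy]'
litellm --model bedrock/anthropic.claude-v2   # starts an OpenAI-compatible server locally

# call it with the standard OpenAI chat completions format
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "bedrock/anthropic.claude-v2", "messages": [{"role": "user", "content": "Hello"}]}'
```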

678: chroma/README.md at main · chroma-core/chroma

### Details

Similarity score: 0.86

- [ ] [chroma/README.md at main · chroma-core/chroma](https://github.com/chroma-core/chroma/blob/main/README.md?plain=1)

Chroma - the open-source embedding database.
The fastest way to build Python or JavaScript LLM apps with memory!

```bash
pip install chromadb # python client
# for javascript, npm install chromadb!
# for client-server mode, chroma run --path /chroma_db_path
```

The core API is only 4 functions (run our [💡 Google Colab](https://colab.research.google.com/drive/1QEzFyqnoFxq7LUGyP1vzR4iLt9PpCDXv?usp=sharing) or [Replit template](https://replit.com/@swyx/BasicChromaStarter?v=1)):

```python
import chromadb

# setup Chroma in-memory, for easy prototyping. Can add persistence easily!
client = chromadb.Client()

# Create collection. get_collection, get_or_create_collection, delete_collection also available!
collection = client.create_collection("all-my-documents")

# Add docs to the collection. Can also update and delete. Row-based API coming soon!
collection.add(
    documents=["This is document1", "This is document2"],  # we handle tokenization, embedding, and indexing automatically. You can skip that and add your own embeddings as well
    metadatas=[{"source": "notion"}, {"source": "google-docs"}],  # filter on these!
    ids=["doc1", "doc2"],  # unique for each doc
)

# Query/search 2 most similar results. You can also .get by id
results = collection.query(
    query_texts=["This is a query document"],
    n_results=2,
    # where={"metadata_field": "is_equal_to_this"},  # optional filter
    # where_document={"$contains":"search_string"}  # optional filter
)
```

## Features

- __Simple__: Fully-typed, fully-tested, fully-documented == happiness
- __Integrations__: [`🦜️🔗 LangChain`](https://blog.langchain.dev/langchain-chroma/) (python and js), [`🦙 LlamaIndex`](https://twitter.com/atroyn/status/1628557389762007040) and more soon
- __Dev, Test, Prod__: the same API that runs in your python notebook, scales to your cluster
- __Feature-rich__: Queries, filtering, density estimation and more
- __Free & Open Source__: Apache 2.0 Licensed

## Use case: ChatGPT for ______

For example, the `"Chat your data"` use case:

1. Add documents to your database. You can pass in your own embeddings, embedding function, or let Chroma embed them for you.
2. Query relevant documents with natural language.
3. Compose documents into the context window of an LLM like `GPT3` for additional summarization or analysis.

## Embeddings?

What are embeddings?

- [Read the guide from OpenAI](https://platform.openai.com/docs/guides/embeddings/what-are-embeddings)
- __Literal__: Embedding something turns it from image/text/audio into a list of numbers. 🖼️ or 📄 => `[1.2, 2.1, ....]`. This process makes documents "understandable" to a machine learning model.
- __By analogy__: An embedding represents the essence of a document. This enables documents and queries with the same essence to be "near" each other and therefore easy to find.
- __Technical__: An embedding is the latent-space position of a document at a layer of a deep neural network. For models trained specifically to embed data, this is the last layer.
- __A small example__: If you search your photos for "famous bridge in San Francisco". By embedding this query and comparing it to the embeddings of your photos and their metadata - it should return photos of the Golden Gate Bridge.

Embeddings databases (also known as **vector databases**) store embeddings and allow you to search by nearest neighbors rather than by substrings like a traditional database. By default, Chroma uses [Sentence Transformers](https://docs.trychroma.com/embeddings#sentence-transformers) to embed for you but you can also use OpenAI embeddings, Cohere (multilingual) embeddings, or your own.
[View on GitHub](https://github.com/chroma-core/chroma/blob/main/README.md?plain=1)

#### Suggested labels

62: Simonw's llm cli: Template usage.

### Details

Similarity score: 0.85

Here are the code blocks extracted from the readme file:

```bash
llm 'Summarize this: $input' --save summarize
```

```bash
llm --system 'Summarize this' --save summarize
```

```bash
llm --system 'Summarize this' --model gpt-4 --save summarize
```

```bash
llm --system 'Summarize this text in the voice of $voice' \
  --model gpt-4 -p voice GlaDOS --save summarize
```

```bash
curl -s https://example.com/ | llm -t summarize
```

```bash
curl -s https://llm.datasette.io/en/latest/ | \
  llm -t summarize -m gpt-3.5-turbo-16k
```

```bash
llm templates
```

```bash
llm templates edit summarize
```

```yaml
prompt: 'Summarize this: $input'
```

```yaml
prompt: >
    Summarize the following text.

    Insert frequent satirical steampunk-themed illustrative anecdotes.
    Really go wild with that.

    Text to summarize: $input
```

```bash
curl -s 'https://til.simonwillison.net/macos/imovie-slides-and-audio' | \
  strip-tags -m | llm -t steampunk -m 4
```

```yaml
system: Summarize this
```

```yaml
system: You speak like an excitable Victorian adventurer
prompt: 'Summarize this: $input'
```

```yaml
prompt: |
    Suggest a recipe using ingredients: $ingredients

    It should be based on cuisine from this country: $country
```

```bash
llm -t recipe -p ingredients 'sausages, milk' -p country Germany
```

```yaml
system: Summarize this text in the voice of $voice
```

```bash
curl -s 'https://til.simonwillison.net/macos/imovie-slides-and-audio' | \
  strip-tags -m | llm -t summarize -p voice GlaDOS
```

```yaml
system: Summarize this text in the voice of $voice
defaults:
    voice: GlaDOS
```

```yaml
model: gpt-4
system: roast the user at every possible opportunity, be succinct
```

```bash
llm -t roast 'How are you today?'
```

328: llama-cpp-python: OpenAI compatible web server - Local Copilot replacement - Function Calling support - Vision API support

### Details

Similarity score: 0.85

> **Python Bindings for llama.cpp**
>
> Simple Python bindings for @ggerganov's llama.cpp library. This package provides:
>
> - Low-level access to C API via ctypes interface.
> - High-level Python API for text completion
> - OpenAI-like API
> - LangChain compatibility
> - OpenAI compatible web server
> - Local Copilot replacement
> - Function Calling support
> - Vision API support
> - Multiple Models
>
> Documentation is available at [https://llama-cpp-python.readthedocs.io/en/latest](https://llama-cpp-python.readthedocs.io/en/latest).
>
> **Installation**
>
> llama-cpp-python can be installed directly from PyPI as a source distribution by running:
>
> ```
> pip install llama-cpp-python
> ```
>
> This will build llama.cpp from source using cmake and your system's c compiler (required) and install the library alongside this python package.
>
> If you run into issues during installation add the `--verbose` flag to the `pip install` command to see the full cmake build log.
>
> **Installation with Specific Hardware Acceleration (BLAS, CUDA, Metal, etc)**
>
> The default `pip install` behaviour is to build llama.cpp for CPU only on Linux and Windows and use Metal on MacOS.
>
> llama.cpp supports a number of hardware acceleration backends, including OpenBLAS, cuBLAS, CLBlast, HIPBLAS, and Metal. See the llama.cpp README for a full list of supported backends.
>
> All of these backends are supported by llama-cpp-python and can be enabled by setting the `CMAKE_ARGS` environment variable before installing.
>
> On Linux and Mac you set the `CMAKE_ARGS` like this:
>
> ```
> CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS" pip install llama-cpp-python
> ```
>
> On Windows you can set the `CMAKE_ARGS` like this:
>
> ```
> $env:CMAKE_ARGS = "-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS"
> pip install llama-cpp-python
> ```
>
> **OpenBLAS**
>
> To install with OpenBLAS, set the `LLAMA_BLAS` and `LLAMA_BLAS_VENDOR` CMake arguments before installing:
>
> ```
> CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS" pip install llama-cpp-python
> ```
>
> **cuBLAS**
>
> To install with cuBLAS, set the `LLAMA_CUBLAS=on` CMake argument before installing:
>
> ```
> CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
> ```
>
> **Metal**
>
> To install with Metal (MPS), set the `LLAMA_METAL=on` CMake argument before installing:
>
> ```
> CMAKE_ARGS="-DLLAMA_METAL=on" pip install llama-cpp-python
> ```
>
> #### Suggested labels
>
> { "key": "llm-python-bindings", "value": "Python bindings for llama.cpp library" }