HSV-AI / presentations

This repository is used to manage the presentations given at Huntsville AI meetups. It provides a collection of Issues, Cards, and Files to plan and create the content needed for a presentation.

7/10/2024 - Private RAG Deployment & Cost #100

Closed jperiodlangley closed 3 months ago

jperiodlangley commented 6 months ago

Complete the following items to get a presentation ready for Huntsville AI

Description

We have been covering a series this year on how a RAG works and how to build one. Now it's time to look at what it will cost to deploy it!

BLUF: If you have constraints that require you to host your own LLM instead of using services such as OpenAI, Anthropic, or Google, this is going to get expensive.

The baseline that we will use to derive requirements and prices is a subset of the NASA documents that we have been using along the way. To start, I have pulled 500 of these documents, which total 109,516 paragraphs and 4,254,032 words.

We will start with a basic use case using OpenAI GPT-4o and a Weaviate-hosted vector store. From there we will move to a self-hosted solution for all of the components and add up the associated monthly cost.
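As a rough starting point for the sizing discussion, the corpus figures above can be turned into per-document and per-paragraph averages and a token estimate. This is only a sketch: the `Corpus` class and helper names are hypothetical, and both the tokens-per-word ratio and the price per million tokens are placeholder assumptions to be replaced with measured tokenizer counts and current provider pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Corpus:
    """Size of a document set (hypothetical helper for illustration)."""
    documents: int
    paragraphs: int
    words: int

    def paragraphs_per_document(self) -> float:
        return self.paragraphs / self.documents

    def words_per_paragraph(self) -> float:
        return self.words / self.paragraphs


def estimate_tokens(words: int, tokens_per_word: float) -> int:
    """Convert a word count to an approximate token count.

    tokens_per_word is an ASSUMPTION; measure it with the real tokenizer.
    """
    return round(words * tokens_per_word)


def one_time_cost(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of processing `tokens` once at a given per-million-token price."""
    return tokens / 1_000_000 * usd_per_million_tokens


# Figures taken from the description above.
NASA_SUBSET = Corpus(documents=500, paragraphs=109_516, words=4_254_032)

if __name__ == "__main__":
    # Both values below are placeholders, not real measurements or prices.
    assumed_ratio = 1.3
    assumed_price = 1.00

    tokens = estimate_tokens(NASA_SUBSET.words, assumed_ratio)
    print(f"paragraphs per document: {NASA_SUBSET.paragraphs_per_document():.1f}")
    print(f"words per paragraph:     {NASA_SUBSET.words_per_paragraph():.1f}")
    print(f"estimated tokens:        {tokens:,}")
    print(f"one-time cost (USD):     {one_time_cost(tokens, assumed_price):.2f}")
```

Keeping the ratio and the price as explicit parameters makes it easy to rerun the same arithmetic for the hosted and self-hosted cases once real numbers are available.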

Tasks

Adding material to the presentations repository

Add the file to present (Jupyter Notebooks or Markdown-formatted files are preferred) to the folder structure. For multiple files, create a directory following the naming convention and add the files to it.

Naming convention

We start filenames with the date (year, month, day) so that the files sort chronologically even when listed alphabetically.

YYMMDD_Session_Description.extension
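A minimal sketch of how a name following this pattern could be generated. The `session_filename` helper is hypothetical and not part of the repository, and replacing spaces and punctuation with underscores is an assumption about how descriptions are normalized.

```python
import re
from datetime import date


def session_filename(session_date: date, description: str, extension: str) -> str:
    """Build a YYMMDD_Session_Description.extension style filename."""
    prefix = session_date.strftime("%y%m%d")
    # Assumption: collapse anything that is not a letter or digit into "_".
    slug = re.sub(r"[^A-Za-z0-9]+", "_", description).strip("_")
    return f"{prefix}_{slug}.{extension.lstrip('.')}"


if __name__ == "__main__":
    names = [
        session_filename(date(2024, 7, 10), "Private RAG Deployment & Cost", "ipynb"),
        session_filename(date(2024, 1, 3), "Example Session", "md"),
    ]
    # Alphabetical order matches date order because of the YYMMDD prefix.
    for name in sorted(names):
        print(name)
```

The two-digit year prefix keeps the alphabetical and chronological orders aligned for sessions within the same century.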

jperiodlangley commented 4 months ago

Private RAG