deepracer-genai-workshop

Community version of AWS GenAI Workshop updated to support DeepRacer for Cloud / DeepRacer on the Spot and AWS console models.

This workshop runs in the us-east-1 region only.

Costs

This workshop should only cost $2-5 to run, as long as you follow the cleanup steps at the end. Note - one of the LLMs used is Claude Instant, offered through the AWS Marketplace by Anthropic.

Because Claude Instant is an AWS Marketplace product, its usage may not be covered by AWS credits; if you are using AWS credits, check whether they cover Marketplace charges. In testing, this element cost only around $0.10 across the whole workshop when not covered by credits.

Pre-requisites

Configure AWS Account, Roles and S3 via a CloudFormation script

To run this workshop you should have the AWS CLI installed and be authenticated to AWS, or alternatively use CloudShell in the AWS console.
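
As a rough sketch of what deploying the stack programmatically could look like (the template file name and stack name below are hypothetical placeholders, so substitute the ones provided in this repository):

```python
import boto3

# Deploy the workshop's CloudFormation stack from the CLI/CloudShell environment.
cfn = boto3.client("cloudformation", region_name="us-east-1")

with open("deepracer-genai-workshop.yaml") as f:  # hypothetical file name
    template_body = f.read()

cfn.create_stack(
    StackName="deepracer-genai-workshop",  # hypothetical stack name
    TemplateBody=template_body,
    Capabilities=["CAPABILITY_NAMED_IAM"],  # required because the stack creates IAM roles
)

# Block until stack creation finishes
cfn.get_waiter("stack_create_complete").wait(StackName="deepracer-genai-workshop")
```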

Request Service Quota

To run the Stable Diffusion lab, a Service Quota increase in us-east-1 for 'ml.g5.2xlarge for endpoint usage' to 1 is required.

Navigate to Service Quotas and check the Amazon SageMaker quotas for 'ml.g5.2xlarge for endpoint usage'.

If the quota is set to zero, click on the quota and request that it be raised to 1.

You may have to wait 24 hours or so for AWS to apply the quota increase.
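
If you prefer to script the request rather than click through the console, a minimal sketch using boto3's Service Quotas API might look like the following; the quota is matched by the display name quoted above:

```python
import boto3

# Find the SageMaker quota by its display name and request an increase to 1.
client = boto3.client("service-quotas", region_name="us-east-1")
quota_name = "ml.g5.2xlarge for endpoint usage"

paginator = client.get_paginator("list_service_quotas")
for page in paginator.paginate(ServiceCode="sagemaker"):
    for quota in page["Quotas"]:
        if quota["QuotaName"] == quota_name:
            client.request_service_quota_increase(
                ServiceCode="sagemaker",
                QuotaCode=quota["QuotaCode"],
                DesiredValue=1.0,
            )
            print(f"Requested increase for {quota['QuotaCode']}")
```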

Configure Bedrock LLM Access

Set up SageMaker

Lab 1 - AWS DeepRacer model evaluator using Agents

Introduction

This hands-on workshop demonstrates how to build an intelligent conversational agent using Amazon Bedrock with Anthropic Claude, a large language model (LLM), combined with the LangChain library. The agent is designed to provide insights and recommendations about AWS DeepRacer models and training.

The workshop shows how to:

Create custom LangChain tools to allow the agent to interface with the AWS DeepRacer service API. This includes listing available models, downloading model artifacts, and extracting model metadata like the training data and reward function.

Initialize a ReAct agent in LangChain and make the custom tools available to it. The agent can reason about which tools to invoke based on the user's questions.

Use prompting techniques like few-shot learning to improve the agent's reasoning capabilities with just a few examples (see the sketch after this list).

Handle errors gracefully if the agent's responses don't match the expected format.

Leverage the custom tools to enable the agent to provide insights about an AWS DeepRacer model's training data, hyperparameters, reward function and more.
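
As a rough illustration of the few-shot idea (the lab's actual examples may differ), one or two worked traces in the ReAct format can be prepended to the agent's prompt so the LLM imitates the same structure:

```python
# Hypothetical few-shot examples written in the ReAct trace format; they are
# prepended to the agent's prompt so the model imitates the same structure.
FEW_SHOT_EXAMPLES = """
Question: Which of my models trained the longest?
Thought: I should list the models and compare their training durations.
Action: list_deepracer_models
Action Input: ""
Observation: ModelA (60 minutes), ModelB (120 minutes)
Thought: ModelB has the longest training duration.
Final Answer: ModelB, which trained for 120 minutes.
"""
```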

By the end of the hands-on workshop, attendees will be able to build conversational agents using LLMs that can integrate with AWS services via custom interfaces. The key takeaways are extending an agent's capabilities using tools, architecting a modular agent, and applying prompting techniques to improve reasoning.

Architecture

In this lab you will build the following solution, using an Amazon SageMaker Studio notebook.

(architecture diagram: model_evaluator_architecture)

LangChain

LangChain is a framework for building context-aware applications powered by language models, enabling reasoning and response generation based on provided context and instructions. It provides modular components, off-the-shelf chains, and a developer platform for simplified application development, testing, and deployment.

Agents

Agents are AI systems built around large language models (LLMs) as their core engine to enable capabilities beyond just text generation. Agents combine the natural language strengths of LLMs with additional components like planning, memory, and tool use. Planning allows agents to break down complex goals into manageable subtasks. Memory provides short-term, in-context learning and long-term knowledge storage for fast retrieval. Tool use enables agents to gather information and take actions by calling APIs, leveraging search engines, executing code, and more.

Carefully engineered prompts shape agent behavior by encoding personas, instructions, permissions, and context. This allows developers to customize agents for diverse applications like conversational assistants, workflow automation, simulations, and scientific discovery. Key benefits of LLM agents include natural language understanding, reasoning, and self-directed task completion. However, challenges remain around limited context size, unreliable natural language interfaces, and difficulties with long-term planning.

Overall, agents represent an exciting advancement in building AI systems that can collaborate with humans in natural language. Leveraging the strengths of LLMs, agents exhibit reasoning, learning, and autonomous capabilities.

Tools

LangChain provides a powerful framework for building conversational agents using large language models (LLMs). One of the key capabilities it enables is the use of custom tools that expand what the LLM can do. Tools allow the LLM agent to interface with external functions, services, and other machine learning models. This massively increases the range of possible capabilities for the agent.

Tools are Python classes that take in text input, perform processing, and return text output. They act as functions the LLM can call. The agent decides which tool to use based on the tool's description. Descriptions are written in natural language so the LLM can understand when a tool is needed.

Tools make it possible to have the LLM leverage other expert models tuned for specific tasks. The LLM acts as the controller, delegating to tools as needed. This is similar to how humans leverage tools and experts to expand our own capabilities.

The key benefits of tools are enabling abilities the LLM does not inherently have and integrating external data sources and functions. The main challenges are properly describing tools so the LLM uses them correctly and managing which tools are available. Overall, custom tools are a powerful way to create extremely versatile LLM agents.
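
A minimal sketch of such a tool follows; the function body is stubbed out for illustration, where the lab's real tool would call the AWS DeepRacer service API:

```python
from langchain.tools import Tool

def list_deepracer_models(_: str) -> str:
    # In the lab this would call the AWS DeepRacer service API; the return
    # value here is hard-coded purely for illustration.
    return "ModelA, ModelB, ModelC"

list_models_tool = Tool(
    name="list_deepracer_models",
    func=list_deepracer_models,
    # The description is what the LLM reads to decide when to use the tool.
    description="Lists the AWS DeepRacer models available in this account. "
                "The input is ignored.",
)
```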

Reasoning and Acting (ReAct)

The agent used in this lab uses a technique called ReAct. ReAct (Reasoning and Acting) is a new paradigm that combines advances in reasoning and acting capabilities of language models to enable them to solve complex language reasoning and decision making tasks. With ReAct, language models can generate reasoning traces to create, maintain, and adjust high-level plans as well as take actions to incorporate additional information from external sources like APIs and knowledge bases.

The key benefit of ReAct is the synergy between reasoning and acting. Reasoning allows the model to induce and update plans while actions enable gathering additional information to support reasoning. This helps address issues like hallucination and error cascading in reasoning-only approaches. ReAct has been shown to achieve superior performance on tasks like multi-hop question answering, fact checking, and interactive decision making compared to reasoning-only and acting-only baselines.

However, ReAct does have some challenges. Non-informative actions can derail reasoning, so retrieving truly informative knowledge is critical. There is also a need for large scale human annotation or interventions to correct reasoning hallucinations and errors. Overall, ReAct demonstrates the promise of combining reasoning and acting in language models for robust and human-like task solving.
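Using the classic LangChain API, initializing a ReAct-style agent on a Bedrock Claude model might look like this sketch; the model ID and tool list are assumptions, not necessarily the lab's exact values:

```python
import boto3
from langchain.llms import Bedrock
from langchain.agents import initialize_agent, AgentType

# Claude Instant on Amazon Bedrock as the agent's reasoning engine.
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")
llm = Bedrock(client=bedrock_runtime, model_id="anthropic.claude-instant-v1")

# ZERO_SHOT_REACT_DESCRIPTION chooses tools from their natural-language
# descriptions; handle_parsing_errors recovers gracefully when the model's
# output does not match the expected Thought/Action format.
agent = initialize_agent(
    tools=[list_models_tool],  # the custom tool sketched earlier
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    handle_parsing_errors=True,
    verbose=True,
)

agent.run("Which DeepRacer models do I have?")
```
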

Getting started with the lab

Lab 2 - Modify AWS DeepRacer track images using Stable Diffusion and analyze learning patterns

Introduction

In this workshop, you will learn how to leverage generative AI models like Stable Diffusion from stability.ai to enhance simulated training data for reinforcement learning. We will cover deploying Stable Diffusion models on Amazon SageMaker, using the models to modify simulated AWS DeepRacer track images and add real-world elements, analyzing how improvements to simulated data impact model predictions, and prompt engineering for controlled image generation.

You will gain hands-on experience packaging multiple Stable Diffusion models together, deploying them to an Amazon SageMaker endpoint, querying the endpoint to generate enhanced images, and visualizing how a pre-trained AWS DeepRacer model responds to the improved simulated data.

(figure: deepracer_track_source_to_heat_map)

We will deploy multiple variations of Stable Diffusion on a single Amazon SageMaker Multi-Model GPU Endpoint (MME GPU) powered by NVIDIA Triton Inference Server.

This workshop uses references from the SageMaker Examples notebook and the DeepRacer Log Analysis notebook.

The SageMaker Multi-Model Endpoint used requires a minimum of an ml.g5.2xlarge instance to host the Stable Diffusion models required for this lab.
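
At inference time, a request is routed to one of the packaged models by name via the TargetModel parameter. A sketch follows; the endpoint name, model artifact name, and payload schema are assumptions for illustration (the real payload depends on the Triton model configuration):

```python
import json
import boto3

runtime = boto3.client("sagemaker-runtime", region_name="us-east-1")

# Illustrative request body only; the actual schema is set by the Triton config.
payload = json.dumps({
    "prompt": "add trees and grass around the track",
    "image": "<base64-encoded source track image>",
})

response = runtime.invoke_endpoint(
    EndpointName="sd-mme-gpu-endpoint",  # hypothetical endpoint name
    TargetModel="sd_inpaint.tar.gz",     # selects one model artifact on the MME
    ContentType="application/json",
    Body=payload,
)
result = response["Body"].read()
```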

Architecture

(architecture diagram: lab2_arch.png)

Deep Learning Containers

AWS Deep Learning Containers are a set of Docker images for training and serving models in TensorFlow, TensorFlow 2, PyTorch, and Apache MXNet (Incubating). Deep Learning Containers provide optimized environments with TensorFlow and MXNet, NVIDIA CUDA (for GPU instances), and Intel MKL (for CPU instances) libraries and are available in the Amazon Elastic Container Registry (Amazon ECR). Amazon SageMaker enables customers to deploy a model using custom code with NVIDIA Triton Inference Server. This functionality is available through the development of Triton Inference Server Containers.
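
To make the MME GPU pattern concrete, here is a sketch of registering the packaged models behind one endpoint with boto3; the image URI, S3 prefix, role ARN, and resource names are all placeholders:

```python
import boto3

sm = boto3.client("sagemaker", region_name="us-east-1")

sm.create_model(
    ModelName="sd-mme-gpu",
    ExecutionRoleArn="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder
    PrimaryContainer={
        "Image": "<triton-inference-server-image-uri-from-ecr>",      # placeholder DLC URI
        "ModelDataUrl": "s3://my-bucket/sd-models/",  # prefix holding the *.tar.gz models
        "Mode": "MultiModel",  # one endpoint, many models loaded on demand
    },
)

sm.create_endpoint_config(
    EndpointConfigName="sd-mme-gpu-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "sd-mme-gpu",
        "InstanceType": "ml.g5.2xlarge",  # the minimum instance type noted above
        "InitialInstanceCount": 1,
    }],
)

sm.create_endpoint(
    EndpointName="sd-mme-gpu-endpoint",
    EndpointConfigName="sd-mme-gpu-config",
)
```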

Image to image diffusion models

In machine learning, diffusion models are a class of generative models that simulate the data generation process. They transform a simple starting distribution into a desired complex data distribution. Some of the Stable Diffusion models, like sd_depth, sd_upscale and sd_inpaint, can be applied to image-to-image generation by passing a text prompt and a source image. These prompts let us precisely condition the modified output images we want to create.
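
Outside of this lab's Triton deployment, the same image-to-image conditioning idea can be sketched locally with the Hugging Face diffusers library; the model ID and parameters are illustrative, not the lab's exact setup:

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Condition the output on both a text prompt and a source track image.
track = Image.open("deepracer_track.png").convert("RGB").resize((768, 512))
result = pipe(
    prompt="a race track surrounded by grass and trees, photorealistic",
    image=track,
    strength=0.6,        # how far to move away from the source image
    guidance_scale=7.5,  # how strongly to follow the prompt
).images[0]
result.save("modified_track.png")
```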

Getting started with the lab

Summary

This hands-on workshop provided you with the opportunity to expand your skills in conversational AI and simulated data enhancement using Amazon's cloud services.

In the first lab, you built a custom conversational agent powered by Anthropic's Claude and integrated it with AWS DeepRacer using the LangChain library.

Important - Clean Up

To avoid unnecessary costs if you are using your own account, we recommend running the following clean-up procedure.
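
As a sketch of the tear-down, reusing the illustrative resource names from the examples above (substitute the names you actually used):

```python
import boto3

# Delete the billable SageMaker resources from Lab 2...
sm = boto3.client("sagemaker", region_name="us-east-1")
sm.delete_endpoint(EndpointName="sd-mme-gpu-endpoint")
sm.delete_endpoint_config(EndpointConfigName="sd-mme-gpu-config")
sm.delete_model(ModelName="sd-mme-gpu")

# ...and remove the workshop's CloudFormation stack.
cfn = boto3.client("cloudformation", region_name="us-east-1")
cfn.delete_stack(StackName="deepracer-genai-workshop")
```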