aws-samples / aws-genai-llm-chatbot

A modular and comprehensive solution to deploy a Multi-LLM and Multi-RAG powered chatbot (Amazon Bedrock, Anthropic, HuggingFace, OpenAI, Meta, AI21, Cohere, Mistral) using AWS CDK on AWS
https://aws-samples.github.io/aws-genai-llm-chatbot/
MIT No Attribution

feat: Disable Sagemaker endpoint (or cross-encoder per workspace) #588

Closed: charles-marion closed this pull request 1 month ago

charles-marion commented 1 month ago

Issue #, if available: https://github.com/aws-samples/aws-genai-llm-chatbot/issues/222

Description of changes: This change is a follow-up to https://github.com/aws-samples/aws-genai-llm-chatbot/pull/286 by @azaylamba

Currently, cross-encoder models are used to re-rank the search results, but the available models must be hosted on SageMaker, which increases cost significantly. An option to disable cross-encoder models (globally or per workspace) would be helpful while exploring the chatbot, so that SageMaker costs can be avoided.
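To illustrate the idea only (this is not the code in this PR), a CDK stack could gate the SageMaker cross-encoder endpoint behind a configuration flag; the `enableCrossEncoder` flag and `CrossEncoderEndpoint` names below are hypothetical placeholders, and the real project reads its configuration from its own config file whose keys may differ:

```typescript
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';

// Hypothetical shape of the deployment configuration.
interface RagConfig {
  readonly enableCrossEncoder: boolean; // placeholder flag name
}

class RagStack extends cdk.Stack {
  constructor(scope: Construct, id: string, config: RagConfig, props?: cdk.StackProps) {
    super(scope, id, props);

    if (config.enableCrossEncoder) {
      // Only provision the SageMaker endpoint (the main cost driver)
      // when cross-encoder re-ranking is actually requested.
      // `CrossEncoderEndpoint` is a placeholder construct name:
      // new CrossEncoderEndpoint(this, 'CrossEncoder', { ... });
    }
    // When the flag is off, retrieval results keep their original
    // vector/keyword scores and no re-ranking step is performed.
  }
}
```

With a flag like this, deployments that skip cross-encoder re-ranking simply never create the SageMaker endpoint, so no per-hour endpoint charges accrue.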

Changes:

Thank you @azaylamba for your help with this change.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.