Closed · davies-w closed this 9 months ago
Hi Winton!
For this sample, cost mainly comes from two factors: (1) generating embeddings and (2) generating LLM responses.
For 1, this sample uses the Titan Embeddings model through Amazon Bedrock to generate embeddings at a cost of $0.0001 per 1,000 input tokens (roughly 200 words). You can look at a sample document of yours to make an estimate.
For 2, this sample uses Anthropic Claude 2 to generate LLM responses, priced at $0.01102 per 1,000 input tokens and $0.03268 per 1,000 output tokens. So in this case, the cost depends on the length of the user question (plus retrieved context) and the length of the response generated by the LLM. You can think of a typical set of questions and answers that you expect in your application and use these for a rough estimate.
You can find the on-demand model pricing for all models in Bedrock here: https://aws.amazon.com/bedrock/pricing/ (discounted pricing is available via Provisioned Throughput)
There are other infrastructure costs involved as well, but these should be the two main cost factors.
Anthropic just announced an improved Claude 2.1, which also comes with reduced pricing. I expect this to be available through Amazon Bedrock soon as well.
Also, take a look at this more comprehensive sample from us that offers more options in terms of model choice, vector databases, and more: https://github.com/aws-samples/aws-genai-llm-chatbot
Let me know if this answers your question.
Hi,
We're just prototyping something right now, and want to move our RAG approach into the cloud. One motivation is cost, the other is speed, with OpenAI's RAG assistants forcing the use of GPT-4 rather than 3.5, and the latency being astoundingly bad.
Could you give a rough estimate of how much this would cost per query? We're still paying OpenAI for their tokens (at 3.5 rates, of course).
Also, could you give a sense of the latency for retrieving 10 or so paragraphs from 100 PDFs?
Thanks in advance, W