aws-samples / serverless-pdf-chat

LLM-powered document chat using Amazon Bedrock and AWS Serverless
https://aws.amazon.com/blogs/compute/building-a-serverless-document-chat-with-aws-lambda-and-amazon-bedrock/
MIT No Attribution
228 stars 206 forks

Output / response cropped #30

Closed bernmaetz closed 9 months ago

bernmaetz commented 9 months ago

Hi, large outputs get cropped, sometimes after 7xx characters, sometimes after 1xxx characters; there seems to be no hardcoded value. How can I control and extend the length of the output?

pbv0 commented 9 months ago

Hi, Anthropic Claude models in Bedrock have a default value of 200 for max_tokens_to_sample, so the output limit you are seeing is expected. You can adapt the API calls to Amazon Bedrock in the generate_response function and set max_tokens_to_sample to up to 4096 tokens.

See here for the docs on Anthropic Claude in Bedrock: https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-claude.html

And here for an example on how to change the default parameters in Python: https://docs.aws.amazon.com/bedrock/latest/userguide/api-methods-run-inference.html
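For reference, here is a minimal sketch of what a raw Bedrock request with a raised limit looks like (the prompt text is a placeholder, and the commented-out invocation assumes boto3 is installed with AWS credentials and Bedrock model access configured):

```python
import json

# Claude request body: max_tokens_to_sample defaults to 200 when omitted,
# which is what causes the cropped responses.
body = json.dumps(
    {
        "prompt": "\n\nHuman: Summarize the uploaded document.\n\nAssistant:",
        "max_tokens_to_sample": 4096,  # raise the output limit (4096 is the max)
        "temperature": 0.5,
    }
)

# Sending the request (requires AWS credentials and Bedrock access):
# import boto3
# bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")
# response = bedrock_runtime.invoke_model(modelId="anthropic.claude-v2", body=body)
# print(json.loads(response["body"].read())["completion"])
```

In this repo the call goes through LangChain rather than boto3 directly, so the same parameter is passed via the LLM wrapper instead of the request body.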

bernmaetz commented 9 months ago

Hi, thanks for the quick answer

Is this the right place to change? I changed

res = qa({"question": human_input})

to

res = qa({"question": human_input, "max_tokens_to_sample": 4096})

Output is still cropped

pbv0 commented 9 months ago

The chain input dict won't forward that parameter to the model; it has to be set on the Bedrock LLM itself via model_kwargs. Try changing this:

embeddings, llm = BedrockEmbeddings(
        model_id="amazon.titan-embed-text-v1",
        client=bedrock_runtime,
        region_name="us-east-1",
    ), Bedrock(
        model_id="anthropic.claude-v2", client=bedrock_runtime, region_name="us-east-1"
    )

to this:

embeddings, llm = BedrockEmbeddings(
        model_id="amazon.titan-embed-text-v1",
        client=bedrock_runtime,
        region_name="us-east-1",
    ), Bedrock(
        model_id="anthropic.claude-v2",
        client=bedrock_runtime,
        region_name="us-east-1",
        model_kwargs={"max_tokens_to_sample": 4096},  # new
    )

bernmaetz commented 9 months ago

Hi, this works now - thanks for your help