EleutherAI / lm-evaluation-harness

A framework for few-shot evaluation of language models.
https://www.eleuther.ai
MIT License

Bedrock Model API #1486

Open austinmw opened 6 months ago

austinmw commented 6 months ago

Hi, does this library support the Amazon Bedrock API (for example, using Claude through Bedrock)?

haileyschoelkopf commented 5 months ago

Hi, we don't currently support the Amazon Bedrock API, though we'd gladly accept a PR adding an LM class for this.

Are there any sample code snippets for invoking the Bedrock API? In an ideal world, if Bedrock exposed an OpenAI-compatible API, we could run it through the existing local-completions model type and minimize maintenance overhead, but this may not be possible.
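To make the compatibility question concrete, here is a rough sketch of the translation an adapter would need if Bedrock does not accept OpenAI-style requests directly. It assumes the Anthropic Claude text-completion body fields (prompt, max_tokens_to_sample, temperature, stop_sequences); the function name openai_to_bedrock_claude and the fallback defaults are hypothetical, not part of lm-evaluation-harness or boto3.

```python
import json


def openai_to_bedrock_claude(request: dict) -> str:
    """Map an OpenAI-style completion request onto a Claude text-completion body.

    Hypothetical helper for illustration; the field mapping is an assumption.
    """
    # OpenAI allows "stop" to be a single string or a list of strings.
    stop = request.get("stop") or []
    if isinstance(stop, str):
        stop = [stop]

    body = {
        # Claude text completions expect Human/Assistant turn markers.
        "prompt": "\n\nHuman: " + request["prompt"] + "\n\nAssistant:",
        # Assumed fallbacks when the caller omits these fields.
        "max_tokens_to_sample": request.get("max_tokens", 16),
        "temperature": request.get("temperature", 1.0),
        # Always stop at the next human turn, plus any caller-supplied stops.
        "stop_sequences": ["\n\nHuman:"] + list(stop),
    }
    return json.dumps(body)


example = openai_to_bedrock_claude(
    {"prompt": "Say hi.", "max_tokens": 8, "stop": "\n"}
)
```

The returned string is what would be passed as the request body; log-probability requests, which many evaluation tasks rely on, are not covered by this mapping.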

austinmw commented 5 months ago

Hi @haileyschoelkopf, thanks for your response. The Bedrock API can be called through the bedrock-runtime client using the boto3 SDK for Python. Here's an example:

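import json

import boto3

# Runtime (inference) client; region and credentials come from the AWS configuration.
bedrock_runtime_client = boto3.client(service_name="bedrock-runtime")

# Placeholder prompt for illustration.
prompt = "Hello, Claude"

# Claude text completions expect the prompt to begin with a "\n\nHuman:" turn.
enclosed_prompt = "\n\nHuman: " + prompt + "\n\nAssistant:"

body = {
    "prompt": enclosed_prompt,
    "max_tokens_to_sample": 200,
    "temperature": 0.5,
    "stop_sequences": ["\n\nHuman:"],
}

# The request body must be a JSON string.
response = bedrock_runtime_client.invoke_model(
    modelId="anthropic.claude-v2:1", body=json.dumps(body)
)

# The response body is a stream; read and decode it to get the completion text.
response_body = json.loads(response["body"].read())
completion = response_body["completion"]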

Docs: https://docs.aws.amazon.com/code-library/latest/ug/python_3_bedrock-runtime_code_examples.html
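For reuse, the same call can be wrapped in a small function that takes the client as an argument, which also makes it testable without AWS credentials. This is a minimal sketch, not an lm-evaluation-harness LM class; the name bedrock_complete and its defaults are hypothetical, and it assumes the same Claude text-completion request and response fields as the example above.

```python
import json


def bedrock_complete(client, prompt, model_id="anthropic.claude-v2:1",
                     max_tokens=200, temperature=0.5):
    """Send one prompt through client.invoke_model and return the completion text.

    `client` is any object exposing invoke_model(modelId=..., body=...),
    for example boto3.client(service_name="bedrock-runtime").
    """
    body = {
        # Wrap the raw prompt in the Human/Assistant turn markers.
        "prompt": "\n\nHuman: " + prompt + "\n\nAssistant:",
        "max_tokens_to_sample": max_tokens,
        "temperature": temperature,
        "stop_sequences": ["\n\nHuman:"],
    }
    response = client.invoke_model(modelId=model_id, body=json.dumps(body))
    # The response body is a stream holding a JSON document.
    return json.loads(response["body"].read())["completion"]


# Usage with a real client (requires boto3 and AWS credentials):
#   import boto3
#   text = bedrock_complete(boto3.client(service_name="bedrock-runtime"), "Hello")
```

Passing the client in, rather than creating it inside the function, lets a stub object stand in for Bedrock during tests.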

haileyschoelkopf commented 5 months ago

Thanks @austinmw! Would you have any bandwidth or interest in contributing this?