Closed shenshaoyong closed 8 months ago
🤖
Hello,
Thank you for reaching out. It seems like you're trying to configure an Amazon SageMaker endpoint using the LangChain framework. Here's how you can do it:
1. Install the necessary packages, `langchain` and `boto3`, using pip.

```python
!pip3 install langchain boto3
```
2. Import the necessary classes and functions.

```python
import json
from typing import Dict

from langchain import SagemakerEndpoint
from langchain.llms.sagemaker_endpoint import LLMContentHandler
from langchain.chains.question_answering import load_qa_chain
```
3. Define the `ContentHandler` class, which handles the transformation of input and output for the language model.

```python
class ContentHandler(LLMContentHandler):
    content_type = "application/json"
    accepts = "application/json"

    def transform_input(self, prompt: str, model_kwargs: Dict) -> bytes:
        # Use the string key "inputs" (not the prompt value itself as a key)
        # to match what the endpoint expects.
        input_str = json.dumps({"inputs": prompt, **model_kwargs})
        return input_str.encode("utf-8")

    def transform_output(self, output: bytes) -> str:
        response_json = json.loads(output.read().decode("utf-8"))
        return response_json[0]["generated_text"]
```
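Outside of LangChain, the round trip these two methods perform can be sketched with the standard library alone. This is a sketch under the assumption that the endpoint hosts a Llama-2-style JumpStart model that accepts `{"inputs": ...}` and returns `[{"generated_text": ...}]`; `io.BytesIO` stands in for the real streaming response body.

```python
import io
import json

def transform_input(prompt, model_kwargs):
    # Serialize the prompt plus any extra model arguments into a JSON body.
    return json.dumps({"inputs": prompt, **model_kwargs}).encode("utf-8")

def transform_output(output):
    # `output` mimics the streaming body returned by invoke_endpoint.
    response_json = json.loads(output.read().decode("utf-8"))
    return response_json[0]["generated_text"]

body = transform_input("what is the meaning of life?", {"temperature": 0.5})
print(body)  # b'{"inputs": "what is the meaning of life?", "temperature": 0.5}'

# Simulate an endpoint response with an in-memory byte stream.
fake_response = io.BytesIO(json.dumps([{"generated_text": "42"}]).encode("utf-8"))
print(transform_output(fake_response))  # 42
```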
4. Create an instance of `SagemakerEndpoint` with the necessary parameters (`PROMPT` is a prompt template defined beforehand).

```python
content_handler = ContentHandler()

chain = load_qa_chain(
    llm=SagemakerEndpoint(
        endpoint_name="endpoint-name",
        credentials_profile_name="credentials-profile-name",
        region_name="us-west-2",
        model_kwargs={"temperature": 1e-10},
        content_handler=content_handler,
    ),
    prompt=PROMPT,
)
```
In the `SagemakerEndpoint` call, you need to provide the following parameters:
- `endpoint_name`: The name of the endpoint from the deployed Sagemaker model. Must be unique within an AWS Region.
- `credentials_profile_name`: The name of the profile in the ~/.aws/credentials or ~/.aws/config files, which has either access keys or role information specified. If not specified, the default credential profile or, if on an EC2 instance, credentials from IMDS will be used.
- `region_name`: The AWS region where your SageMaker endpoint is deployed.
- `model_kwargs`: A dictionary of additional arguments to pass to the model.
- `content_handler`: An instance of a class that handles the transformation of input and output for the language model.
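Note that `PROMPT` in the snippet above must be defined beforehand; in LangChain it would typically be a `PromptTemplate` with `context` and `question` input variables. Its shape can be sketched with plain string formatting (the template text here is illustrative, not LangChain's default):

```python
# Hypothetical QA prompt template; LangChain's PromptTemplate wraps the same
# idea with declared input variables ("context" and "question" here).
QA_TEMPLATE = (
    "Use the following pieces of context to answer the question.\n\n"
    "{context}\n\nQuestion: {question}\nHelpful Answer:"
)

prompt = QA_TEMPLATE.format(
    context="LangChain wraps SageMaker endpoints as LLMs.",
    question="What does SagemakerEndpoint do?",
)
print(prompt)
```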
5. Call the chain with your documents and question.

```python
chain({"input_documents": docs, "question": query}, return_only_outputs=True)
```
In this example, `docs` is a list of `Document` objects and `query` is the question you want to ask.
Please note that the `SagemakerEndpoint` class in the LangChain framework is designed to interact with a deployed Sagemaker model. It is used to authenticate with AWS, send requests to the Sagemaker endpoint, and handle the responses.
I hope this helps! If you have any other questions, feel free to ask.
Best, Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
When I create a boto3 client using a credential profile, it works fine:

```python
session = boto3.Session(profile_name='my_llm_profile')

# Create a low-level client representing Amazon SageMaker Runtime
sagemaker_runtime = session.client(
    "sagemaker-runtime", region_name='us-east-1')

# The endpoint name must be unique within
# an AWS Region in your AWS account.
endpoint_name = 'my-llm-ept'

# Gets inference from the model hosted at the specified endpoint:
response = sagemaker_runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    Body=bytes('{"inputs": ["what is the meaning of life?"]}', 'utf-8')
)
```
I get the expected output:
{"generated_text": "\nThe meaning of life is to live a life that is meaningful to you.\nWhat is the meaning of life? What is the meaning of"}
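For reference, the generated text above arrives inside the response's streaming `Body`; decoding it looks roughly like this, with `io.BytesIO` standing in for botocore's real `StreamingBody`:

```python
import io
import json

# Simulated invoke_endpoint response; the real one is
# {"Body": <botocore.response.StreamingBody>, ...}.
response = {
    "Body": io.BytesIO(
        b'{"generated_text": "\\nThe meaning of life is to live a life '
        b'that is meaningful to you."}'
    )
}

result = json.loads(response["Body"].read().decode("utf-8"))
print(result["generated_text"])
```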
But when I use the same profile through the LangChain `SagemakerEndpoint`, it gives me an error:

```python
content_handler = ContentHandler()

llm = SagemakerEndpoint(
    endpoint_name="my-llm-ept",
    credentials_profile_name="my_llm_profile",
    region_name="us-east-1",
    model_kwargs={"temperature": 1e-10},
    content_handler=content_handler,
)
```
This is the error I get:
ValueError: Error raised by inference endpoint: An error occurred (AccessDeniedException) when calling the InvokeEndpoint operation: User: arn:aws:sts::<accountid>:assumed-role/AWSReservedSSO_<my role>/<my user> is not authorized to perform: sagemaker:InvokeEndpoint on resource: arn:aws:sagemaker:us-east-1:<accountid>:endpoint/my-llm-ept because no identity-based policy allows the sagemaker:InvokeEndpoint action
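The exception says the assumed SSO role is missing the `sagemaker:InvokeEndpoint` permission, so the failure is on the IAM side rather than in LangChain itself. A minimal identity-based policy statement that would grant it might look like the following (built as a Python dict here for illustration; the account ID is a placeholder):

```python
import json

# Hypothetical minimal identity-based policy allowing InvokeEndpoint on the
# single endpoint named in the error; 123456789012 is a placeholder account ID.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sagemaker:InvokeEndpoint",
            "Resource": "arn:aws:sagemaker:us-east-1:123456789012:endpoint/my-llm-ept",
        }
    ],
}

print(json.dumps(policy, indent=2))
```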
Hi, @shenshaoyong! I'm Dosu, and I'm helping the LangChain team manage their backlog. I'm marking this issue as stale.
It looks like the issue is about configuring an Amazon SageMaker endpoint using the LangChain framework. You provided detailed steps for installation, configuration, and running the endpoint on an EC2 instance. Dosubot also provided a detailed response with code examples, and another user, darshisen-wbd, shared their experience with encountering an error when using the same profile through LangChain's Sagemaker Endpoint.
Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days. Thank you!
Issue you'd like to raise.
1. Open a terminal and install the Bedrock-specific boto3 SDK and langchain:

```python
curl -sS https://d2eo22ngex1n9g.cloudfront.net/Documentation/SDK/bedrock-python-sdk.zip > sdk.zip
sudo yum install unzip -y
unzip sdk.zip -d sdk
pip install --no-build-isolation --force-reinstall ./sdk/awscli-*-py3-none-any.whl ./sdk/boto3-*-py3-none-any.whl ./sdk/botocore-*-py3-none-any.whl
pip install --quiet langchain==0.0.249
pip install 'jupyter-ai>=1.0,<2.0'  # If you use JupyterLab 3
pip install jupyter-ai              # If you use JupyterLab 4
```
2. Change the default token count to 1024.

Edit `~/anaconda3/lib/python3.11/site-packages/langchain/llms/sagemaker_endpoint.py` and insert the lines below after `body = self.content_handler.transform_input(prompt, _model_kwargs)`:

```python
parameters = {"max_new_tokens": 1024, "top_p": 0.9, "temperature": 0.6, "return_full_text": True}
t = json.loads(body)
t["parameters"] = parameters
body = json.dumps(t)
```

Then insert the line `CustomAttributes='accept_eula=true',` between `Accept=accepts,` and `**_endpoint_kwargs,`.
3. Configure the AWS default profile, making sure the access key/secret key has sufficient permissions (SageMakerFullAccess):

```python
aws configure
```

4. Run `%%ai` in a *.ipynb file on EC2 instead of a SageMaker notebook instance / SageMaker Studio (it can also run in VS Code), after making sure your Amazon SageMaker endpoint is healthy:
```python
%load_ext jupyter_ai
```

```python
%%ai sagemaker-endpoint:jumpstart-dft-meta-textgeneration-llama-2-7b --region-name=us-east-1 --request-schema={"inputs":""} --response-path=[0]['generation']
write something on Humor
```
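The `--request-schema` and `--response-path` flags describe the JSON the endpoint expects and where the generated text lives in its reply; the path `[0]['generation']` indexes into the response like this (the reply shown is hypothetical):

```python
import json

# Hypothetical endpoint reply matching --response-path=[0]['generation'].
reply = json.loads('[{"generation": "Here is a joke about humor..."}]')

# [0]['generation'] walks the reply: first list element, then that key.
text = reply[0]["generation"]
print(text)  # Here is a joke about humor...
```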
Suggestion:
No response