langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

Issue: How to configure Amazon SageMaker endpoint #10184

Closed shenshaoyong closed 8 months ago

shenshaoyong commented 1 year ago

Issue you'd like to raise.

1. Open a terminal and install the Bedrock-specific build of the boto3 SDK, plus LangChain:

```shell
curl -sS https://d2eo22ngex1n9g.cloudfront.net/Documentation/SDK/bedrock-python-sdk.zip > sdk.zip
sudo yum install unzip -y
unzip sdk.zip -d sdk
pip install --no-build-isolation --force-reinstall ./sdk/awscli-*-py3-none-any.whl ./sdk/boto3-*-py3-none-any.whl ./sdk/botocore-*-py3-none-any.whl
pip install --quiet langchain==0.0.249
```

```shell
pip install 'jupyter-ai>=1.0,<2.0'  # If you use JupyterLab 3
pip install jupyter-ai              # If you use JupyterLab 4
```

2. Change the default max token count to 1024.

```shell
vi ~/anaconda3/lib/python3.11/site-packages/langchain/llms/sagemaker_endpoint.py
```

Insert the lines below after `body = self.content_handler.transform_input(prompt, _model_kwargs)`:

```python
parameters = {"max_new_tokens": 1024, "top_p": 0.9, "temperature": 0.6, "return_full_text": True}
t = json.loads(body)
t["parameters"] = parameters
body = json.dumps(t)
```

Then insert the line `CustomAttributes='accept_eula=true',` between `Accept=accepts,` and `**_endpoint_kwargs,`.
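The patch above simply merges a `parameters` object into the serialized JSON request body before it is sent to the endpoint. A standalone sketch of that transformation (plain Python, runnable without SageMaker; the prompt string is just an example):

```python
import json

def inject_parameters(body: str) -> str:
    """Merge generation parameters into a serialized SageMaker request body,
    mirroring the lines inserted into sagemaker_endpoint.py above."""
    parameters = {
        "max_new_tokens": 1024,
        "top_p": 0.9,
        "temperature": 0.6,
        "return_full_text": True,
    }
    t = json.loads(body)
    t["parameters"] = parameters
    return json.dumps(t)

body = json.dumps({"inputs": "write something on Humor"})
print(inject_parameters(body))
```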

3. Configure the default AWS profile, making sure the access key / secret key (AK/SK) has sufficient permissions (SageMakerFullAccess):

aws configure

4. Run `%%ai` in a `*.ipynb` file on an EC2 instance instead of a SageMaker notebook instance / SageMaker Studio (it can also run in VS Code), after making sure your Amazon SageMaker endpoint is healthy:

```python
%load_ext jupyter_ai
```

```
%%ai sagemaker-endpoint:jumpstart-dft-meta-textgeneration-llama-2-7b --region-name=us-east-1 --request-schema={"inputs":""} --response-path=[0]['generation']
write something on Humor
```
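The `--response-path=[0]['generation']` flag tells jupyter-ai where to find the generated text inside the endpoint's JSON reply. A plain-Python sketch of that lookup, using a made-up sample shaped like a JumpStart Llama-2 response (the payload is illustrative, not a captured response):

```python
# --response-path=[0]['generation'] means: take element 0 of the JSON list,
# then read its "generation" field.
sample_response = [{"generation": "Humor is the spice of life..."}]

text = sample_response[0]["generation"]
print(text)
```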

Suggestion:

No response

dosubot[bot] commented 1 year ago

🤖

Hello,

Thank you for reaching out. It seems like you're trying to configure an Amazon SageMaker endpoint using the LangChain framework. Here's how you can do it:

  1. Install the necessary packages: langchain and boto3 using pip.

    !pip3 install langchain boto3
  2. Import the necessary classes and functions.

    from typing import Dict
    from langchain import SagemakerEndpoint
    from langchain.llms.sagemaker_endpoint import LLMContentHandler
    from langchain.chains.question_answering import load_qa_chain
    import json
  3. Define the ContentHandler class which will handle the transformation of input and output for the language model.

    class ContentHandler(LLMContentHandler):
        content_type = "application/json"
        accepts = "application/json"

        def transform_input(self, prompt: str, model_kwargs: Dict) -> bytes:
            # Use a string key for the prompt; `{prompt: prompt}` would use the
            # prompt value itself as the dictionary key, which is a bug.
            input_str = json.dumps({"inputs": prompt, **model_kwargs})
            return input_str.encode("utf-8")

        def transform_output(self, output: bytes) -> str:
            response_json = json.loads(output.read().decode("utf-8"))
            return response_json[0]["generated_text"]
  4. Create an instance of SagemakerEndpoint with the necessary parameters (`PROMPT` is a prompt template you define elsewhere).

    content_handler = ContentHandler()

    chain = load_qa_chain(
        llm=SagemakerEndpoint(
            endpoint_name="endpoint-name",
            credentials_profile_name="credentials-profile-name",
            region_name="us-west-2",
            model_kwargs={"temperature": 1e-10},
            content_handler=content_handler,
        ),
        prompt=PROMPT,
    )

In the `SagemakerEndpoint` call, you need to provide the following parameters:
- `endpoint_name`: The name of the endpoint from the deployed Sagemaker model. Must be unique within an AWS Region.
- `credentials_profile_name`: The name of the profile in the ~/.aws/credentials or ~/.aws/config files, which has either access keys or role information specified. If not specified, the default credential profile or, if on an EC2 instance, credentials from IMDS will be used.
- `region_name`: The AWS region where your SageMaker endpoint is deployed.
- `model_kwargs`: A dictionary of additional arguments to pass to the model.
- `content_handler`: An instance of a class that handles the transformation of input and output for the language model.

5. Use the `chain` function to process your documents and question.
```python
chain({"input_documents": docs, "question": query}, return_only_outputs=True)
```
In this example, `docs` is a list of `Document` objects and `query` is the question you want to ask.

Please note that the SagemakerEndpoint class in the LangChain framework is designed to interact with a deployed Sagemaker model. It is used to authenticate with AWS, send requests to the Sagemaker endpoint, and handle the responses.

I hope this helps! If you have any other questions, feel free to ask.

Best, Dosu

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

darshisen-wbd commented 1 year ago

When I create a boto3 client using credential profile, it works fine:

```python
session = boto3.Session(profile_name='my_llm_profile')

# Create a low-level client representing Amazon SageMaker Runtime
sagemaker_runtime = session.client(
    "sagemaker-runtime", region_name='us-east-1')

# The endpoint name must be unique within
# an AWS Region in your AWS account.
endpoint_name = 'my-llm-ept'

# Gets inference from the model hosted at the specified endpoint:
response = sagemaker_runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    Body=bytes('{"inputs": ["what is the meaning of life?"]}', 'utf-8')
)
```

I get the expected output: {"generated_text": "\nThe meaning of life is to live a life that is meaningful to you.\nWhat is the meaning of life? What is the meaning of"}
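For completeness, in the live call those bytes come from `response["Body"].read()`; decoding a payload like the one quoted above is plain JSON handling (the literal below is an abridged copy of that output, not a new capture):

```python
import json

# Stand-in for response["Body"].read() using the output quoted above.
raw = b'{"generated_text": "\\nThe meaning of life is to live a life that is meaningful to you."}'

result = json.loads(raw.decode("utf-8"))
print(result["generated_text"])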

But when I use the same profile through langchain Sagemaker Endpoint, it gives me an error:

```python
content_handler = ContentHandler()

llm = SagemakerEndpoint(
    endpoint_name="my-llm-ept",
    credentials_profile_name="my_llm_profile",
    region_name="us-east-1",
    model_kwargs={"temperature": 1e-10},
    content_handler=content_handler,
)
```

This is the error I get:

```
ValueError: Error raised by inference endpoint: An error occurred (AccessDeniedException) when calling the InvokeEndpoint operation: User: arn:aws:sts::<accountid>:assumed-role/AWSReservedSSO_<my role>/<my user> is not authorized to perform: sagemaker:InvokeEndpoint on resource: arn:aws:sagemaker:us-east-1:<accountid>:endpoint/my-llm-ept because no identity-based policy allows the sagemaker:InvokeEndpoint action
```
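The error states that no identity-based policy allows `sagemaker:InvokeEndpoint` for the assumed SSO role. A minimal policy statement granting it might look like the following (account ID and endpoint name are placeholders taken from the error message; attach it to the role or permission set actually in use):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sagemaker:InvokeEndpoint",
      "Resource": "arn:aws:sagemaker:us-east-1:<accountid>:endpoint/my-llm-ept"
    }
  ]
}
```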

dosubot[bot] commented 9 months ago

Hi, @shenshaoyong! I'm Dosu, and I'm helping the LangChain team manage their backlog. I'm marking this issue as stale.

It looks like the issue is about configuring an Amazon SageMaker endpoint using the LangChain framework. You provided detailed steps for installation, configuration, and running the endpoint on an EC2 instance. Dosubot also provided a detailed response with code examples, and another user, darshisen-wbd, shared their experience with encountering an error when using the same profile through LangChain's Sagemaker Endpoint.

Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days. Thank you!