aws-samples / amazon-kendra-langchain-extensions

Samples to build Generative AI applications with LangChain and Amazon Kendra
https://aws.amazon.com/blogs/machine-learning/quickly-build-high-accuracy-generative-ai-applications-on-enterprise-data-using-amazon-kendra-langchain-and-large-language-models/
MIT No Attribution

Inference Component Name header is required #64

Open ecdedios opened 6 months ago

ecdedios commented 6 months ago

I'm getting the following error when I run python kendra_chat_llama_2.py:

botocore.errorfactory.ValidationError: An error occurred (ValidationError) when calling the InvokeEndpoint operation: Inference Component Name header is required for endpoints to which you plan to deploy inference components. Please include Inference Component Name header or consider using SageMaker models.

Name: boto3 Version: 1.34.21


llm = SagemakerEndpoint(
    endpoint_name=endpoint_name,
    region_name=region,
    model_kwargs={"max_new_tokens": 1500, "top_p": 0.8, "temperature": 0.6},
    endpoint_kwargs={"CustomAttributes": "accept_eula=true"},
    content_handler=content_handler,
)
MithilShah commented 6 months ago

JumpStart inference endpoints now need an InferenceComponentName:

response = client.invoke_endpoint(
    EndpointName=endpoint_name,
    InferenceComponentName='jumpstart-dft-meta-textgeneration-l-xx',
    ContentType="application/json",
    Body=json.dumps(payload),
)

The change needs to happen in the langchain library first. Following that up.

https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/llms/sagemaker_endpoint.py#L126

3coins commented 6 months ago

@MithilShah Did you try using the InferenceComponentName in the endpoint_kwargs?

ecdedios commented 6 months ago

@3coins @MithilShah

Yes, that did it. Thanks! I modified kendra_chat_llama_2.py to this:


llm = SagemakerEndpoint(
    endpoint_name=endpoint_name,
    region_name=region,
    model_kwargs={"max_new_tokens": 1500, "top_p": 0.8, "temperature": 0.6},
    endpoint_kwargs={
        "CustomAttributes": "accept_eula=true",
        "InferenceComponentName": "jumpstart-dft-meta-textgeneration-l-###",
    },
    content_handler=content_handler,
)
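
This works because SagemakerEndpoint forwards endpoint_kwargs straight into the underlying invoke_endpoint call (see the sagemaker_endpoint.py link above). Paraphrased from the langchain source, not verbatim:

# Inside SagemakerEndpoint._call (paraphrased sketch, not the exact source):
response = self.client.invoke_endpoint(
    EndpointName=self.endpoint_name,
    Body=body,
    ContentType=content_type,
    Accept=accepts,
    **_endpoint_kwargs,  # e.g. CustomAttributes, InferenceComponentName
)

So any extra header that invoke_endpoint accepts can be supplied through endpoint_kwargs.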
utility-aagrawal commented 5 months ago

@ecdedios I have the same issue. I am trying to understand: what's the difference between endpoint_name and InferenceComponentName?

If my SageMaker endpoint is meta-textgeneration-llama-2-7b-f-20240201-XXXXXX, what is the endpoint_name and what is the InferenceComponentName? Appreciate your help with this!

ecdedios commented 5 months ago

@utility-aagrawal I forget exactly which one I used for InferenceComponentName, but it's either the endpoint name or the model name. Here are some screenshots. Basically, you get the model name by clicking on the endpoint name.

[Two screenshots of the SageMaker console: the endpoint details page, and the model name listed under the endpoint]
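
If you'd rather not dig through the console, here is a minimal sketch of how you could look the name up with boto3's list_inference_components call; the region and endpoint name below are placeholders, substitute your own:

import boto3

# Sketch: list the inference components attached to an endpoint so the
# exact InferenceComponentName can be copied. Region and endpoint name
# are placeholders.
sagemaker = boto3.client("sagemaker", region_name="us-east-1")
response = sagemaker.list_inference_components(
    EndpointNameEquals="meta-textgeneration-llama-2-7b-f-20240201-XXXXXX"
)
for component in response["InferenceComponents"]:
    print(component["InferenceComponentName"], component["InferenceComponentStatus"])

An empty list here would also tell you the endpoint was deployed without inference components, in which case the header must be omitted.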
MithilShah commented 5 months ago

I am working on a fix, but @ecdedios is right: the one starting with "jumpstart.." is the endpoint name, and the one in the "model" section is the inference component name. Testing the fix; will release soon.

utility-aagrawal commented 5 months ago

Thanks @ecdedios @MithilShah! I tried with InferenceComponentName in endpoint_kwargs and got this error:

ValueError: Error raised by inference endpoint: An error occurred (ValidationError) when calling the InvokeEndpoint operation: Inference Component Name header is not allowed for endpoints to which you dont plan to deploy inference components. Please remove the Inference Component Name header and try again.

It just worked without InferenceComponentName. It's weird because the same code wasn't working yesterday and was asking me to include InferenceComponentName. I am not sure what's changed since yesterday.

MithilShah commented 5 months ago

@utility-aagrawal Can you please try again? I have added a new variable. If you deploy the endpoint via the console, it deploys the model to an inference component and you need to specify an INFERENCE_COMPONENT_NAME environment variable. However, if you deploy via the SDK, you have the option of deploying directly to the endpoint without using an inference component. If you do that, just ignore the INFERENCE_COMPONENT_NAME environment variable.
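
For reference, a minimal sketch of how that variable could be wired into the earlier snippet. This is an assumption about the shape of the fix, not the released code; endpoint_name, region, and content_handler are the placeholders from the snippets above:

import os

# Import path varies by langchain version; langchain_community is assumed here.
from langchain_community.llms import SagemakerEndpoint

# Assumed pattern: only forward InferenceComponentName when the
# INFERENCE_COMPONENT_NAME environment variable is set (console-deployed
# JumpStart endpoints); omit it for SDK deployments without components.
endpoint_kwargs = {"CustomAttributes": "accept_eula=true"}
inference_component_name = os.environ.get("INFERENCE_COMPONENT_NAME")
if inference_component_name:
    endpoint_kwargs["InferenceComponentName"] = inference_component_name

llm = SagemakerEndpoint(
    endpoint_name=endpoint_name,
    region_name=region,
    model_kwargs={"max_new_tokens": 1500, "top_p": 0.8, "temperature": 0.6},
    endpoint_kwargs=endpoint_kwargs,
    content_handler=content_handler,
)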

utility-aagrawal commented 5 months ago

Thanks @MithilShah! I'll try and let you know.