run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Bug]: llamaindex is unable to connect to Sagemaker endpoint using the latest llama-index and llama-index-embeddings-sagemaker-endpoint #12101

Open Adeel-UNSW opened 7 months ago

Adeel-UNSW commented 7 months ago

Bug Description

I am using llama-index version 0.10.20 and llama-index-embeddings-sagemaker-endpoint version 0.1.3. My code was working in an AWS SageMaker notebook until yesterday, when I was using version 0.10.18 of llama-index. Please see the code below.

!pip install llama-index \
llama-index-embeddings-sagemaker-endpoint

from llama_index.embeddings.sagemaker_endpoint import SageMakerEmbedding

ENDPOINT = 'llm-embedder-2024-03-19-23-14-16-413'

embed_model = SageMakerEmbedding(
    endpoint_name=ENDPOINT
)

embeddings = embed_model.get_text_embedding(
    "Working on a RAG model"
)
len(embeddings)

I am getting the error below:

--> 235 dispatcher.event(EmbeddingEndEvent(chunks=[text], embeddings=[text_embedding]))
    236 return text_embedding

File /opt/conda/lib/python3.10/site-packages/pydantic/v1/main.py:341, in BaseModel.__init__(__pydantic_self__, **data)
    339 values, fields_set, validation_error = validate_model(__pydantic_self__.__class__, data)
    340 if validation_error:
--> 341     raise validation_error
    342 try:
    343     object_setattr(__pydantic_self__, '__dict__', values)

ValidationError: 1 validation error for EmbeddingEndEvent
embeddings -> 0 -> 0
  value is not a valid float (type=type_error.float)

Please see the relevant logs further down, which are produced when I execute only the following lines of code:


embed_model = SageMakerEmbedding(
    endpoint_name=ENDPOINT
)

Please let me know if you have any questions.

Version

0.10.20

Steps to Reproduce

!pip install llama-index \
llama-index-embeddings-sagemaker-endpoint

from llama_index.embeddings.sagemaker_endpoint import SageMakerEmbedding

ENDPOINT = 'llm-embedder-2024-03-19-23-14-16-413'

embed_model = SageMakerEmbedding(
    endpoint_name=ENDPOINT
)

embeddings = embed_model.get_text_embedding(
    "Working on a RAG model"
)
len(embeddings)

Relevant Logs/Tracebacks

DEBUG:botocore.hooks:Changing event name from creating-client-class.iot-data to creating-client-class.iot-data-plane
Changing event name from creating-client-class.iot-data to creating-client-class.iot-data-plane
DEBUG:botocore.hooks:Changing event name from before-call.apigateway to before-call.api-gateway
Changing event name from before-call.apigateway to before-call.api-gateway
DEBUG:botocore.hooks:Changing event name from request-created.machinelearning.Predict to request-created.machine-learning.Predict
Changing event name from request-created.machinelearning.Predict to request-created.machine-learning.Predict
DEBUG:botocore.hooks:Changing event name from before-parameter-build.autoscaling.CreateLaunchConfiguration to before-parameter-build.auto-scaling.CreateLaunchConfiguration
Changing event name from before-parameter-build.autoscaling.CreateLaunchConfiguration to before-parameter-build.auto-scaling.CreateLaunchConfiguration
DEBUG:botocore.hooks:Changing event name from before-parameter-build.route53 to before-parameter-build.route-53
Changing event name from before-parameter-build.route53 to before-parameter-build.route-53
DEBUG:botocore.hooks:Changing event name from request-created.cloudsearchdomain.Search to request-created.cloudsearch-domain.Search
Changing event name from request-created.cloudsearchdomain.Search to request-created.cloudsearch-domain.Search
DEBUG:botocore.hooks:Changing event name from docs.*.autoscaling.CreateLaunchConfiguration.complete-section to docs.*.auto-scaling.CreateLaunchConfiguration.complete-section
Changing event name from docs.*.autoscaling.CreateLaunchConfiguration.complete-section to docs.*.auto-scaling.CreateLaunchConfiguration.complete-section
DEBUG:botocore.hooks:Changing event name from before-parameter-build.logs.CreateExportTask to before-parameter-build.cloudwatch-logs.CreateExportTask
Changing event name from before-parameter-build.logs.CreateExportTask to before-parameter-build.cloudwatch-logs.CreateExportTask
DEBUG:botocore.hooks:Changing event name from docs.*.logs.CreateExportTask.complete-section to docs.*.cloudwatch-logs.CreateExportTask.complete-section
Changing event name from docs.*.logs.CreateExportTask.complete-section to docs.*.cloudwatch-logs.CreateExportTask.complete-section
DEBUG:botocore.hooks:Changing event name from before-parameter-build.cloudsearchdomain.Search to before-parameter-build.cloudsearch-domain.Search
Changing event name from before-parameter-build.cloudsearchdomain.Search to before-parameter-build.cloudsearch-domain.Search
DEBUG:botocore.hooks:Changing event name from docs.*.cloudsearchdomain.Search.complete-section to docs.*.cloudsearch-domain.Search.complete-section
Changing event name from docs.*.cloudsearchdomain.Search.complete-section to docs.*.cloudsearch-domain.Search.complete-section
DEBUG:botocore.utils:IMDS ENDPOINT: http://169.254.169.254/
IMDS ENDPOINT: http://169.254.169.254/
DEBUG:botocore.credentials:Looking for credentials via: env
Looking for credentials via: env
DEBUG:botocore.credentials:Looking for credentials via: assume-role
Looking for credentials via: assume-role
DEBUG:botocore.credentials:Looking for credentials via: assume-role-with-web-identity
Looking for credentials via: assume-role-with-web-identity
DEBUG:botocore.credentials:Looking for credentials via: sso
Looking for credentials via: sso
DEBUG:botocore.credentials:Looking for credentials via: shared-credentials-file
Looking for credentials via: shared-credentials-file
DEBUG:botocore.credentials:Looking for credentials via: custom-process
Looking for credentials via: custom-process
DEBUG:botocore.credentials:Looking for credentials via: config-file
Looking for credentials via: config-file
DEBUG:botocore.credentials:Looking for credentials via: ec2-credentials-file
Looking for credentials via: ec2-credentials-file
DEBUG:botocore.credentials:Looking for credentials via: boto-config
Looking for credentials via: boto-config
DEBUG:botocore.credentials:Looking for credentials via: container-role
Looking for credentials via: container-role
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): 169.254.170.2:80
Starting new HTTP connection (1): 169.254.170.2:80
DEBUG:urllib3.connectionpool:http://169.254.170.2:80 "GET /_sagemaker-instance-credentials/97ebc606c2e1d9d24f36d0a9ad9db7fbf1c77d8d437ca86341fbcedf5eca9852 HTTP/1.1" 200 1136
http://169.254.170.2:80 "GET /_sagemaker-instance-credentials/97ebc606c2e1d9d24f36d0a9ad9db7fbf1c77d8d437ca86341fbcedf5eca9852 HTTP/1.1" 200 1136
DEBUG:botocore.loaders:Loading JSON file: /opt/conda/lib/python3.10/site-packages/botocore/data/endpoints.json
Loading JSON file: /opt/conda/lib/python3.10/site-packages/botocore/data/endpoints.json
DEBUG:botocore.loaders:Loading JSON file: /opt/conda/lib/python3.10/site-packages/botocore/data/sdk-default-configuration.json
Loading JSON file: /opt/conda/lib/python3.10/site-packages/botocore/data/sdk-default-configuration.json
DEBUG:botocore.hooks:Event choose-service-name: calling handler <function handle_service_name_alias at 0x7f7a849d4ee0>
Event choose-service-name: calling handler <function handle_service_name_alias at 0x7f7a849d4ee0>
DEBUG:botocore.loaders:Loading JSON file: /opt/conda/lib/python3.10/site-packages/botocore/data/sagemaker-runtime/2017-05-13/service-2.json
Loading JSON file: /opt/conda/lib/python3.10/site-packages/botocore/data/sagemaker-runtime/2017-05-13/service-2.json
DEBUG:botocore.loaders:Loading JSON file: /opt/conda/lib/python3.10/site-packages/botocore/data/sagemaker-runtime/2017-05-13/endpoint-rule-set-1.json
Loading JSON file: /opt/conda/lib/python3.10/site-packages/botocore/data/sagemaker-runtime/2017-05-13/endpoint-rule-set-1.json
DEBUG:botocore.loaders:Loading JSON file: /opt/conda/lib/python3.10/site-packages/botocore/data/partitions.json
Loading JSON file: /opt/conda/lib/python3.10/site-packages/botocore/data/partitions.json
DEBUG:botocore.hooks:Event creating-client-class.sagemaker-runtime: calling handler <function add_generate_presigned_url at 0x7f7a84b4ee60>
Event creating-client-class.sagemaker-runtime: calling handler <function add_generate_presigned_url at 0x7f7a84b4ee60>
DEBUG:botocore.configprovider:Looking for endpoint for sagemaker-runtime via: environment_service
Looking for endpoint for sagemaker-runtime via: environment_service
DEBUG:botocore.configprovider:Looking for endpoint for sagemaker-runtime via: environment_global
Looking for endpoint for sagemaker-runtime via: environment_global
DEBUG:botocore.configprovider:Looking for endpoint for sagemaker-runtime via: config_service
Looking for endpoint for sagemaker-runtime via: config_service
DEBUG:botocore.configprovider:Looking for endpoint for sagemaker-runtime via: config_global
Looking for endpoint for sagemaker-runtime via: config_global
DEBUG:botocore.configprovider:No configured endpoint found.
No configured endpoint found.
DEBUG:botocore.endpoint:Setting runtime.sagemaker timeout as (60.0, 60)
Setting runtime.sagemaker timeout as (60.0, 60)

dosubot[bot] commented 7 months ago

Hey @Adeel-UNSW, nice to see you back around these parts! Hope you've been well since our last chat.

Thank you for providing detailed information about the issue you're facing. Based on the error message you're encountering, it seems like there's a problem with the data type of the embeddings returned by the SageMaker endpoint. The error message indicates that the value is not a valid float, which suggests that the EmbeddingEndEvent is expecting a float value but is receiving a different data type.

The EmbeddingEndEvent is a Pydantic model, which is used for data validation. Pydantic models enforce type checking, so if the data type of the input does not match the expected data type, a ValidationError is raised. In this case, the EmbeddingEndEvent is expecting a list of lists of floats for the embeddings field, but it seems to be receiving a different data type.
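
The path `embeddings -> 0 -> 0` in the error points at the first element of the first embedding. As a rough illustration, the stdlib-only sketch below imitates that shape check. The helper name `first_non_float_path` is hypothetical and is not part of llama-index or pydantic, and the nested `[[...]]` response shape is an assumption about what the endpoint might return, not something confirmed in this thread. Note also that pydantic v1 coerces values such as numeric strings, which this stricter sketch does not.

```python
from typing import Any, List, Optional, Tuple


def first_non_float_path(embeddings: List[Any]) -> Optional[Tuple[int, int]]:
    """Return (row, col) of the first value that is not a plain number, else None.

    Hypothetical helper: it imitates the shape check implied by a
    List[List[float]] field, not pydantic's actual implementation.
    """
    for row_idx, row in enumerate(embeddings):
        for col_idx, value in enumerate(row):
            # bool is a subclass of int, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                return (row_idx, col_idx)
    return None


flat_embedding = [0.1, 0.2, 0.3]      # one flat vector per text
nested_embedding = [[0.1, 0.2, 0.3]]  # assumed shape of a batched endpoint response

# The event wraps the single embedding in a list: embeddings=[text_embedding]
print(first_non_float_path([flat_embedding]))    # None
print(first_non_float_path([nested_embedding]))  # (0, 0)
```

If the endpoint does return a batch for a single input, the value at position `0 -> 0` is itself a list, which cannot be converted to a float and would produce an error of the kind shown in the report.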

The SageMakerEmbedding class uses the deserialize_output method of the content_handler object to process the response from the SageMaker endpoint. The deserialize_output method is responsible for converting the response into the expected format. If the deserialize_output method is not correctly processing the response, it could be causing the ValidationError.

Without the specific implementation details of the deserialize_output method, it's difficult to determine the exact cause of the issue. However, you could try printing the response from the SageMaker endpoint before it's processed by the deserialize_output method to see if it's in the expected format. You could add the following lines of code before the deserialize_output method is called:

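import json

# Sketch only: `self`, `payload`, and the client attribute name are placeholders
# for the integration's internals and may not match the actual class.
response = self.sagemaker_runtime_client.invoke_endpoint(
    EndpointName=self.endpoint_name,
    ContentType='application/json',
    Body=json.dumps(payload),
    Accept='application/json'
)
# The response body is a stream and can be read only once, so keep the bytes.
raw_body = response['Body'].read().decode()
print(raw_body)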

This will print the raw response from the SageMaker endpoint, which might help identify if the issue is with the response format.
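
When the raw response is long, a summary of its nesting can be easier to read than the full printout. The following is a minimal sketch using only the standard library. `describe_shape` is a hypothetical helper, the `io.BytesIO` object stands in for `response['Body']`, and the `"embedding"` key and nested payload are assumptions made for illustration, since the actual response format of this endpoint is not shown in the issue.

```python
import io
import json
from typing import Any


def describe_shape(value: Any) -> str:
    """Summarise the nesting of a decoded JSON value, e.g. 'list[3] of float'."""
    if isinstance(value, list):
        if not value:
            return "empty list"
        # Only the first element is inspected; the rest are assumed alike.
        return f"list[{len(value)}] of {describe_shape(value[0])}"
    if isinstance(value, dict):
        return "dict with keys " + ", ".join(sorted(value))
    return type(value).__name__


# Stand-in for response['Body']; like a streaming body, it yields its data once.
fake_body = io.BytesIO(json.dumps({"embedding": [[0.1, 0.2, 0.3]]}).encode())

raw = fake_body.read()  # read once and keep the bytes for later use
payload = json.loads(raw.decode())

print(describe_shape(payload))               # dict with keys embedding
print(describe_shape(payload["embedding"]))  # list[1] of list[3] of float
```

A result such as `list[1] of list[3] of float` would indicate a batch of vectors where a single flat vector is expected.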

If the response format is correct, the issue might be with the deserialize_output method. You could try printing the output of the deserialize_output method to see if it's correctly processing the response:

output = self.content_handler.deserialize_output(response=response)
print(output)

This will print the output of the deserialize_output method, which might help identify if the issue is with the deserialization logic.

If you're still encountering the issue after trying these solutions, it might be helpful to provide more information about your environment, such as the exact version of the botocore and boto3 packages you're using, and any other relevant details about your setup. This could help identify if there's a specific compatibility issue or other underlying problem causing the connection error.

I hope this helps! If you have any other questions or need further clarification, please don't hesitate to ask.

logan-markewich commented 7 months ago

Seems like sagemaker isn't returning proper types, it should be a List[float] -- would have to update the integration class
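
One possible direction for such an update is to normalize whatever the endpoint returns into a flat List[float] before it reaches the event. The sketch below is a guess under stated assumptions, not the library's fix: `to_flat_floats` is a hypothetical helper, and it assumes the problem is a single-item batch such as `[[0.1, 0.2]]` or values that are numeric but not floats.

```python
from typing import Any, List


def to_flat_floats(embedding: Any) -> List[float]:
    """Coerce one endpoint result into a flat List[float].

    Hypothetical helper, not llama-index API. It unwraps single-item
    batches such as [[0.1, 0.2]] and converts every value with float().
    """
    # Unwrap nested single-element lists: [[...]] -> [...]
    while (
        isinstance(embedding, list)
        and len(embedding) == 1
        and isinstance(embedding[0], list)
    ):
        embedding = embedding[0]
    if not isinstance(embedding, list):
        raise TypeError(f"expected a list, got {type(embedding).__name__}")
    if any(isinstance(v, list) for v in embedding):
        raise ValueError("embedding is still nested; expected one vector")
    return [float(v) for v in embedding]


print(to_flat_floats([[0.1, 0.2, 0.3]]))  # [0.1, 0.2, 0.3]
print(to_flat_floats([1, "2.5", 3.0]))    # [1.0, 2.5, 3.0]
```

Raising on a multi-vector batch instead of silently picking the first vector keeps a shape mismatch visible, which seems safer while the real response format is still unknown.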

Adeel-UNSW commented 7 months ago

@logan-markewich were you able to replicate this issue? Who needs to update the integration class?

logan-markewich commented 7 months ago

I don't have access to test sagemaker. I could make a guess at a fix.

It's an open source library, so anyone can make a change :)