aws-samples / large-model-workshop-financial-services

This repository contains the code assets for the "Generative AI Large Language Model Workshop for Financial Services" workshop.
https://catalog.us-east-1.prod.workshops.aws/workshops/c8e0f5d8-0658-4345-8b1d-cc637cbdd671
MIT No Attribution

Problem with lab3/2_financial_news_summarization/t5_python_backend.ipynb #1

Closed changux closed 1 year ago

changux commented 1 year ago

Hi,

All the steps on "Notebook 1 - Financial News Sentiment Analysis" ran well.

Notebook 2 shows some problems:

  1. Command:
!aws s3 cp s3://ee-assets-prod-us-east-1/modules/05fa7598d4d44836a42fde79b26568b2/v3/mme_env.tar.gz triton-serve-py/t5-summarization/

This returns the message "fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden". It looks like mme_env.tar.gz is not there.
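For anyone hitting the same message, a small probe can help tell a permissions problem apart from a missing object. This is only a sketch: the helper names `split_s3_uri` and `probe_s3_object` are made up for illustration, and it assumes boto3 is installed and picks up the same credentials as the notebook.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError(f"not an S3 object URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def probe_s3_object(uri):
    """Return 'accessible', 'forbidden', 'missing', or the raw error code."""
    # Imported lazily so the URI helper works even without boto3 installed.
    import boto3
    from botocore.exceptions import ClientError

    bucket, key = split_s3_uri(uri)
    try:
        boto3.client("s3").head_object(Bucket=bucket, Key=key)
        return "accessible"
    except ClientError as err:
        # For HeadObject the error code is the HTTP status as a string.
        code = err.response["Error"]["Code"]
        return {"403": "forbidden", "404": "missing"}.get(code, code)
```

Note that a 403 from HeadObject does not prove the object is absent: S3 also answers 403 instead of 404 when the caller lacks the s3:ListBucket permission on the bucket, so the role's permissions are worth checking first.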

Following the note that says the MME model is already included in the lab3/2_financial_news_summarization/triton-serve-py/t5-summarization/ folder, I continued the process until this call:

response = runtime_sm_client.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType='application/octet-stream',
    Body=request_body,
    TargetModel=python_model_file_name
)

At that point I get the following error:

---------------------------------------------------------------------------
ModelError                                Traceback (most recent call last)
/tmp/ipykernel_26725/2992517296.py in <module>
      3     ContentType='application/octet-stream',
      4     Body=request_body,
----> 5     TargetModel=python_model_file_name
      6 )

~/anaconda3/envs/amazonei_pytorch_latest_p37/lib/python3.7/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    528                 )
    529             # The "self" in this scope is referring to the BaseClient.
--> 530             return self._make_api_call(operation_name, kwargs)
    531 
    532         _api_call.__name__ = str(py_operation_name)

~/anaconda3/envs/amazonei_pytorch_latest_p37/lib/python3.7/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    962             error_code = parsed_response.get("Error", {}).get("Code")
    963             error_class = self.exceptions.from_code(error_code)
--> 964             raise error_class(parsed_response, operation_name)
    965         else:
    966             return parsed_response

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{"error":"load failed for model 'e5f78d779f1daxxxxxxxxxx': version 1 is at UNAVAILABLE state: Internal: Failed to get the canonical path for /opt/ml/models/e5f78ded28a6779f1da087c1b08c2316/model/t5-summarization/mme_env.tar.gz.;\n"}". See https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/aws/sagemaker/Endpoints/financial-usecase-mme-ep-2023-06-02-03-34-43 in account XXXXXXXXXX for more information.

I double-checked, and the file /opt/ml/models/e5f78ded28a6779f1da087c1b08c2316/model/t5-summarization/mme_env.tar.gz doesn't exist. Where should it be? How can I package the folder contents to reproduce the file?
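One way to narrow this down is to check whether the model archive that gets uploaded for the endpoint actually contains mme_env.tar.gz before deploying. The sketch below assumes the model is packaged as a tar archive; the function name `archive_contains` and the archive file name in the usage line are hypothetical. It only inspects an archive and does not rebuild the environment file itself.

```python
import tarfile


def archive_contains(archive_path, member_suffix):
    """Return True if any member path in the tar archive ends with member_suffix."""
    # "r:*" lets tarfile detect the compression (gzip, bz2, none) on its own.
    with tarfile.open(archive_path, "r:*") as tar:
        return any(name.endswith(member_suffix) for name in tar.getnames())
```

Usage, with a placeholder archive name: `archive_contains("t5-summarization.tar.gz", "t5-summarization/mme_env.tar.gz")`. If this returns False, the environment file never made it into the package, which would be consistent with the "Failed to get the canonical path" message above.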

Best,

changux commented 1 year ago

Fixed.

The roles created in lab1 must include the S3FullAccess permission.
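The comment does not say how the permission was added. A minimal sketch with boto3 follows, assuming "S3FullAccess" refers to the AWS managed policy AmazonS3FullAccess and that you are allowed to modify IAM roles. The helper names are made up for illustration, and the role ARN you pass in is whatever lab1 created in your account.

```python
# Assumption: the fix refers to this AWS managed policy.
MANAGED_S3_POLICY_ARN = "arn:aws:iam::aws:policy/AmazonS3FullAccess"


def role_name_from_arn(role_arn):
    """Extract the role name from an IAM role ARN, ignoring any path prefix."""
    resource = role_arn.split(":", 5)[-1]
    if not resource.startswith("role/"):
        raise ValueError(f"not an IAM role ARN: {role_arn!r}")
    return resource.rsplit("/", 1)[-1]


def attach_s3_full_access(role_arn):
    """Attach the managed S3 full-access policy to the given role."""
    # Imported lazily so the ARN helper works even without boto3 installed.
    import boto3

    boto3.client("iam").attach_role_policy(
        RoleName=role_name_from_arn(role_arn),
        PolicyArn=MANAGED_S3_POLICY_ARN,
    )
```

Full S3 access is broad; outside a workshop account, a policy scoped to the specific buckets is usually the safer choice.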