aws-samples / aws-panorama-samples

This repository has samples that demonstrate various aspects of AWS Panorama device and the Panorama SDK
https://docs.aws.amazon.com/panorama/
MIT No Attribution

PT37_opengpu deployment error in yolov5s_pt37_app_node #80

Closed vinodjsr closed 1 year ago

vinodjsr commented 1 year ago

Hi,

I have been trying to deploy the PT37_opengpu sample (through PT37_opengpu/pytorch_example.ipynb), but I get this error:

An error occurred (ValidationException) when calling the CreateApplicationInstance operation: {"reason":"No registered package version found for accountId:{account_id}, packageName:test_rtsp_camera_lab3, packageVersion:2.0","message":"The input fails to satisfy the constraints specified by an AWS service.","fields":[]}
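For anyone hitting the same message, here is a minimal sketch that pulls the package name and version out of the JSON body of this ValidationException, so they can be compared against what is actually registered. The helper name `parse_missing_package` is hypothetical, and the regular expression assumes the `reason` string keeps the exact layout shown above.

```python
import json
import re


def parse_missing_package(error_body: str):
    """Return (package_name, package_version) from the error body, or None.

    Assumes the "reason" field has the same layout as the message quoted above.
    """
    reason = json.loads(error_body)["reason"]
    match = re.search(r"packageName:([^,]+), packageVersion:(\S+)", reason)
    if match is None:
        return None
    return match.group(1), match.group(2)


# The JSON body copied from the error above.
body = (
    '{"reason":"No registered package version found for '
    'accountId:{account_id}, packageName:test_rtsp_camera_lab3, '
    'packageVersion:2.0","message":"The input fails to satisfy the '
    'constraints specified by an AWS service.","fields":[]}'
)

print(parse_missing_package(body))
```

The extracted pair is the package and version that the service could not find registered for the account.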

The CloudWatch logs also show these warnings:

2023-02-14 11:53:09.240 WARN loadApplicationGraph(338) Unable to open node state override file: /data/cloud/graphs/applicationInstance-atbspxydjsr6mcfmulyowfleiq/nodeDesiredStateOverrides.json

2023-02-14 11:53:09.240 WARN loadApplicationGraph(403) Empty node state override file /data/cloud/graphs/applicationInstance-atbspxydjsr6mcfmulyowfleiq/nodeDesiredStateOverrides.json

2023-02-14 11:53:09.240 WARN loadApplicationGraph(410) Recreating default node state override file /data/cloud/graphs/applicationInstance-atbspxydjsr6mcfmulyowfleiq/nodeDesiredStateOverrides.json

Below is the tree structure:

[image]

The final error in [yolov5s_pt37_app_node](https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#logsV2:log-groups/log-group/$252Faws$252Fpanorama$252Fdevices$252Fdevice-d3bqsk4okufrmw2nuttbjsibxq$252Fapplications$252FapplicationInstance-atbspxydjsr6mcfmulyowfleiq/log-events/yolov5s_pt37_app_node) reads:

botocore.exceptions.CredentialRetrievalError: Error when retrieving credentials from container-role: Error retrieving metadata: Received non 200 response (500) from ECS metadata: "Unable to fetch credentials"

Requesting help in resolving this.

Thanks.

vinodjsr commented 1 year ago

Additional error info from yolov5s_pt37_app_node:

[2023-02-14 12:48:23,453| ERROR| MainProcess]| cw_post_metric.py:146| Error when retrieving credentials from container-role: Error retrieving metadata: Received non 200 response (500) from ECS metadata: "Unable to fetch credentials"

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/botocore/credentials.py", line 1926, in fetch_creds
    full_uri, headers=headers
  File "/usr/local/lib/python3.7/dist-packages/botocore/utils.py", line 2862, in retrieve_full_uri
    return self._retrieve_credentials(full_url, headers)
  File "/usr/local/lib/python3.7/dist-packages/botocore/utils.py", line 2899, in _retrieve_credentials
    full_url, headers, self.TIMEOUT_SECONDS
  File "/usr/local/lib/python3.7/dist-packages/botocore/utils.py", line 2924, in _get_response
    % (response.status_code, response_text)

shimomut commented 1 year ago

@vinodjsr

Is the package "test_rtsp_camera_lab3" a camera package?

Could you check whether the package version numbers are consistent between the manifest file and the package? From the error message, it looks like you used packageVersion:2.0. If you are not sure, could you try 1.0?
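The consistency check could be sketched roughly as below. This assumes the manifest (`graph.json`) lists packages under `nodeGraph.packages` with names of the form `<account_id>::<package_name>`, and that each package's `package.json` carries `nodePackage.name` and `nodePackage.version`. The function name `find_version_mismatches` and the sample dictionaries are hypothetical, made up for illustration rather than taken from the sample's actual files.

```python
def find_version_mismatches(graph: dict, packages: list) -> list:
    """Return (name, manifest_version, package_version) for each mismatch."""
    # Versions declared by each local package.json, keyed by bare package name.
    local = {
        p["nodePackage"]["name"]: p["nodePackage"]["version"] for p in packages
    }
    mismatches = []
    for entry in graph["nodeGraph"]["packages"]:
        # Assumed manifest name layout: "<account_id>::<package_name>".
        name = entry["name"].split("::")[-1]
        if name in local and local[name] != entry["version"]:
            mismatches.append((name, entry["version"], local[name]))
    return mismatches


# Hypothetical inputs: the manifest asks for 2.0, the package declares 1.0.
graph = {
    "nodeGraph": {
        "packages": [
            {"name": "123456789012::test_rtsp_camera_lab3", "version": "2.0"},
        ]
    }
}
packages = [
    {"nodePackage": {"name": "test_rtsp_camera_lab3", "version": "1.0"}},
]

mismatches = find_version_mismatches(graph, packages)
print(mismatches)
```

In practice the two inputs would come from `json.load` on the application's `graph.json` and on each package's `package.json`. An empty result means every package named in the manifest has a matching local version.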

vinodjsr commented 1 year ago

Closing issue as the use case is abandoned. Thanks.