Closed by cfregly 4 months ago
In rare cases, logs are not pushed to CloudWatch when cluster creation fails. This can happen when a bad GPU or another health-check error occurs during cluster creation.
Retry the cluster creation and verify the logs are showing up in CloudWatch per the following link: https://catalog.workshops.aws/sagemaker-hyperpod/en-US/04-advanced/03-troubleshooting#logs
Contact SageMaker HyperPod support if cluster creation still fails.
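To check from a script whether any logs reached CloudWatch after a retry, something like the following minimal sketch may help. It assumes the HyperPod log group follows the pattern `/aws/sagemaker/Clusters/<cluster-name>/<cluster-id>`; confirm the exact name against the troubleshooting link above. The helper names `hyperpod_log_group` and `list_log_streams` are hypothetical, and the only AWS call used is the CloudWatch Logs `describe_log_streams` operation as exposed by boto3.

```python
def hyperpod_log_group(cluster_name, cluster_id):
    """Build the log group name for a HyperPod cluster.

    Assumption: the group is named /aws/sagemaker/Clusters/<name>/<id>.
    Verify this against the troubleshooting guide for your cluster.
    """
    return f"/aws/sagemaker/Clusters/{cluster_name}/{cluster_id}"


def list_log_streams(logs_client, log_group_name, limit=50):
    """Return the names of up to `limit` log streams in a log group.

    `logs_client` is expected to behave like boto3.client("logs").
    An empty list means the group exists but holds no streams, which
    matches the "CloudWatch logs are empty" symptom described here.
    """
    response = logs_client.describe_log_streams(
        logGroupName=log_group_name,
        limit=limit,  # CloudWatch Logs accepts at most 50 per call
    )
    return [stream["logStreamName"] for stream in response.get("logStreams", [])]


# Example usage (requires boto3 and AWS credentials; the cluster name
# and ID below are placeholders):
#   import boto3
#   group = hyperpod_log_group("my-cluster", "abc123")
#   print(list_log_streams(boto3.client("logs"), group))
```

If the log group does not exist at all, boto3 raises a `ResourceNotFoundException` instead of returning an empty list, so the two cases can be told apart.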
Cluster creation fails, but CloudWatch logs are empty for HyperPod.
We see a message to find error details in CloudWatch, but CloudWatch does not display any logs.