aws-samples / awsome-distributed-training

Collection of best practices, reference architectures, model training examples and utilities to train large models on AWS.
MIT No Attribution

Change from Dockerhub to Public ECR #335

Closed by sean-smith 4 months ago

sean-smith commented 4 months ago

The following line references an image hosted on Docker Hub, so pulls can be rate-limited by Docker Hub:

https://github.com/aws-samples/awsome-distributed-training/blob/1d15afd847f7810125c60353a09a1757188ba7aa/1.architectures/5.sagemaker-hyperpod/LifecycleScripts/base-config/utils/install_efa_node_exporter.sh#L19

We should switch this line to pull the image from Public ECR instead.

mhuguesaws commented 4 months ago

public.ecr.aws/docker/library/ubuntu:20.04
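As a rough illustration of the mapping suggested above, the sketch below rewrites a Docker Hub official-image reference (for example `ubuntu:20.04`) to its mirror under `public.ecr.aws/docker/library/`. The helper name `to_public_ecr` is hypothetical and is not part of the repository's script. It assumes that only Docker official images (names without a namespace) are mirrored under that prefix, and it leaves every other reference unchanged.

```shell
#!/bin/sh
# Hypothetical helper (not from the repository): map a Docker Hub
# official-image reference to the Public ECR mirror path.
# Assumption: only official images (no namespace, e.g. "ubuntu:20.04")
# are rewritten; anything else is returned unchanged.
to_public_ecr() {
  _img="$1"
  # Strip the optional explicit Docker Hub registry and "library/" namespace.
  _img="${_img#docker.io/}"
  _img="${_img#library/}"
  case "$_img" in
    */*)
      # Still contains a slash: a namespaced image or another registry.
      printf '%s\n' "$1"
      ;;
    *)
      printf 'public.ecr.aws/docker/library/%s\n' "$_img"
      ;;
  esac
}
```

A call such as `docker pull "$(to_public_ecr ubuntu:20.04)"` would then request the image from Public ECR rather than Docker Hub. In a Dockerfile the equivalent change is made by hand, by editing the `FROM` line to use the full `public.ecr.aws/docker/library/...` path.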