NVIDIA / DeepLearningExamples

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

[DLRM/PyTorch] repository name (library/image-machine-DGX-A100) must be lowercase #1379

Open fuhailin opened 4 months ago

fuhailin commented 4 months ago

Related to Model/Framework(s): DLRM/PyTorch (the DeepLearningExamples/PyTorch/Recommendation/DLRM folder)

Describe the bug: I am following the tutorial to generate the Criteo dataset. When I build the Docker image from the Dockerfile_preprocessing file, the build fails with an "invalid reference format ... must be lowercase" error.

To Reproduce:

  1. git clone https://github.com/NVIDIA/DeepLearningExamples.git
  2. cd DeepLearningExamples/PyTorch/Recommendation/DLRM
  3. docker build -t nvidia_dlrm_preprocessing -f Dockerfile_preprocessing . --build-arg DGX_VERSION=DGX-A100

Expected behavior: The image should build successfully. Instead, the build stops with the following output:

# docker build -t nvidia_dlrm_preprocessing -f Dockerfile_preprocessing . --build-arg DGX_VERSION=DGX-A100
[+] Building 0.1s (1/1) FINISHED                                  docker:default
 => [internal] load build definition from Dockerfile_preprocessing         0.0s
 => => transferring dockerfile: 3.07kB                                     0.0s
Dockerfile_preprocessing:75
--------------------
  73 |     ENV NUMBER_OF_GPUS 8
  74 |
  75 | >>> FROM image-machine-${DGX_VERSION} AS final
  76 |     RUN echo "spark.worker.resource.gpu.amount    ${NUMBER_OF_GPUS}" >> /opt/spark/conf/spark-defaults.conf
  77 |
--------------------
ERROR: failed to solve: failed to parse stage name "image-machine-DGX-A100": invalid reference format: repository name (library/image-machine-DGX-A100) must be lowercase
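
The name in the error is what line 75 produces once `${DGX_VERSION}` expands to `DGX-A100`. A possible workaround, which I have not verified, is to build from a copy of the Dockerfile in which the stage names are lowercase and to pass a lowercase build argument. The sketch below assumes the earlier stages are declared with names of the form `image-machine-<VERSION>` (for example `AS image-machine-DGX-A100`); the helper name `lowercase_stage_names` and the output file name are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print a copy of a Dockerfile with every literal
# "image-machine-<NAME>" token lowercased.
# Tokens that continue with a variable, such as image-machine-${DGX_VERSION},
# are left untouched because "$" is not part of the matched character class.
lowercase_stage_names() {
  awk '{
    line = $0
    out = ""
    # Walk the line, lowercasing each matching token in turn.
    while (match(line, /image-machine-[A-Za-z0-9_-]+/)) {
      out = out substr(line, 1, RSTART - 1) tolower(substr(line, RSTART, RLENGTH))
      line = substr(line, RSTART + RLENGTH)
    }
    print out line
  }' "$1"
}

# Intended usage (assumption: not verified against the real Dockerfile):
#   lowercase_stage_names Dockerfile_preprocessing > Dockerfile_preprocessing.lower
#   docker build -t nvidia_dlrm_preprocessing -f Dockerfile_preprocessing.lower . \
#     --build-arg DGX_VERSION=dgx-a100
```

The original file is left unmodified, so the change is easy to discard. Whether lowercase stage names are enough to avoid the parse error on Docker 25.0.0 is an assumption based only on the wording of the message above.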

Environment (output of docker version, server section):

Server: Docker Engine - Community
 Engine:
  Version:          25.0.0
  API version:      1.44 (minimum version 1.24)
  Go version:       go1.21.6
  Git commit:       615dfdf
  Built:            Thu Jan 18 17:10:01 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.27
  GitCommit:        a1496014c916f9e62104b33d1bb5bd03b0858e59
 nvidia:
  Version:          1.1.11
  GitCommit:        v1.1.11-0-g4bccb38
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0