Open atalman opened 5 months ago
As a continuation of the work in: https://github.com/pytorch/builder/issues/1432
The following images need to be validated with this option ON. This is only applicable to the Release channel:
pytorch/pytorch:2.3.0-cuda12.1-cudnn8-runtime
pytorch/pytorch:2.3.0-cuda11.8-cudnn8-runtime
pytorch/pytorch:2.3.0-cuda12.1-cudnn8-devel
pytorch/pytorch:2.3.0-cuda11.8-cudnn8-devel
Instead of the ghcr.io images:
ghcr.io/pytorch/pytorch:2.3.0-cuda12.1-cudnn8-runtime
ghcr.io/pytorch/pytorch:2.3.0-cuda11.8-cudnn8-runtime
ghcr.io/pytorch/pytorch:2.3.0-cuda12.1-cudnn8-devel
ghcr.io/pytorch/pytorch:2.3.0-cuda11.8-cudnn8-devel
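To illustrate the difference between the two image sets, here is a minimal sketch of how the image reference could be derived from a tag. The function name `image_ref` and the `use_ghcr` flag are hypothetical and are not taken from the existing scripts; only the repository name and tags come from the lists above.

```python
# Tags of the release images listed in this issue.
RELEASE_TAGS = [
    "2.3.0-cuda12.1-cudnn8-runtime",
    "2.3.0-cuda11.8-cudnn8-runtime",
    "2.3.0-cuda12.1-cudnn8-devel",
    "2.3.0-cuda11.8-cudnn8-devel",
]


def image_ref(tag: str, use_ghcr: bool) -> str:
    """Build the image reference for a tag (hypothetical helper).

    With use_ghcr=True the ghcr.io-prefixed name is returned; otherwise
    the unprefixed pytorch/pytorch name is returned.
    """
    prefix = "ghcr.io/" if use_ghcr else ""
    return f"{prefix}pytorch/pytorch:{tag}"


if __name__ == "__main__":
    for tag in RELEASE_TAGS:
        print(image_ref(tag, use_ghcr=False))
```

Keeping the tag list in one place and switching only the prefix means both image sets are validated against the same tags.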
cc @juliagmt-google
[ ] Add an option to validate arm64 builds using the linux.arm64.2xlarge runner (only applicable to ghcr): https://github.com/pytorch/test-infra/blob/5c893b3135350b1d5ead58b2cc8bd0a44deb414a/tools/scripts/generate_binary_build_matrix.py#L79 Add an option to pass the validation runner from https://github.com/pytorch/test-infra/blob/main/tools/scripts/generate_docker_release_matrix.py to the docker release validation workflow. This means pytorch/test-infra@main/tools/scripts/generate_docker_release_matrix.py will need to be modified to add validation_runner.
[x] Fix/Review logic. Arm64 builds should not contain CUDA in the name, see this comment: https://github.com/pytorch/pytorch/issues/125094#issuecomment-2083165165
This is needed to detect issues similar to: https://github.com/pytorch/pytorch/issues/125094
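As a rough sketch of the first task, the matrix generator could attach a `validation_runner` value to each entry. Everything here is an assumption made for illustration: the function name `add_validation_runner`, the `platform` field, the `use_ghcr` flag, and the shape of the matrix entries are hypothetical and may not match what generate_docker_release_matrix.py actually emits. Only the runner label `linux.arm64.2xlarge` and the key name `validation_runner` come from this issue.

```python
from typing import Dict, List

# Runner label named in this issue for arm64 validation.
ARM64_RUNNER = "linux.arm64.2xlarge"


def add_validation_runner(
    entries: List[Dict[str, str]],
    default_runner: str,
    use_ghcr: bool,
) -> List[Dict[str, str]]:
    """Return copies of the matrix entries with `validation_runner` set.

    Assumption: each entry carries a hypothetical `platform` field such as
    "linux/arm64". arm64 entries get the arm64 runner only when validating
    ghcr images; every other entry gets the caller-supplied default runner.
    """
    result = []
    for entry in entries:
        is_arm64 = entry.get("platform") == "linux/arm64"
        runner = ARM64_RUNNER if (use_ghcr and is_arm64) else default_runner
        # Copy the entry so the input matrix is left unmodified.
        result.append({**entry, "validation_runner": runner})
    return result
```

The validation workflow could then read `validation_runner` from each matrix entry instead of hard-coding a runner, which keeps the runner choice in the matrix script.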
Docker release matrix: https://github.com/pytorch/test-infra/blob/main/.github/workflows/generate_docker_release_matrix.yml
Python script: https://github.com/pytorch/test-infra/blob/main/tools/scripts/generate_docker_release_matrix.py
Validate docker image: https://github.com/pytorch/builder/blob/main/.github/workflows/validate_docker_images.yml
Landed https://github.com/pytorch/pytorch/pull/125617 to fix issue 3, and synchronized it with the test-infra script: https://github.com/pytorch/test-infra/pull/5184