Closed by atheo89 3 weeks ago
Opened a fix PR on OCP CI to resolve the naming for ROCm runtimes on the 2024a build branch (https://github.com/openshift/release/pull/55441). Once it gets merged, I can fill in the correct image hashes in this PR.
This PR is ready for review
@atheo89: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
Test name | Commit | Details | Required | Rerun command |
---|---|---|---|---|
ci/prow/amd-runtimes-ubi9-e2e-tests | 3cf53fc7839c407b1d5e42f068b0ab3b0a9e1019 | link | true | /test amd-runtimes-ubi9-e2e-tests |
ci/prow/notebook-rocm-ubi9-python-3-9-pr-image-mirror | 3cf53fc7839c407b1d5e42f068b0ab3b0a9e1019 | link | true | /test notebook-rocm-ubi9-python-3-9-pr-image-mirror |
ci/prow/runtime-rocm-pytorch-ubi9-python-3-9-pr-image-mirror | 3cf53fc7839c407b1d5e42f068b0ab3b0a9e1019 | link | true | /test runtime-rocm-pytorch-ubi9-python-3-9-pr-image-mirror |
ci/prow/runtime-rocm-tensorflow-ubi9-python-3-9-pr-image-mirror | 3cf53fc7839c407b1d5e42f068b0ab3b0a9e1019 | link | true | /test runtime-rocm-tensorflow-ubi9-python-3-9-pr-image-mirror |
ci/prow/rocm-runtimes-ubi9-e2e-tests | 3cf53fc7839c407b1d5e42f068b0ab3b0a9e1019 | link | true | /test rocm-runtimes-ubi9-e2e-tests |
ci/prow/runtimes-ubi8-e2e-tests | 3cf53fc7839c407b1d5e42f068b0ab3b0a9e1019 | link | true | /test runtimes-ubi8-e2e-tests |
ci/prow/runtimes-ubi9-e2e-tests | 3cf53fc7839c407b1d5e42f068b0ab3b0a9e1019 | link | true | /test runtimes-ubi9-e2e-tests |
Habana seems to be having problems
Installing collected packages: typing-extensions, triton, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, nvidia-cusparse-cu12, nvidia-cudnn-cu12, lightning-utilities, nvidia-cusolver-cu12, lightning-habana, torch, torchmetrics, pytorch-lightning, lightning
Attempting uninstall: typing-extensions
Found existing installation: typing_extensions 4.5.0
Uninstalling typing_extensions-4.5.0:
Successfully uninstalled typing_extensions-4.5.0
Attempting uninstall: torch
Found existing installation: torch 2.1.0a0+gitf8b6084
Uninstalling torch-2.1.0a0+gitf8b6084:
Successfully uninstalled torch-2.1.0a0+gitf8b6084
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow-cpu 2.12.1 requires tensorboard<2.13,>=2.12, but you have tensorboard 2.11.2 which is incompatible.
tensorflow-cpu 2.12.1 requires typing-extensions<4.6.0,>=3.6.6, but you have typing-extensions 4.12.2 which is incompatible.
kfp 2.7.0 requires protobuf<5,>=4.21.1, but you have protobuf 3.20.3 which is incompatible.
kfp-kubernetes 1.2.0 requires protobuf<5,>=4.21.1, but you have protobuf 3.20.3 which is incompatible.
I know that's not caused by this change, but it's a problem nonetheless https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/opendatahub-io_notebooks/628/pull-ci-opendatahub-io-notebooks-main-images/1822987134265462784
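For triage, the conflict lines in a log like the one above can be pulled out mechanically. The following is a minimal sketch, not part of this PR: it assumes pip keeps the "X requires Y, but you have Z which is incompatible." wording shown in the log, and the names `CONFLICT_RE` and `parse_conflicts` are hypothetical helpers introduced only for illustration.

```python
import re

# Assumption: pip reports each conflict on a single line in the form
# "<pkg> <ver> requires <spec>, but you have <name> <ver> which is incompatible."
CONFLICT_RE = re.compile(
    r"^(?P<package>\S+) (?P<version>\S+) requires (?P<requirement>.+), "
    r"but you have (?P<installed_name>\S+) (?P<installed_version>\S+) "
    r"which is incompatible\.$"
)


def parse_conflicts(log_text):
    """Return one dict per dependency-conflict line found in a pip log."""
    conflicts = []
    for line in log_text.splitlines():
        match = CONFLICT_RE.match(line.strip())
        if match:
            conflicts.append(match.groupdict())
    return conflicts


# The four conflict lines copied from the build log quoted in this thread.
LOG = """\
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed.
tensorflow-cpu 2.12.1 requires tensorboard<2.13,>=2.12, but you have tensorboard 2.11.2 which is incompatible.
tensorflow-cpu 2.12.1 requires typing-extensions<4.6.0,>=3.6.6, but you have typing-extensions 4.12.2 which is incompatible.
kfp 2.7.0 requires protobuf<5,>=4.21.1, but you have protobuf 3.20.3 which is incompatible.
kfp-kubernetes 1.2.0 requires protobuf<5,>=4.21.1, but you have protobuf 3.20.3 which is incompatible.
"""

conflicts = parse_conflicts(LOG)
for c in conflicts:
    # Print who needs what, and what is actually installed.
    print(
        f"{c['package']} {c['version']} needs {c['requirement']}; "
        f"installed: {c['installed_name']} {c['installed_version']}"
    )
```

Grouping the result by `installed_name` would show that both `kfp` and `kfp-kubernetes` are blocked by the same `protobuf` pin.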
/lgtm
the images as displayed on quay.io look to be the correct ones
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: harshad16
/override ci/prow/images
/override ci/prow/notebooks-ubi9-e2e-tests
/override ci/prow/rocm-notebooks-e2e-tests
@harshad16: Overrode contexts on behalf of harshad16: ci/prow/images, ci/prow/notebooks-ubi9-e2e-tests, ci/prow/rocm-notebooks-e2e-tests
Related to: https://issues.redhat.com/browse/RHOAIENG-9680
Depends on: https://github.com/openshift/release/pull/54567
Description
Add ROCm runtimes to the runtime-images folder.
Merge criteria: