Closed huydhn closed 1 month ago
Closing in favor of https://github.com/pytorch/pytorch/pull/142109
After chatting with @malfet, let's try this one instead, because https://github.com/pytorch/pytorch/pull/142109#pullrequestreview-2482524025 adds a few more minutes to the workflow TTS
Hi @huydhn, I noticed some failures in the calculate-docker-image step of the xpu CI test jobs, for example https://github.com/pytorch/pytorch/actions/runs/12198235184/job/34036392093?pr=140664#step:6:160. I suspect these failures are related to the changes in this PR. Could you please double-check?
Another issue is that the build job seems to take longer than before: https://github.com/pytorch/pytorch/actions/runs/12198235184/job/34029552956?pr=140664#step:7:1. Is that expected?
@chuanqi129 Thank you for the fix in https://github.com/pytorch/pytorch/pull/142298! It's the correct fix. The failure you saw actually highlights a problem that was hidden before. Unless a new Docker image is added to the docker build workflow, the image is rebuilt in every build and test job that depends on it, which is a huge waste of time.
I'll take an action item to write a linter check so that adding a new Docker image requires a corresponding update to the docker build workflow.
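For illustration, a linter check along these lines could compare the images that workflows reference against the images the docker build workflow knows about. This is only a rough sketch, not the actual linter: the `docker-image-name:` key format, and the helper names `images_referenced` and `unregistered_images`, are assumptions made for the example.

```python
import re

# Assumption: workflows reference an image on a line of the form
# "docker-image-name: <name>". The real workflow layout may differ.
IMAGE_KEY = re.compile(
    r"^[ \t]*docker-image-name:[ \t]*([\w.\-]+)[ \t]*$", re.MULTILINE
)


def images_referenced(workflow_text: str) -> set[str]:
    """Collect every image name referenced in one workflow file's text."""
    return set(IMAGE_KEY.findall(workflow_text))


def unregistered_images(used: set[str], built: set[str]) -> list[str]:
    """Return images that jobs use but the docker build workflow does not build."""
    return sorted(set(used) - set(built))
```

A lint step would fail whenever `unregistered_images(...)` returns a non-empty list, which would have caught the missing xpu image before it caused per-job rebuilds.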
This is a short-term mitigation for https://github.com/pytorch/pytorch/issues/141885 in which any changes touching `.ci/docker` would cause all the builds to fail until docker build workflow finishes building the images.

At the moment, we don't have a good way to tell the build workflow to wait for the new docker image, so my fix here attempts to inject a delay when the action is called by `_linux_build`. It will wait up to 90 minutes for the Docker build to finish.

Testing
https://github.com/pytorch/pytorch/pull/142177
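The wait-with-timeout behavior described above can be sketched as a simple polling loop. This is a minimal illustration under stated assumptions, not the code in this PR: `wait_for_image`, the `image_exists` callback, and the 60-second polling interval are all hypothetical, and only the 90-minute upper bound comes from the description.

```python
import time
from typing import Callable


def wait_for_image(
    image_exists: Callable[[], bool],  # hypothetical check, e.g. a registry lookup
    timeout_s: float = 90 * 60,        # the 90-minute bound from the description
    interval_s: float = 60,            # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the image exists or the timeout elapses.

    Returns True if the image appeared in time, False otherwise.
    """
    deadline = clock() + timeout_s
    while True:
        if image_exists():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(interval_s, remaining))
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting; a caller in CI would leave the defaults and treat a `False` result as a build failure.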