Closed Liam-Sturge closed 9 months ago
Huh, that definitely wasn't intended with that PR. The large number of commits was due to rebasing onto main, where I had expected that only the last commit on the rebased branch would be built, as the intermediate commits don't require testing.
This PR is also one that impacts packed_func.h, which is included by practically everything in TVM. So, not only is it re-building TVM once for each rebased commit, but they're also full rebuilds that can't benefit from ccache.
Short-term, I've stopped all tasks related to #16183, and the ARM queue is recovering. All remaining tasks are related to other PRs, as it works through the backlog.
Long-term, it looks like there's a Jenkins option, disableConcurrentBuilds(abortPrevious: true) (stack overflow link, GH link), that we should enable. If there are two concurrent builds for the same PR, it would cancel the previous one.
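For illustration, a minimal sketch of where that option could go in a declarative Jenkinsfile. This is not TVM's actual CI configuration: the stage name, the agent label, and the shell command are placeholders, and a scripted or generated pipeline would need the equivalent setting in its own form.

```groovy
// Hypothetical minimal declarative pipeline; only the options block matters here.
pipeline {
    agent none
    options {
        // When a newer build of the same job starts (in a multibranch setup,
        // each PR is its own job), abort the build that is still running
        // instead of letting both occupy executors.
        disableConcurrentBuilds(abortPrevious: true)
    }
    stages {
        stage('Build') {
            agent { label 'ARM' } // placeholder agent label
            steps {
                sh 'echo "build and test steps go here"' // placeholder command
            }
        }
    }
}
```

With this in place, pushing a rebased branch should leave at most one running build per PR, since each newly started build cancels its predecessor.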
Hi @Lunderberg, I too would have expected that only the last commit on the rebased branch would have been built. Really seems odd that this isn't the default behaviour here.
Thanks for looking into it and getting the queue moving again. I agree that setting the Jenkins option disableConcurrentBuilds(abortPrevious: true) sounds like a sensible idea. For now, I am happy to close this issue as resolved.
This issue also seems to have occurred on #16425
Jobs that require an arm agent are struggling to find a Graviton-3 executor in Jenkins. There are hundreds of builds queued for ci-arm, and some of the queued builds are stuck. This appears to have started on 8th February.
Branch/PR Failing
https://github.com/apache/tvm/pull/16183 may have caused the backlog of jobs. A large number of commits were recently pushed to this PR at once, and a build was scheduled for each commit. I have attached a graph of the agent queue and allocation, which shows a big spike in queued jobs.
Jenkins Link
Build logs on these jobs indicate that some of the builds are stuck. They have ended with an unfinished status, but remain in the queue. See this console log as one example of many:
https://ci.tlcpack.ai/job/tvm-arm/job/PR-16183/495/console
Triage
CC @lhutton1 @konturn @tqchen @yongwww