Open swolchok opened 11 hours ago
Inspection of PRs landed between the last good build and first bad build suggested the following:
Trial revert of #6837 in #7013 still failed the job; trialing revert of the other two PRs together
trial revert of #6522 in https://github.com/pytorch/executorch/pull/7020 did not fix the job
trial revert of #6892 in #7021 did not fix the job.
I am also not able to repro this locally, and I've inspected git diff 8526d0a2d798658b6a6e3a42ec935b8093f355ef..04f6fcd4b3920eaf1be9905d12b449f301f89ca7
without finding anything else suspicious, so I wonder if the runners broke somehow
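For anyone who wants to repeat the inspection, here is a minimal sketch of listing the commits in that range from a local checkout. It assumes the first hash is the last good commit and the second is the first bad one (the comment above does not say which is which), and the helper names `range_log_cmd` and `commits_in_range` are made up for illustration.

```python
import subprocess

# Hashes copied from the comment above; the good/bad labels are an assumption.
GOOD = "8526d0a2d798658b6a6e3a42ec935b8093f355ef"
BAD = "04f6fcd4b3920eaf1be9905d12b449f301f89ca7"


def range_log_cmd(good: str, bad: str) -> list[str]:
    """Build a `git log` command for commits reachable from `bad` but not from `good`."""
    return ["git", "log", "--oneline", "--no-merges", f"{good}..{bad}"]


def commits_in_range(repo_dir: str, good: str, bad: str) -> list[str]:
    """Run the command inside an existing checkout and return one line per commit."""
    result = subprocess.run(
        range_log_cmd(good, bad),
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()
```

Calling `commits_in_range("path/to/executorch", GOOD, BAD)` in a checkout that contains both commits should give the candidate list to trial-revert one by one.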
I reran the last good workflow run; builds succeeded (there were some failures due to an unrelated issue).
Found a failure with the same error message in a different job (test-llama-runner-mac): https://github.com/pytorch/executorch/actions/runs/11959891658/job/33342737621?pr=7010
that job is green on trunk runs though! https://hud.pytorch.org/hud/pytorch/executorch/main/1?per_page=50&name_filter=llama-runner-mac%20(fp32%2C%20mps
I'm late to this, so not sure my comments will help, but was there any change related to an xnnpack upgrade? The job failure looks xnnpack-related.
@larryliu0820 suggested maybe the runner toolchain changed.
It looks like we're using macos-m1-stable runners for test-llama-runner-mac: https://github.com/pytorch/executorch/blob/main/.github/workflows/trunk.yml#L236. Not sure what runner the wheel build uses.
I don't know a whole lot about this runner type, but 1) it seems to be in-house: https://github.com/pytorch/pytorch/issues/127490 and 2) I don't see recent activity in https://github.com/pytorch-labs/pytorch-gha-infra/ suggesting that there was a recent update.
> any change related to xnnpack upgrade
as I mentioned above, I inspected all the commits (there aren't many) in the range of commit hashes flagged in the nightly builds.
An example of trunk job passing:
https://github.com/pytorch/executorch/actions/runs/11962683652/job/33351640398
An example of PR job failing:
https://github.com/pytorch/executorch/actions/runs/11959891658/job/33342745520?pr=7010
I don't see any obvious difference between these two in the environment setup.
@huydhn anything obvious to you?
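One way to make the comparison of the two jobs less eyeball-dependent is to diff their raw logs after stripping timestamps. This is only a sketch: it assumes both logs were downloaded as text, and that each line starts with the ISO-8601 timestamp that GitHub Actions raw logs normally carry. The names `normalize` and `diff_logs` are hypothetical.

```python
import difflib
import re

# Leading timestamp such as "2024-11-21T18:03:12.1234567Z " (assumed log format).
_TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z ")


def normalize(log_text: str) -> list[str]:
    """Drop per-line timestamps and trailing whitespace so only content is compared."""
    return [_TIMESTAMP.sub("", line).rstrip() for line in log_text.splitlines()]


def diff_logs(passing: str, failing: str) -> list[str]:
    """Return a unified diff (no context lines) between two normalized logs."""
    return list(
        difflib.unified_diff(
            normalize(passing),
            normalize(failing),
            fromfile="trunk-pass",
            tofile="pr-fail",
            lineterm="",
            n=0,
        )
    )
```

Feeding it only the environment-setup portion of each log (toolchain versions, runner image, installed packages) keeps the diff short; anything that differs there, such as a compiler or linker version, would be a lead for the runner-toolchain theory.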
🐛 Describe the bug
Status page: https://github.com/pytorch/executorch/actions/workflows/build-wheels-m1.yml Note that the Python 3.9 build always fails, so even though the runs are red, they were successful through 2024-11-18.
Linking is failing with
ld: invalid use of ADRP in '_init_f32_vcopysign_config' to '_xnn_f32_vcopysign_ukernel__neon_u8'
Versions
N/A