bazelbuild / continuous-integration

Bazel's Continuous Integration Setup

Upgrade macOS version and Xcode version for Mac machines #1431

Open meteorcloudy opened 1 year ago

meteorcloudy commented 1 year ago

Failing test on macOS 12.5.1: https://buildkite.com/bazel-testing/bazel-bazel/builds/8481#018330f9-71a1-4bc5-a8be-942f5c5ac3f9

BalestraPatrick commented 1 year ago

@meteorcloudy Is there any way we can install Xcode 14? It would unblock some tests in rules_apple. https://github.com/bazelbuild/continuous-integration/issues/1348#issuecomment-1408531680

fweikert commented 1 year ago

I'm looking into it

fweikert commented 1 year ago

I'm in the process of installing both 13.4.1 and 14.2. However, some of the machines require an OS update, too. I hope that the process will be finished by tomorrow EOD.

BalestraPatrick commented 1 year ago

Hey @fweikert! Thanks for doing this. I assume the rollout has been completed but the default version hasn't been changed, so we need to add an override?

I also noticed some failures starting Tuesday last week in our scheduled builds on our master branch: https://buildkite.com/bazel/rules-apple-darwin/builds?branch=master

It looks like the main issue is that the tvOS SDK isn't installed anymore. We should probably make sure that every machine has all SDKs installed (for example with xcodebuild -downloadAllPlatforms). The error I'm seeing is: The operation couldn’t be completed. Failed to locate any simulator runtime matching options: { "com.apple.platform.appletvsimulator"

BalestraPatrick commented 1 year ago

We also realized that we have an "Activating Xcode 13.0..." step which didn't fail, but subsequently we see DEBUG: /private/var/tmp/_bazel_buildkite/e6a8ceab815440f7ffea5ed855e0655d/external/bazel_tools/tools/osx/xcode_configure.bzl:243:14: No default Xcode version is set with 'xcode-select'; picking ':version14_2_0_14C18' in the log. This is likely because Xcode 13 can't run on macOS Ventura.

fweikert commented 1 year ago

Hey, yeah, unfortunately the infra update broke some of our tests (https://github.com/bazelbuild/bazel/labels/macos-infra-update).

Technically our CI should now default to Xcode 14.2 on Ventura, but apparently there is a bug that I still need to find. Can you explicitly specify 14.2 in your test config?

BalestraPatrick commented 1 year ago

I guess we could (provided the tvOS simulator runtimes are installed for that specific version), but not sure what other stuff will break in our repos.

brentleyjones commented 1 year ago

Is there no way to specify that we want to run on Monterey? Or is the whole fleet assumed to be Xcode 14.2/Ventura now?

fweikert commented 1 year ago

It's a bit tricky right now - all the iMacs run Ventura (platform macos), while all the MacStudios still run Monterey (macos_arm64). Our Monterey machines have Xcode versions 13.0, 13.4.1 and 14.2 installed.

fweikert commented 1 year ago

To elaborate a bit on the background: Right now CI fleet management is quite painful since we have to manually maintain 40 Macs of different generations (trash can Mac Pros, iMac Pros, MacStudios).

We're working on virtualizing the fleet, which would allow us to offer multiple different OS versions at the same time.

BalestraPatrick commented 1 year ago

We definitely have tests failing on arm64, so switching directly to Mac Studios might be tricky in the short-term. I'm seeing messages related to the simulator runtimes being a bit off on those machines as well:

Failed to find a suitable device for the type IBSimDeviceTypeiPad2x (com.apple.dt.Xcode.IBSimDeviceType.iPad-2x) with runtime iOS 15.0 (15.0 - 19A339) - com.apple.CoreSimulator.SimRuntime.iOS-15-0 (Failure reason: Failed to create SimDeviceSet at path /Users/buildkite/Library/Developer/Xcode/UserData/IB Support/Simulator Devices. You'll want to check the logs in ~/Library/Logs/CoreSimulator to see why creating the SimDeviceSet failed.): Failed to initialize simulator device set. (Failure reason: Failed to subscribe to notifications from CoreSimulatorService.): Unable to determine SimDeviceSet, set_path=(null): Failed to initialize simulator device set. (Failure reason: Allocation or initialization failed.)

I think the best path forward would be to make sure the simulator runtimes and SDKs are installed properly on all machines, and then we can see which option gets us back to green asap (I'm thinking Xcode 14.0 running on Ventura at the moment).

meteorcloudy commented 1 year ago

making sure the simulator runtimes and SDKs are installed properly on all machines

I guess neither Florian nor I am very familiar with this. Can you give some instructions on how to do this?

@fweikert Can you try running xcrun simctl list and check the output?

BalestraPatrick commented 1 year ago

I think following this guide should be a good start: https://developer.apple.com/documentation/xcode/installing-additional-simulator-runtimes

Running xcodebuild -downloadAllPlatforms should at least provide all the SDKs that we need (watchOS, tvOS) that are optional in Xcode 14 and not bundled with Xcode anymore by default.

fweikert commented 1 year ago

I ran xcodebuild -downloadAllPlatforms on all iMacs. I can see that we got tvOS 16.1 and watchOS 9.1:

% xcodebuild -showsdks               
DriverKit SDKs:
    DriverKit 22.2                  -sdk driverkit22.2

iOS SDKs:
    iOS 16.2                        -sdk iphoneos16.2

iOS Simulator SDKs:
    Simulator - iOS 16.2            -sdk iphonesimulator16.2

macOS SDKs:
    macOS 13.1                      -sdk macosx13.1
    macOS 13.1                      -sdk macosx13.1

tvOS SDKs:
    tvOS 16.1                       -sdk appletvos16.1

tvOS Simulator SDKs:
    Simulator - tvOS 16.1           -sdk appletvsimulator16.1

watchOS SDKs:
    watchOS 9.1                     -sdk watchos9.1

watchOS Simulator SDKs:
    Simulator - watchOS 9.1         -sdk watchsimulator9.1

However, xcrun simctl list still shows many "Unavailable" entries such as the aforementioned com.apple.CoreSimulator.SimRuntime.iOS-15-0.
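A minimal sketch (not part of the original thread) of how the `xcodebuild -showsdks` check could be automated instead of read by eye. It only parses text in the format pasted above; the function names `parse_showsdks` and `missing_platforms` and the idea of a "required" platform set are assumptions made for illustration.

```python
import re


def parse_showsdks(output: str) -> dict[str, set[str]]:
    """Map each '-sdk' platform prefix (e.g. 'appletvos') to the versions listed."""
    sdks: dict[str, set[str]] = {}
    for line in output.splitlines():
        # SDK rows end with '-sdk <platform><version>', e.g. '-sdk appletvos16.1'.
        match = re.search(r"-sdk\s+([a-z]+)(\d+(?:\.\d+)*)\s*$", line)
        if match:
            sdks.setdefault(match.group(1), set()).add(match.group(2))
    return sdks


def missing_platforms(output: str, required: set[str]) -> set[str]:
    """Return the required platform prefixes that do not appear in the output."""
    return required - set(parse_showsdks(output))
```

Feeding it the captured output of `xcodebuild -showsdks` together with a set such as `{"appletvos", "watchos"}` would flag a machine where one of those SDKs is absent. Like the listing above, it says nothing about simulator runtimes, which are reported by `xcrun simctl` instead.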

BalestraPatrick commented 1 year ago

I think that's good enough, we might not need other runtimes that are not bundled with Xcode by default (I don't think we run tests on lower iOS versions in our repos). Looks like I was able to get a rules_apple PR that switches us to Xcode 14.2 on the iMacs green now:

I only noticed one issue on bk-imacpro-6 (which I didn't see on other iMacs running the same build):

    Failure Reason: Failed to spawn AssetCatalogSimulatorAgent on Apple Watch Series 7 (45mm) (9BAC5919-45D6-4DDD-AB6A-AAAF73F5FB43, (null), Shutdown)

It looks like the "Apple Watch Series 7 (45mm)" simulator is not correctly created. If you run xcrun simctl list and you see that device on that machine, maybe it was just a temporary error. Otherwise something might be different on that machine vs the other iMacs (for example this build on bk-imacpro-19 was successful).

fweikert commented 1 year ago

I can see Apple Watch Series 7 (45mm) (com.apple.CoreSimulator.SimDeviceType.Apple-Watch-Series-7-45mm) under "Devices" on bk-imacpro-6. However, the watchOS 9.1 runtime contains an entry with a different value: Apple Watch Series 7 (45mm) (7FF9613B-A581-4703-B2D9-DF589ACAB733) (Shutdown)

BalestraPatrick commented 1 year ago

@fweikert I'm still seeing issues with the Apple Watch simulators. If I run xcrun simctl list devices available on a build agent, I only get iOS devices:

== Devices ==
-- iOS 16.2 --
    iPhone 14 (438498C7-2E9A-4A4A-9CC7-5DED49FAC8C4) (Shutdown)
    iPhone 14 Plus (BB68AFF0-037F-4B23-9A0B-2C5DB99D743A) (Shutdown)
    iPhone 14 Pro Max (94EE6ADE-C7B7-4E76-9DFD-78B8B9229449) (Shutdown)
    iPad Air (5th generation) (710C70AD-5DE4-4A57-846F-132519AB808D) (Shutdown)
    iPad Pro (11-inch) (4th generation) (9F4F019B-BBDA-4C8D-84B0-73649107A41F) (Shutdown)
-- tvOS 16.1 --
-- watchOS 9.1 --

Maybe since the watchOS and tvOS simulators were installed post-Xcode install, we need to create some defaults manually?
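As a rough illustration (a sketch, not something used on the CI machines), the "runtime is listed but has no devices" situation shown above can be detected by grouping the lines of `xcrun simctl list devices available` under their `-- <runtime> --` headings. The helper names are hypothetical, and the parser assumes the plain-text layout pasted in this thread.

```python
def devices_by_runtime(output: str) -> dict[str, list[str]]:
    """Group device lines under the '-- <runtime> --' heading they follow."""
    runtimes: dict[str, list[str]] = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("-- ") and line.endswith(" --"):
            name = line[3:-3]
            if name.startswith("Unavailable:"):
                # Headings for runtimes that are not installed are skipped.
                current = None
            else:
                current = name
                runtimes[current] = []
        elif line and not line.startswith("==") and current is not None:
            runtimes[current].append(line)
    return runtimes


def empty_runtimes(output: str) -> list[str]:
    """Return the runtimes that are listed without any device below them."""
    return [name for name, devs in devices_by_runtime(output).items() if not devs]
```

For the listing above, `empty_runtimes` would return `['tvOS 16.1', 'watchOS 9.1']`, which matches the observation that only iOS devices are present.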

fweikert commented 1 year ago

Which machine was that?

BalestraPatrick commented 1 year ago

My latest build is here. Reproduced it on imacpro-19, imacpro-5 and imacpro-16.

fweikert commented 1 year ago

https://buildkite.com/bazel/rules-apple-darwin/builds/6669#01864a45-7a3e-42d7-befe-dd6cbb024800 on imacpro-5 fails with Failed to spawn AssetCatalogSimulatorAgent on Apple Watch Series 7 (45mm) (9C9DE839-E2E7-4734-893F-1ED67D94A37C, (null), Shutdown)

However, when I ssh into imacpro-5 it looks ok:

== Devices ==
-- iOS 16.2 --
    iPhone SE (3rd generation) (23077071-4176-435F-98BA-39F5499042B0) (Shutdown) 
    iPhone 14 (37FA8814-1FD8-480A-8F58-33C4617A73F1) (Shutdown) 
    iPhone 14 Plus (428DB8C2-202C-49AA-8608-4D27D7CF17A3) (Shutdown) 
    iPhone 14 Pro (5CB291A7-2186-4228-AA88-4CFD5A1644F9) (Shutdown) 
    iPhone 14 Pro Max (1711DFB0-CE56-4BE9-A339-C8FD18A0D2E0) (Shutdown) 
    iPad Air (5th generation) (67AADEA6-205E-4D90-B7C3-AC0ED2BAB359) (Shutdown) 
    iPad (10th generation) (AD32E8BD-DD0F-4160-A427-3647665E8050) (Shutdown) 
    iPad mini (6th generation) (17BB1E83-17A4-4A71-B171-0458FDD07149) (Shutdown) 
    iPad Pro (11-inch) (4th generation) (A56AFB3B-B499-4324-8EE8-C3D12EC7345D) (Shutdown) 
    iPad Pro (12.9-inch) (6th generation) (50D57211-FFF5-4C72-909E-C1C08EA748CE) (Shutdown) 
-- tvOS 16.1 --
    Apple TV (4843400E-199F-4036-ADD0-8C104D2FD4BD) (Shutdown) 
    Apple TV 4K (3rd generation) (608D000A-BA58-41DA-9F66-7393C7971FEC) (Shutdown) 
    Apple TV 4K (3rd generation) (at 1080p) (795E3ACD-E19C-42EA-B891-B75E42C15B0C) (Shutdown) 
-- watchOS 9.1 --
    Apple Watch Series 5 (40mm) (B848B04C-069B-4233-9B82-A356213D0286) (Shutdown) 
    Apple Watch Series 5 (44mm) (0C433A12-7EFE-40AA-BB3C-B89534F1AD1B) (Shutdown) 
    Apple Watch Series 6 (40mm) (2FB53684-00DF-4AA7-8DB1-D8C75DE12E41) (Shutdown) 
    Apple Watch Series 6 (44mm) (B809C122-7EA6-4459-AD86-573F62C89A95) (Shutdown) 
    Apple Watch Series 7 (41mm) (3CC797C2-E64F-4BB4-BDFC-0E07F57C6A88) (Shutdown) 
    Apple Watch Series 7 (45mm) (FD37414F-5819-4070-9E78-ED8A3747C3B7) (Shutdown) 
    Apple Watch SE (40mm) (2nd generation) (D571509A-4DDD-47DF-88BB-44F947EA44AC) (Shutdown) 
    Apple Watch SE (44mm) (2nd generation) (C756171C-DA97-4337-9F98-8A74B94902BE) (Shutdown) 
    Apple Watch Series 8 (41mm) (7C377419-8B05-4A3F-AE47-5AB686ED3BAD) (Shutdown) 
    Apple Watch Series 8 (45mm) (D0DBE70A-F153-410F-A8E8-52E5A46CD6A1) (Shutdown) 
    Apple Watch Ultra (49mm) (F8943ED9-75A3-404D-A696-4B1B0BF68FE6) (Shutdown) 

Note that the UDIDs are different. Maybe it's an issue that we're using a dedicated CI user account? I'm not sure.

BalestraPatrick commented 1 year ago

What I could find from my research is that it's either a permission issue or a broken Xcode installation. Some people suggested deleting /Library/Developer/ first and then reinstalling Xcode. I saw this failure on a scheduled master build on Saturday, so I'm fairly sure it's happening on some agents on builds that aren't cached: https://buildkite.com/bazel/rules-apple-darwin/builds/6653#01863dc8-f095-4f31-be49-d1dbf22b08dc

keith commented 1 year ago

Issue with a missing android setup here https://buildkite.com/bazel/bazel-bazel-github-presubmit/builds/14315#018651c9-b37a-4443-bbee-700e877557ed

fweikert commented 1 year ago

Issue with a missing android setup here https://buildkite.com/bazel/bazel-bazel-github-presubmit/builds/14315#018651c9-b37a-4443-bbee-700e877557ed

That's unrelated to Xcode - for some reason imac11 randomly decided to delete its Android SDKs. I reinstalled them earlier today.

keith commented 1 year ago

more issues https://buildkite.com/bazel/rules-apple-darwin/builds/6689#01865773-20b1-4c30-b629-4b9ad316132d

brentleyjones commented 1 year ago

Yeah, getting a flakey test (covered in the above link as well): https://buildkite.com/bazel/rules-apple-darwin/builds/6720#0186754c-b8b7-49d2-b46d-7184fbe0fb7a

keith commented 1 year ago

looks like machine number 1 doesn't have all the platforms installed so rules_apple tests don't work https://buildkite.com/bazel/bcr-presubmit/builds/1060#018675e2-6e91-46e8-b1d9-b96fde6531b1

can someone run xcodebuild -downloadAllPlatforms on it?

keith commented 1 year ago

same with #7, maybe all of them need this?

meteorcloudy commented 1 year ago

Tried to run xcodebuild -downloadAllPlatforms on all imacpros, getting the following error on machine 1 and a few other machines:

ci@bk-imacpro-1 ~ % xcodebuild -downloadAllPlatforms
Downloading tvOS 16.1 Simulator (20K67): Error: Error Domain=SimDiskImageErrorDomain Code=5 "Duplicate of C343EBAA-CF3F-4397-A071-388674B05A31" UserInfo={NSLocalizedDescription=Duplicate of C343EBAA-CF3F-4397-A071-388674B05A31, unusableErrorDetail=}

fweikert commented 1 year ago

I ran xcodebuild -downloadAllPlatforms on all iMacs two weeks ago. This is really weird.

BalestraPatrick commented 1 year ago

I've seen that locally in the past on my machine and I think the solution is to just remove all sim runtimes and start from scratch with a default set by running that xcodebuild -downloadAllPlatforms command.

keith commented 1 year ago

similar issues still https://buildkite.com/bazel/rules-apple-darwin/builds/6742#01867f14-772f-49f5-8867-960c2cee9762

fweikert commented 1 year ago

This build is interesting: "Last Green Bazel" failed on imacpro-18 (1st retry), then eventually succeeded on that machine (4th retry).

The best solution seems to be to remove and re-install:

ci@bk-imacpro-14 ~ % xcodebuild -downloadAllPlatforms
Downloading watchOS 9.1 Simulator (20S75): Error: Error Domain=SimDiskImageErrorDomain Code=5 "Duplicate of 92459430-0207-4A7C-A7DA-ABF3CABE63F4" UserInfo={NSLocalizedDescription=Duplicate of 92459430-0207-4A7C-A7DA-ABF3CABE63F4, unusableErrorDetail=}
Downloading tvOS 16.1 Simulator (20K67): Error: Error Domain=SimDiskImageErrorDomain Code=5 "Duplicate of EBF708D5-508B-4C93-89A1-7301A1A415C6" UserInfo={NSLocalizedDescription=Duplicate of EBF708D5-508B-4C93-89A1-7301A1A415C6, unusableErrorDetail=}

keith commented 1 year ago

@fweikert we're still seeing some issues with this flakiness across multiple machines https://buildkite.com/bazel/rules-apple-darwin/builds/6892#0186ccaf-99fd-484a-9f91-dd470edf8a19, is there any way we could debug more live sometime? This is making our CI pretty unusable since all tests rely on these various features.

fweikert commented 1 year ago

@keith thanks for reporting. I'm back from my leave and will have a look.

brentleyjones commented 1 year ago

FYI, still having issues: https://buildkite.com/bazel/rules-apple-darwin/builds/6953#018704bc-8361-4fce-a712-5bf41faa052b

fweikert commented 1 year ago

Looking at "passed" rules_apple builds we can see an increase in flakiness (most recent builds to older builds): https://i.imgur.com/evWOXij.png

I was hoping that there would be consistent infra failures, but well... I guess we should reinstall the SDK on all machines.

Agent history for rules_apple (including failed builds): https://i.imgur.com/ASZgO5E.png

brentleyjones commented 1 year ago

This is a new weird flake (https://buildkite.com/bazel/rules-apple-darwin/builds/6973#01870ed5-abd5-46bf-82f1-45218af7eaed):

2023-03-23 14:23:36,039 Created new simulator 06918F6B-5AA3-4761-B731-85821E0D4E94.
2023-03-23 14:23:36,040 Will consider the test as test type Logic Test to run. Because the app under test is not given.
An error was encountered processing the command (domain=NSCocoaErrorDomain, code=513):
You don’t have permission to save the file “06918F6B-5AA3-4761-B731-85821E0D4E94” in the folder “CoreSimulator”.
You don’t have permission.
To view or change permissions, select the item in the Finder and choose File > Get Info.
Underlying error (domain=NSPOSIXErrorDomain, code=1):
    The operation couldn’t be completed. Operation not permitted
    Operation not permitted

brentleyjones commented 1 year ago

Another flake (this time BCR, but still rules_apple): https://buildkite.com/bazel/bcr-presubmit/builds/1150#018714cc-526e-4a84-859a-82eae691b31a. I'll assume they will continue until you post an update @fweikert, so I stop spamming you 😄.

fweikert commented 1 year ago

Sorry, too many things going on right now :(

The BCR failure is interesting:

Failed to locate any simulator runtime matching options: {
--
  | VersionString = "9.1";
  | }
  | "com.apple.platform.watchsimulator"
  | );
  | BuildVersionString = 20S71;

iMac-4 has Xcode 14.2 (Build version 14C18) installed, which should come with watchOS 9.1 (20S71) (if we can trust https://xcodereleases.com/).

However, xcrun simctl runtime list on iMac4 produces

== Disk Images ==
-- iOS --
iOS 16.2 (20C52) - 150F32EA-821D-470F-AD21-371E0ABA7106 (Ready)
-- tvOS --
tvOS 16.1 (20K67) - 7BB18E8E-48E8-4D3A-B98A-BEE7A3B9314B (Unusable - Other Failure: Duplicate of E9C442B7-2988-4CDD-9054-C1D70C0CF51F)
tvOS 16.1 (20K67) - E9C442B7-2988-4CDD-9054-C1D70C0CF51F (Ready)
-- watchOS --
watchOS 9.1 (20S75) - D14F19C7-E1C8-4952-AEC6-845AB776FBFE (Ready)
watchOS 9.1 (20S75) - 4FE4DF4F-FEBE-4447-B30D-71886F57C58D (Unusable - Other Failure: Duplicate of D14F19C7-E1C8-4952-AEC6-845AB776FBFE)

Total Disk Images: 5 (16.7G)

Not sure why we ended up with 20S75 (and why there are duplicate entries).
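To make the two oddities above (duplicate disk images and an unexpected build number) easier to spot across many machines, the `xcrun simctl runtime list` output could be parsed along these lines. This is only a sketch based on the line format pasted above; `DiskImage`, `parse_disk_images`, `unusable` and `builds_differing_from` are names invented for illustration.

```python
import re
from dataclasses import dataclass

# Matches lines such as:
#   tvOS 16.1 (20K67) - E9C442B7-2988-4CDD-9054-C1D70C0CF51F (Ready)
IMAGE_LINE = re.compile(
    r"^(?P<name>\S+ [\d.]+) \((?P<build>\w+)\) - "
    r"(?P<uuid>[0-9A-F-]{36}) \((?P<state>.*)\)$"
)


@dataclass
class DiskImage:
    name: str   # e.g. 'watchOS 9.1'
    build: str  # e.g. '20S75'
    uuid: str
    state: str  # 'Ready' or an 'Unusable - ...' message


def parse_disk_images(output: str) -> list[DiskImage]:
    """Parse the disk image rows; headings and the total line are ignored."""
    images = []
    for line in output.splitlines():
        match = IMAGE_LINE.match(line.strip())
        if match:
            images.append(DiskImage(**match.groupdict()))
    return images


def unusable(images: list[DiskImage]) -> list[DiskImage]:
    """Return the images whose state is anything other than 'Ready'."""
    return [img for img in images if img.state != "Ready"]


def builds_differing_from(images: list[DiskImage], expected: dict[str, str]) -> list[DiskImage]:
    """Return images whose build differs from the expected build for that runtime."""
    return [
        img for img in images
        if img.name in expected and img.build != expected[img.name]
    ]
```

With `expected = {"watchOS 9.1": "20S71"}`, the iMac4 output above would report both watchOS images as differing (they are 20S75), and `unusable` would return the two "Duplicate of ..." entries.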

BalestraPatrick commented 1 year ago

@fweikert We have run into similar problems in our CI. 20S75 I believe is an update to the SDK that was released after Xcode 14.2. Xcode 14.2 contains the version 20S71 as specified on https://xcodereleases.com/. We filed this ticket to make it clearer on their website that it can happen that if you download the runtimes separately, you might end up with a different version: https://github.com/xcodereleases/xcodereleases.com/issues/32

I think the solution is to remove the runtimes, and simply run the command once to install the latest versions for each platform.

BalestraPatrick commented 1 year ago

For example, I'm still seeing failures because some machines don't have any runtime installed.

At the start of my PR, I'm running xcrun simctl list devices available to see what runtimes and devices are available.

fweikert commented 1 year ago

I'm seeing different results when running xcrun simctl list devices available manually on these machines. It might be because we're using a different user for maintenance than for the builds themselves, although my understanding was that the Xcode installation should be visible to all users on that machine (the binary lives in /Applications). Is this different for the runtimes?

imacpro-1:

== Devices ==
-- iOS 16.2 --
    iPhone 8 (309A00AB-3C3D-4F3B-8286-C3DD2EA69964) (Shutdown) 
    iPhone 8 Plus (7BBAB773-68DB-4D42-A8B1-487C278F5088) (Shutdown) 
    iPhone 11 (77FB5350-5336-4FD5-AA28-884862F8E6EC) (Shutdown) 
    iPhone 11 Pro (2C27CBCD-4F90-433B-9D6E-C176AF63D8B1) (Shutdown) 
    iPhone 11 Pro Max (FB26EB38-DD8A-4722-B68D-304702381E9A) (Shutdown) 
    iPhone SE (2nd generation) (9B3D91D5-79AB-4150-92DC-4254A6C4875B) (Shutdown) 
    iPhone 12 mini (B3852554-1853-4E66-A4AA-EFC3D9275FB5) (Shutdown) 
    iPhone 12 (66039AB8-621D-4864-A023-B6E5078388A7) (Shutdown) 
    iPhone 12 Pro (D639A0E9-4946-469E-A607-A469A00DA6BB) (Shutdown) 
    iPhone 12 Pro Max (46F8FE23-76FA-411F-9254-05859072EADC) (Shutdown) 
    iPhone 13 Pro (13462384-8A41-43D9-B8DA-A9A50E061447) (Shutdown) 
    iPhone 13 Pro Max (24114FCA-F377-4CEC-B580-DC83E65FC5C2) (Shutdown) 
    iPhone 13 mini (01CE419F-CC52-4B11-8825-8D9E6F824BEA) (Shutdown) 
    iPhone 13 (153F21CF-DDB4-4B7E-82AB-8282F29443E7) (Shutdown) 
    iPhone SE (3rd generation) (5B7EDCCE-A458-4243-8F91-041C23F1A142) (Shutdown) 
    iPhone 14 (F3F43CC4-5AFD-4975-88B0-82EF620917E6) (Shutdown) 
    iPhone 14 Plus (21417FB2-F6BD-46B4-B412-03953422BD5A) (Shutdown) 
    iPhone 14 Pro (CF784A12-4CED-4896-9AD5-D851B89C96A8) (Shutdown) 
    iPhone 14 Pro Max (44F91239-E8EA-47A5-97BE-253B208D37DB) (Shutdown) 
    iPad Pro (9.7-inch) (7DCD608D-B9DD-4946-B6F9-1BE5451AFE00) (Shutdown) 
    iPad (9th generation) (DDD616A4-38F9-4825-BA01-355031C2D15A) (Shutdown) 
    iPad Air (4th generation) (98A1C25F-359B-4576-91C2-1C6C12752C52) (Shutdown) 
    iPad Pro (11-inch) (3rd generation) (E3093ABA-376B-46E8-9FF2-068561725DC9) (Shutdown) 
    iPad Pro (12.9-inch) (5th generation) (3C363390-7951-4A48-B946-E28225FA6F77) (Shutdown) 
    iPad Air (5th generation) (373DA876-0339-47F9-93A1-2C978FDC1A17) (Shutdown) 
    iPad (10th generation) (10ECED45-D366-4322-98C3-FF14B6D92A15) (Shutdown) 
    iPad mini (6th generation) (26A08E37-1D36-49DE-BD17-28FB495827CE) (Shutdown) 
    iPad Pro (11-inch) (4th generation) (EBB6C580-40A3-4F0B-B4AB-A4C8EF8F1681) (Shutdown) 
    iPad Pro (12.9-inch) (6th generation) (70BACEBC-089C-4B08-B747-FBEDA08E5F07) (Shutdown) 
-- watchOS 9.1 --
    Apple Watch Series 5 (40mm) (2B3D09E9-D6AF-4978-A098-07265E986C25) (Shutdown) 
    Apple Watch Series 5 - 44mm (164AA667-A29B-4DA4-A382-A8EC8383C57F) (Shutdown) 
    Apple Watch Series 6 (40mm) (FB213A9D-7D16-475B-B7F3-1D3282497F8C) (Shutdown) 
    Apple Watch Series 6 (44mm) (7F1B5C7F-E03D-45A4-A821-1436E67039FF) (Shutdown) 
    Apple Watch Series 7 - 41mm (3BF9DCF5-2F48-4A75-A6A8-D3ECEA40306F) (Shutdown) 
    Apple Watch Series 7 - 45mm (A329EF5C-084B-4B28-A781-6E563ED624B2) (Shutdown) 
    Apple Watch SE (40mm) (2nd generation) (AA5BDB35-4B09-4289-981E-7234FBE340DC) (Shutdown) 
    Apple Watch SE (44mm) (2nd generation) (5510B750-9589-449B-A6D8-5C9DDB825BCD) (Shutdown) 
    Apple Watch Series 8 (41mm) (A72B2C42-44FD-4E95-9113-E0E062218699) (Shutdown) 
    Apple Watch Series 8 (45mm) (CF2BDCEA-6EC0-40BE-9A44-640C31792BAD) (Shutdown) 
    Apple Watch Ultra (49mm) (D5803729-4125-4EB8-9A31-34523D535CA6) (Shutdown) 
-- Unavailable: com.apple.CoreSimulator.SimRuntime.iOS-13-4 --
-- Unavailable: com.apple.CoreSimulator.SimRuntime.iOS-13-5 --
-- Unavailable: com.apple.CoreSimulator.SimRuntime.iOS-15-0 --
-- Unavailable: com.apple.CoreSimulator.SimRuntime.tvOS-13-4 --
-- Unavailable: com.apple.CoreSimulator.SimRuntime.tvOS-15-0 --
-- Unavailable: com.apple.CoreSimulator.SimRuntime.tvOS-16-1 --
-- Unavailable: com.apple.CoreSimulator.SimRuntime.watchOS-6-2 --
-- Unavailable: com.apple.CoreSimulator.SimRuntime.watchOS-8-0 --

imacpro-13 seems to be missing the runtimes:

== Devices ==
-- iOS 16.2 --
    iPhone SE (3rd generation) (E915F95F-CD85-4A78-8E2E-690928204C2F) (Shutdown) 
    iPhone 14 (D6B4D9A1-F6B9-446C-B240-BF54E243267A) (Shutdown) 
    iPhone 14 Plus (91CA3EFF-45E1-4612-B698-54D31AEDADA5) (Shutdown) 
    iPhone 14 Pro (D9F1FBE7-AD3C-4AC5-B095-E7A93C41269E) (Shutdown) 
    iPhone 14 Pro Max (74D20C36-3509-4698-A70C-E3FAB292A2B4) (Shutdown) 
    iPad Air (5th generation) (29208E0C-A62D-4DB7-AFF2-E9825852638A) (Shutdown) 
    iPad (10th generation) (74BF4663-F9CE-4CBF-9052-9AC41771F06F) (Shutdown) 
    iPad mini (6th generation) (178A6B34-A893-4F12-8D86-EBB5711DDA31) (Shutdown) 
    iPad Pro (11-inch) (4th generation) (D9A24E61-E9D2-4366-9DAF-FA09DE3C299A) (Shutdown) 
    iPad Pro (12.9-inch) (6th generation) (9B12A04C-A4E7-4B75-B137-4064A078F02F) (Shutdown) 

BalestraPatrick commented 1 year ago

@fweikert Those are only devices, there should be a == Runtimes == section below in the same log.

fweikert commented 1 year ago

Sorry, my bad.

Results of xcrun simctl list runtimes available:

imac1:

== Runtimes ==
iOS 15.0 (15.0 - 19A339) - com.apple.CoreSimulator.SimRuntime.iOS-15-0
tvOS 15.0 (15.0 - 19J344) - com.apple.CoreSimulator.SimRuntime.tvOS-15-0
watchOS 8.0 (8.0 - 19R344) - com.apple.CoreSimulator.SimRuntime.watchOS-8-0
watchOS 9.1 (9.1 - 20S75) - com.apple.CoreSimulator.SimRuntime.watchOS-9-1

imac13:

== Runtimes ==
iOS 15.0 (15.0 - 19A339) - com.apple.CoreSimulator.SimRuntime.iOS-15-0
tvOS 15.0 (15.0 - 19J344) - com.apple.CoreSimulator.SimRuntime.tvOS-15-0
watchOS 8.0 (8.0 - 19R344) - com.apple.CoreSimulator.SimRuntime.watchOS-8-0

imac20:

== Runtimes ==
iOS 15.0 (15.0 - 19A339) - com.apple.CoreSimulator.SimRuntime.iOS-15-0
tvOS 15.0 (15.0 - 19J344) - com.apple.CoreSimulator.SimRuntime.tvOS-15-0
tvOS 16.1 (16.1 - 20K67) - com.apple.CoreSimulator.SimRuntime.tvOS-16-1
watchOS 8.0 (8.0 - 19R344) - com.apple.CoreSimulator.SimRuntime.watchOS-8-0
watchOS 9.1 (9.1 - 20S75) - com.apple.CoreSimulator.SimRuntime.watchOS-9-1

This is consistent with the latest failures - I'll try to install the runtimes on all machines again.
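The per-machine comparison above can also be done mechanically. The following is a sketch under the assumption that the output of `xcrun simctl list runtimes available` has been collected per machine as plain text; `runtime_ids` and `missing_per_machine` are hypothetical helper names, and the required set is whatever the test suite needs.

```python
def runtime_ids(output: str) -> set[str]:
    """Extract runtime identifiers (the part after the last ' - ') from the listing."""
    ids = set()
    for line in output.splitlines():
        line = line.strip()
        if not line or line.startswith("=="):
            continue
        # 'tvOS 16.1 (16.1 - 20K67) - com.apple...tvOS-16-1' -> identifier at the end
        ids.add(line.rsplit(" - ", 1)[-1])
    return ids


def missing_per_machine(outputs: dict[str, str], required: set[str]) -> dict[str, list[str]]:
    """Map each machine to the sorted list of required runtimes it lacks."""
    report = {}
    for machine, output in outputs.items():
        missing = required - runtime_ids(output)
        if missing:
            report[machine] = sorted(missing)
    return report
```

With the tvOS 16.1 and watchOS 9.1 identifiers as the required set, the three listings above would flag imac1 (no tvOS 16.1) and imac13 (neither runtime), while imac20 would not appear in the report.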

fweikert commented 1 year ago

Ugh, this is painful. The installation of missing runtimes fails because there are multiple duplicate simulators. Removing them and then installing all missing runtimes via the Xcode UI works, but is annoying.

Instead I'm running xcrun simctl runtime delete all followed by xcodebuild -downloadAllPlatforms on all iMac Pros.

This means that the iMac fleet now offers the following runtimes for Xcode 14.2:

iOS 16.2 (16.2 - 20C52) - com.apple.CoreSimulator.SimRuntime.iOS-16-2
tvOS 16.1 (16.1 - 20K67) - com.apple.CoreSimulator.SimRuntime.tvOS-16-1
watchOS 9.1 (9.1 - 20S75) - com.apple.CoreSimulator.SimRuntime.watchOS-9-1

I had to fix -1, -3, -4, -5, -6, -7, -13, -14, -15 and -18. -16 has network problems and is unreachable. However, I still have absolutely no idea how this could happen - we've always run the same set of commands on all machines (via clusterssh).
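The delete-then-reinstall sequence described above could be wrapped in a small script so that it runs identically on every machine. This is only a sketch: the two commands are the ones quoted in this comment, while the `reset_runtimes` function and its `dry_run` flag are assumptions added for illustration. It defaults to a dry run because the real commands only make sense on a Mac with Xcode installed.

```python
import subprocess

# The two commands from the comment above, in the order they were run.
RESET_COMMANDS = [
    ["xcrun", "simctl", "runtime", "delete", "all"],
    ["xcodebuild", "-downloadAllPlatforms"],
]


def reset_runtimes(dry_run: bool = True) -> list[list[str]]:
    """Delete all simulator runtimes, then download the default set again.

    With dry_run=True the commands are only printed. Otherwise each command
    is executed and a non-zero exit status raises CalledProcessError, so the
    download is not attempted if the delete step failed.
    """
    for command in RESET_COMMANDS:
        if dry_run:
            print("would run:", " ".join(command))
        else:
            subprocess.run(command, check=True)
    return [list(command) for command in RESET_COMMANDS]
```

Stopping at the first failing command is a deliberate choice here, since running the download on top of a half-deleted set is how the duplicate images seem to have appeared in the first place.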

BalestraPatrick commented 1 year ago

@fweikert The watchOS and tvOS runtime issues are fixed, but unfortunately I'm still seeing something similar to the previous issues reported:

Failure Reason: Failed to spawn AssetCatalogSimulatorAgent on Apple Watch Series 7 (45mm) (0B7D26F8-200B-4752-AE6A-7196CA01BE28, (null), Shutdown)

After nuking the cache, our builds fail on latest master, as this PR demonstrates: https://github.com/bazelbuild/rules_apple/pull/1935/

It's pretty difficult for me to debug because I can't find a way to publish logs and artifacts after the build has finished, so I was wondering if you could help by inspecting the machines after the builds have run. I'm hoping that ~/Library/Logs/CoreSimulator/CoreSimulator.log or ~/Library/Logs/CoreSimulator/Simulator.log will contain more details after the failure has occurred. I can't reproduce this locally at all.

One theory I have is that all these machines still have Xcode 13 and 13.4.1 installed even though they can't run on the macOS version installed. Those versions should be safely removed, but I'm still unsure if that's the only culprit.

fweikert commented 1 year ago

The tests actually upload the log file: https://storage.googleapis.com/bazel-untrusted-buildkite-artifacts/01874750-cb3c-41e1-acfe-5ec6db2d4341/Users/buildkite/Library/Logs/CoreSimulator/CoreSimulator.log

The idea regarding Xcode 13 sounds good. I removed all older versions from all iMacs (and I'll update the CI script to prevent illegal Xcode <-> macOS combinations).
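A guard against illegal Xcode <-> macOS combinations might look roughly like the following. This is a sketch and not the actual CI script: the table only encodes the one incompatibility mentioned in this thread (Xcode 13.x cannot run on macOS Ventura, i.e. macOS 13), and both the table and the `is_supported` helper are hypothetical. A real check would need Apple's full compatibility matrix.

```python
# Highest macOS major version each Xcode major version is assumed to run on.
# Only the Xcode 13 entry comes from this thread; anything not listed is
# treated as unrestricted, which is an assumption made for this sketch.
MAX_MACOS_MAJOR_FOR_XCODE_MAJOR = {13: 12}


def major(version: str) -> int:
    """Return the leading component of a dotted version string, e.g. '13.4.1' -> 13."""
    return int(version.split(".")[0])


def is_supported(xcode_version: str, macos_version: str) -> bool:
    """Return False if the Xcode version is known not to run on this macOS version."""
    limit = MAX_MACOS_MAJOR_FOR_XCODE_MAJOR.get(major(xcode_version))
    return limit is None or major(macos_version) <= limit
```

The CI script could call `is_supported` before the "Activating Xcode ..." step and fail loudly, rather than silently falling back to another version as seen earlier in this thread.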

BalestraPatrick commented 1 year ago

@fweikert Thanks for removing Xcode 13. Yeah, my attempt there only uploads the log before the build starts, so unfortunately it doesn't contain the relevant entries, since the failure happens later in the build. I couldn't find a way to upload the log after the build has completed.

BalestraPatrick commented 1 year ago

Looks like removing Xcode 13 wasn't enough to fix this: https://buildkite.com/bazel/rules-apple-darwin/builds/7062#018747d0-b6bf-463d-a60d-4c3464903430