Automattic / hostmgr

A tool for managing macOS VM hosts
Mozilla Public License 2.0

Kill VM processes before unregister #66

Closed crazytonyli closed 9 months ago

crazytonyli commented 9 months ago

Issue

This PR fixes one of the most common issues in macOS CI.

Removing Registered VM xcode-15.0
Error: ShellOut encountered an error
Status code: 255
Message: "Failed to unregister the VM: Unable to perform the action because the virtual machine is busy. The virtual machine is currently running. Please try again later."
Output: ""
🚨 Error: The command exited with status 1

This error happens at the very beginning of CI jobs, while the VM is being launched. Once it occurs, the macOS agent is out of service until someone manually logs onto the agent and fixes it. The fix is either restarting the agent, or killing the VM process and then unregistering the VM using prlctl.

Root cause

There are two symptoms in this issue:

  1. The VM is still running, which is mentioned in the error message.
  2. The VM is in "invalid" state, because its files have been deleted (likely by this code).

So the root cause is probably that the VM is not properly stopped and unregistered after the previous CI job finishes. The commands used by hostmgr are essentially prlctl stop <vm> --fast && prlctl unregister <vm>. Given that this error occurs only sporadically, it's possible that Parallels Desktop has a bug that makes it fail to process these two commands correctly.
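To make the chained command concrete, here is a minimal shell sketch of that sequence. It is not hostmgr's actual code: the function name `stop_and_unregister` and the `PRLCTL` override variable are hypothetical, added only so the chain can be exercised without Parallels installed. The only point it illustrates is that `&&` short-circuits, so the unregister step runs only when the stop step reports success.

```shell
#!/bin/sh
# PRLCTL is a hypothetical override so the sequence can be tried with a stub;
# by default it resolves to the real prlctl binary.
PRLCTL="${PRLCTL:-prlctl}"

# Hypothetical helper name, for illustration only (not part of hostmgr).
stop_and_unregister() {
    vm="$1"
    # `&&` short-circuits: unregister is attempted only if stop exits with 0.
    "$PRLCTL" stop "$vm" --fast && "$PRLCTL" unregister "$vm"
}
```

Called as `stop_and_unregister xcode-15.0`, this returns the status of whichever command ran last. It does not explain the failure described above; it only shows the order in which the two commands are attempted.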

Changes

I want to make this issue disappear as soon as possible because, once this error occurs, the teams' CI jobs are likely to be blocked, especially when no other working agent is available. So, instead of diving deep into Parallels Desktop to understand why it sometimes fails to process the stop and unregister commands, I decided to add a step before unregistering the VM: killing all running VM processes using pkill -9 -f 'Parallels VM\.app/Contents/MacOS/prl_vm_app'.
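The added step can be sketched in shell as follows. This is an illustration of the idea, not the change as implemented in hostmgr: the function name `kill_then_unregister` and the `PKILL` / `PRLCTL` override variables are hypothetical, introduced so the sequence can be tried with stubs. The pkill pattern is the one quoted above.

```shell
#!/bin/sh
# Hypothetical overrides so the sequence can be exercised without Parallels.
PKILL="${PKILL:-pkill}"
PRLCTL="${PRLCTL:-prlctl}"

# Pattern from the PR description; matched against the full command line (-f).
VM_PROCESS_PATTERN='Parallels VM\.app/Contents/MacOS/prl_vm_app'

# Hypothetical helper name, for illustration only (not part of hostmgr).
kill_then_unregister() {
    vm="$1"
    # pkill exits non-zero when no process matched, which is the normal case
    # on a healthy host, so its exit status is deliberately ignored.
    "$PKILL" -9 -f "$VM_PROCESS_PATTERN" || true
    # With no VM process left running, attempt the unregister.
    "$PRLCTL" unregister "$vm"
}
```

Note that the pattern is not scoped to a single VM: as described above, every running VM process on the host is killed, not only the one about to be unregistered.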