AsahiLinux / asahi-installer

Asahi Linux installer
MIT License

bug: fixes - fresh install prevents system shutdown #291

Closed PaulCharlton closed 3 days ago

PaulCharlton commented 6 days ago

Fixes: https://github.com/AsahiLinux/asahi-installer/issues/288

See the linked issue for a deep dive on how macOS bless behaves differently on mounted vs. unmounted volumes.

PaulCharlton commented 5 days ago

Hmm ... that really should be "self.osi.data": the APFS Data volume, not the APFS System volume in the volume group. On a clean macOS install, the NVRAM variable "boot-volume" holds the UUID of the APFS Volume Group / Data volume, not the System volume. I am going to review what results from "bless" against the System volume mount point.

Also, further reading for macOS booting:

https://eclecticlight.co/2021/05/31/m1-macs-have-a-third-recovery-mode/

https://eclecticlight.co/2022/06/29/startup-and-recovery-modes-on-m1-and-m2-macs/

https://eclecticlight.co/2021/01/21/system-management-and-nvram-on-m1-macs/

https://eclecticlight.co/2024/03/16/what-makes-a-disk-bootable/

https://eclecticlight.co/2024/04/01/checking-bootable-systems-using-bputil-on-apple-silicon/

PaulCharlton commented 5 days ago

Status update:

Should be good to go. I verified that the boot-volume NVRAM variable refers to the Volume Group that contains the specified mount point, and that the Data and System volumes are both in the same Volume Group.
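A check like this can be repeated by comparing the UUIDs in the output of `nvram boot-volume` against the Volume Group UUID that `diskutil apfs listVolumeGroups` reports. The following is a minimal sketch, not the installer's code. It assumes only that the variable's value contains UUID-shaped fields, and it makes no claim about their exact layout. The helper names and the sample value are made up for illustration.

```python
import re
from typing import List

# Matches the usual 8-4-4-4-12 hexadecimal UUID text form.
UUID_RE = re.compile(
    r"[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}"
)


def uuids_in(text: str) -> List[str]:
    """Return every UUID-shaped token in text, upper-cased, in order."""
    return [match.upper() for match in UUID_RE.findall(text)]


def refers_to(nvram_value: str, uuid: str) -> bool:
    """True if the given UUID appears anywhere in the variable's value."""
    return uuid.upper() in uuids_in(nvram_value)


# Made-up sample value; real output will differ and may hold more fields.
sample = (
    "boot-volume\t"
    "11111111-2222-3333-4444-555555555555:"
    "AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE"
)

# Hypothetical Volume Group UUID, as it might be copied from diskutil output.
group_uuid = "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
print(refers_to(sample, group_uuid))
```

Matching on UUID shape rather than on field position keeps the sketch independent of how the value is delimited.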

marcan commented 3 days ago

This makes no sense as a fix for the stated bug.

PaulCharlton commented 3 days ago

@marcan Without the patch, my M2 MacBook Air quite consistently goes into a boot loop after install. There is a fundamental difference between bless --device and bless --mount WHEN THE VOLUME IS MOUNTED: the installer mounts it, and it is still mounted when the first shutdown occurs. It is likely an Apple bug, but their documentation very clearly states "do not do this" (i.e. bless --device on a mounted volume).
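For readers following along, the two invocation forms under discussion can be sketched as below. This only builds the argument lists and runs nothing. The function name and both target paths are hypothetical, and the installer's real call may pass further options (for example credentials) that are left out here.

```python
from typing import List


def bless_argv(target: str, by_mount_point: bool) -> List[str]:
    """Build (but do not run) a bless command line that sets the boot volume."""
    # --mount selects a volume by the path where it is mounted;
    # --device selects it by its block device node.
    selector = "--mount" if by_mount_point else "--device"
    return ["bless", "--setBoot", selector, target]


# Hypothetical targets, for illustration only.
by_device = bless_argv("/dev/disk0s5", by_mount_point=False)
by_mount = bless_argv("/Volumes/Example", by_mount_point=True)

print(" ".join(by_device))
print(" ".join(by_mount))
```

The only difference between the two forms is how the target volume is identified. Whether that difference can affect shutdown behavior is exactly what is disputed in this thread.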

marcan commented 3 days ago

Apple's documentation for bless is a confused mess, as evidenced by the "containing a booter for EFI-based systems." line in the Apple Silicon section. The whole bless process has nothing to do with the legacy mechanism used for Intel devices. The purpose of the bless command is not to bless anything, it's just to change the boot volume (nvram var) to make 1TR work with our paired recoveryOS, and trigger a Bootability run (the actual process for Apple Silicon, not "blessing"). Everything on Apple Silicon works by mounting volumes anyway, since everything is file-based. The "should not be mounted" line was copypasta from the Intel section.

You can spec-lawyer this all you want, but there is zero evidence or logical reason why this completely superfluous change would have anything to do with your shutdown problem, which is pretty clearly an unrelated Apple problem or hardware defect in your machine.

PaulCharlton commented 3 days ago

What evidence would be acceptable to you? Re:

"there is zero evidence or logical reason why this completely superfluous change would have anything to do with your shutdown problem, which is pretty clearly an unrelated Apple problem or hardware defect in your machine."

My observation (not opinion), across more than three dozen DFU wipes, is that the results are 100% consistent with the requested change. I also have two M2 MacBook Airs now (I bought the second one just for triage). Two machines with identical problems, both fresh from a clean DFU restore?

Happy to make a video if that works as evidence.

marcan commented 3 days ago

Nobody else is experiencing this problem. Your job is to figure out what you're doing differently to trigger it, and then figure out if there is some way we can work around it (that has an explainable relationship to the problem) or it should be deferred to Apple.

Making random changes with no justification or connection to the bug "just because they work for you" is not how we do things. We need a root cause analysis or at least plausible justification for the change and how it fixes the bug.

My observation of your troubleshooting abilities is that you very easily confuse yourself about your observations, which do not match known facts about the systems (see the whole issue about 1TR and how it all works, where you described behavior that is inconsistent with everything we know about the system design, as confirmed by actual Apple engineers who worked on it, and I even went out of my way to try your process and confirmed it doesn't work at all). Therefore, absent an actual root-cause analysis here, the most logical conclusion is that you're confused and have other confounding factors that triggered your observed behavior change, unrelated to this change.