Closed elvisizer closed 9 years ago
When the OS installation doesn't begin -- what does happen?
Is it possible something is causing the reboot to happen a long time after the install is set up?
If it's "too long" (for some internal-to-Apple definition of "too long", the automated install does not happen, and instead you boot to a UI that looks very similar to the one on the Recovery HD, where you can manually initiate the OS install (or run Disk Utility, etc)
We're not seeing that- the affected machines boot from their internal drives. I haven't been able to establish if this occurs on non-encrypted macs, but we did get a report of the boot behavior on a mac with FV2- on that mac, the FV2 disk unlock screen appeared after the reboot. The password field's pips were unusually small. After authenticating, they got the standard OS X login window. Since the small pips behavior is always present when booting into the installer, it seems like the disk unlock screen is trying to start the installer, but then something goes wrong and the OS on the hard drive boots instead.
Have you looked on the affected machines to see if the OS X install components are present? You'll find them in a folder named "OS X Install Data" at the root of the drive to be upgraded.
Do you have any other packages/scripts/processes that might be affecting the "bless" status of the startup disk before the machine is restarted? Do you have firmware password set on any of these machines? Anything that sets/alters the startup disk will affect/break the OS X install.
The install components are present. We don't have anything that's making any changes to the bless status or the startup disk- I did a sanity check just to be sure, and there's nothing in our environment touching that. No firmware passwords.
I am installing the OS pkg from a script, so I can gather some data using that script on the state of the machine after the pkg has been installed, but not rebooted yet. I'm just not sure where to start, or what to look at. Any suggestions for that?
got a VM right now that's doing it . . . . .Any ideas for things to check now that I've got my hands on an example of the issue?
here's the output of bless --info. I took this after the OS pkg installed, the mac rebooted, and did not begin the installation. I've also got the system.log for that boot if you would like to see it.
nvram -p output from the test example
And, here's the system.log . . . tried to guess which lines would be useful, if you need more from this file, let me know . . . .
Mar 11 16:03:28 localhost bootlog[0]: BOOT_TIME 1426115008 0
Mar 11 16:03:29 localhost syslogd[19]: Configuration Notice:
ASL Module "com.apple.appstore" claims selected messages.
Those messages may not appear in standard system log files or in the ASL database.
Mar 11 16:03:29 localhost syslogd[19]: Configuration Notice:
ASL Module "com.apple.authd" sharing output destination "/var/log/system.log" with ASL Module "com.apple.asl".
Output parameters from ASL Module "com.apple.asl" override any specified in ASL Module "com.apple.authd".
Mar 11 16:03:29 localhost syslogd[19]: Configuration Notice:
ASL Module "com.apple.authd" claims selected messages.
Those messages may not appear in standard system log files or in the ASL database.
Mar 11 16:03:29 localhost syslogd[19]: Configuration Notice:
ASL Module "com.apple.bookstore" claims selected messages.
Those messages may not appear in standard system log files or in the ASL database.
Mar 11 16:03:29 localhost syslogd[19]: Configuration Notice:
ASL Module "com.apple.eventmonitor" claims selected messages.
Those messages may not appear in standard system log files or in the ASL database.
Mar 11 16:03:29 localhost syslogd[19]: Configuration Notice:
ASL Module "com.apple.install" claims selected messages.
Those messages may not appear in standard system log files or in the ASL database.
Mar 11 16:03:29 localhost syslogd[19]: Configuration Notice:
ASL Module "com.apple.iokit.power" claims selected messages.
Those messages may not appear in standard system log files or in the ASL database.
Mar 11 16:03:29 localhost syslogd[19]: Configuration Notice:
ASL Module "com.apple.mail" claims selected messages.
Those messages may not appear in standard system log files or in the ASL database.
Mar 11 16:03:29 localhost syslogd[19]: Configuration Notice:
ASL Module "com.apple.MessageTracer" claims selected messages.
Those messages may not appear in standard system log files or in the ASL database.
Mar 11 16:03:29 localhost syslogd[19]: Configuration Notice:
ASL Module "com.apple.performance" claims selected messages.
Those messages may not appear in standard system log files or in the ASL database.
Mar 11 16:03:29 localhost syslogd[19]: Configuration Notice:
ASL Module "com.apple.securityd" claims selected messages.
Those messages may not appear in standard system log files or in the ASL database.
Mar 11 16:03:29 --- last message repeated 6 times ---
Mar 11 16:03:29 localhost kernel[0]: Longterm timer threshold: 1000 ms
Mar 11 16:03:29 localhost kernel[0]: PMAP: PCID enabled
Mar 11 16:03:29 localhost kernel[0]: PMAP: Supervisor Mode Execute Protection enabled
Mar 11 16:03:29 localhost kernel[0]: Darwin Kernel Version 13.4.0: Wed Dec 17 19:05:52 PST 2014; root:xnu-2422.115.10~1/RELEASE_X86_64
Mar 11 16:03:29 localhost kernel[0]: vm_page_bootstrap: 483251 free pages and 41037 wired pages
Mar 11 16:03:29 localhost kernel[0]: kext submap [0xffffff7f807ab000 - 0xffffff8000000000], kernel text [0xffffff8000200000 - 0xffffff80007ab000]
Mar 11 16:03:29 localhost kernel[0]: zone leak detection enabled
Mar 11 16:03:29 localhost kernel[0]: "vm_compressor_mode" is 4
Mar 11 16:03:29 localhost kernel[0]: standard timeslicing quantum is 10000 us
Mar 11 16:03:29 localhost kernel[0]: standard background quantum is 2500 us
Mar 11 16:03:29 localhost kernel[0]: mig_table_max_displ = 74
Mar 11 16:03:29 localhost kernel[0]: TSC Deadline Timer supported and enabled
Mar 11 16:03:29 localhost kernel[0]: AppleACPICPU: ProcessorId=0 LocalApicId=0 Enabled
Mar 11 16:03:29 localhost kernel[0]: AppleACPICPU: ProcessorId=1 LocalApicId=2 Enabled
Mar 11 16:03:29 localhost kernel[0]: calling mpo_policy_init for TMSafetyNet
Mar 11 16:03:29 localhost kernel[0]: Security policy loaded: Safety net for Time Machine (TMSafetyNet)
Mar 11 16:03:29 localhost kernel[0]: calling mpo_policy_init for Sandbox
Mar 11 16:03:29 localhost kernel[0]: Security policy loaded: Seatbelt sandbox policy (Sandbox)
Mar 11 16:03:29 localhost kernel[0]: calling mpo_policy_init for Quarantine
Mar 11 16:03:29 localhost kernel[0]: Security policy loaded: Quarantine policy (Quarantine)
Mar 11 16:03:29 localhost kernel[0]: Copyright (c) 1982, 1986, 1989, 1991, 1993
Mar 11 16:03:29 localhost kernel[0]: The Regents of the University of California. All rights reserved.
Mar 11 16:03:29 localhost kernel[0]: MAC Framework successfully initialized
Mar 11 16:03:29 localhost kernel[0]: using 10485 buffer headers and 7290 cluster IO buffer headers
Mar 11 16:03:29 localhost kernel[0]: AppleKeyStore starting (BUILT: Aug 17 2014 20:21:39)
Mar 11 16:03:29 localhost kernel[0]: IOAPIC: Version 0x11 Vectors 64:87
Mar 11 16:03:29 localhost kernel[0]: ACPI: sleep states S4 S5
Mar 11 16:03:29 localhost kernel[0]: pci (build 20:04:33 Aug 17 2014), flags 0x63008, pfm64 (42 cpu) 0x3ff80000000, 0x80000000
Mar 11 16:03:29 localhost kernel[0]: [ PCI configuration begin ]
Mar 11 16:03:29 localhost kernel[0]: mcache: 2 CPU(s), 64 bytes CPU cache line size
Mar 11 16:03:29 localhost kernel[0]: mbinit: done [64 MB total pool size, (42/21) split]
Mar 11 16:03:29 localhost kernel[0]: Pthread support ABORTS when sync kernel primitives misused
Mar 11 16:03:29 localhost kernel[0]: rooting via boot-uuid from /chosen: C453948D-EF70-38AB-B296-8F80F625339D
Mar 11 16:03:29 localhost kernel[0]: Waiting on
bless --info looks right.
nvram stuff: I wonder about the differing UUIDs in the efi-boot-device and the install-product-url; they seem to point to different disks.
Use diskutil list
to get a list of identifiers, thenL diskutil info diskXsY
for the each identifier to find the UUIDs and see how they match up to what you are seeing in the nvram settings.
here's the diskutil list output:
The UUID listed in nvram for the x-osproduct corresponds to /dev/disk1:
The other UUID (AF9B04AE-86E8-4D6B-AA8B-D2C87A5ABB5E) doesn't appear anywhere that I can find in diskutil's output for any device or disk.
I captured the nvram-p output from the same VM just before a successful run as well (yay for snapshots!) and it looks exactly the same as before an unsuccessful attempt:
One other random piece of information: I just noticed that after unlocking the disk, if the OS install is going to start, you get the Yosemite-style progress bar. If the OS install is NOT going to start, after unlocking the disk I get a Mavericks-style spinning gear.
Any ideas for next steps, Greg?
Nope! Unless you have a way to get me a copy of a VM (VMware Fusion format) that displays the behavior...
I've got the VM, just need a way to get it to you . . . .You aren't going to MacBrained tonight are you ;)?
No, that's a rather long drive from the LA area...
heh, well, it's big VM file since FV2 is enabled. Maybe it'd be easier to set up a 10.9 VM on your side, then I can send you our installer pkg and instructions for forcing the problem? I'd be fine with giving you my VM itself, just not sure how to get such a big file to you without shipping a USB drive.
If you have a repeatable, reliable way to reproduce the problem in a way I can look at it, I'm game to try your instructions.
ok, I'll write that up as soon as I can . . . should have that for you tomorrow. And thanks again for continuing to work with me on this, really appreciate it!!
Ok, I THINK I've solved this. It looks like the problem was due to uninstalling Symantec Endpoint Protection before running the OS install pkg from script on an encrypted Mac. The problem occurs only in that scenario. If I run the SEP uninstaller by double-clicking it's pkg in Finder, and then double-click the OS installer as well, everything works as expected. It's only if the 2 installers are run from a script together, and only on a mac that is encrypted. So, I'm moving the SEP uninstall to a launchd that'll execute AFTER the os installation. So far, no issues kicking off the os install when run from script as long as the SEP uninstaller isn't run first.
Follow up question: what is the SEP uninstaller doing that affects the OS X installer boot environment? A look at the package scripts (preinstall/postinstall) might reveal clues (or even a smoking gun)
It's a shell script from symantec that I've wrapped in a pkg, so there's just one script to vet. I took a look through it, and I really don't see what could be doing it. The most suspect part seems like it might be the function that re-writes the loginwindow.plist file, but that seems like a stretch. I don't see anything touching nvram, or anything related to the startup disk. Here's a link to the script: http://pastebin.com/gEMAkufc
So here's my guess:
The script removes several .kext files from /System/Library/Extensions and /Library/Extensions. This triggers a rebuild of the kextcaches. (this can take several minutes to complete.)
This rebuild is (sometimes/often/occassionally) still occurring when the OS X install pkg is doing its thing. Since the kextcache is involved in the boot process, it's possible that if the cache update concludes after the postflight script in the OS X install pkg runs, it affects the boot environment such that the OS X installer boot environment is no longer used and the "normal" OS X boot is back in play.
That would make sense . . . . Well, I think we're done here. Thanks again, Greg!
Hi greg, I'm seeing an intermittant issue with deploying 10.10.1 using createOSXinstallPkg. In some installation attempts, the package created by createOSXinstallPkg will install successfully, but when the machine is rebooted, the OS installation does not actually begin. So far, each case has been solved by running the same pkg again- on the 2nd try, the installation does run. We're seeing this with probably 5-10% of attempted installations. I haven't had any luck tracking down what's causing this in log files. Any ideas where I should start looking, or what might be going on here?