bruienne / bsdpy

An Apple NetBoot server implemented in Python
Apache License 2.0
221 stars 43 forks source link

bsdy linux self-contained, seems to be configured correctly but laptops just don't netboot #9

Open ghost opened 6 years ago

ghost commented 6 years ago

I had had a working docker-based installation of this software. For a couple reasons, I am trying to migrate to a self-contained linux host installation.

I've installed all the prerequisites and individually, they seem to work.

Macbooks booted to the netboot screen see the images on the network.

However, choosing an image, the laptop fails to boot from it. I've run out of troubleshooting steps and am looking for what I'm missing; for example, is there additional verbosity I can add somewhere so I can see what's going on in a log?

My environment is as follows

BSDPY_IP="10.5.0.30"
BSDPY_HOST="netboot01"  # This resolves to 10.5.0.30 on this network
BSDPY_IFACE="enp1s0"
BSDPY_ROOT="/opt/bsdpy"
BSDPY_PROTO="http"
BSDPY_INTERFACE="${BSDPY_IFACE}"
DOCKER_BSDPY_IP="${BSDPY_IP}"
DOCKER_BSDPY_NBI_URL="http://${BSDPY_HOST}/nbi"

TFTP, HTTP both work. I can get images manually via a tftp client. I can reach them over HTTP. I matched up the paths reported by bsdpserver.py and the paths are correct.

Here is example startup output from bsdpserver.py

-=- Starting new BSDP server session -=-

04/11/2018 12:32:24 PM - DEBUG: Considering NBI source at /nbi/netinstall-of-macos-sierra-not-configured.nbi
04/11/2018 12:32:24 PM - DEBUG: Considering NBI source at /nbi/netrestore-of-macos-sierra.nbi
04/11/2018 12:32:24 PM - DEBUG: [========= Using the following boot images =========]
04/11/2018 12:32:24 PM - DEBUG: /nbi/netinstall-of-macos-sierra-not-configured.nbi
04/11/2018 12:32:24 PM - DEBUG: /nbi/netrestore-of-macos-sierra.nbi
04/11/2018 12:32:24 PM - DEBUG: [=========     End boot image listing      =========]

When I start up a macboom and choose an image, I get the following:

04/11/2018 11:41:13 AM - DEBUG: -=========================================-
04/11/2018 11:41:13 AM - DEBUG: Got BSDP INFORM[SELECT] packet:
04/11/2018 11:41:13 AM - DEBUG: Determining image list for system ID MacBookPro14,3
04/11/2018 11:41:13 AM - DEBUG: Found enabled system ID MacBookPro14,3 - adding "NetInstall of Unconfigured macOS 10.12.6 (16G29) Install (12.39 GB)." to list
04/11/2018 11:41:13 AM - DEBUG: Found enabled system ID MacBookPro14,3 - adding "NetRestore of macOS 10.12.6 (16G1114) Volume (18.69 GB)." to list
04/11/2018 11:41:13 AM - DEBUG: *********
Refreshing basedmgpath because DOCKER_BSDPY_NBI_URL uses hostname, not IP
04/11/2018 11:41:13 AM - DEBUG: Resolving hostname to IP - netboot01 -> 10.5.0.30
04/11/2018 11:41:13 AM - DEBUG: Found DOCKER_BSDPY_NBI_URL - using basedmgpath http://10.5.0.30/nbi/
*********

04/11/2018 11:41:13 AM - DEBUG: -=========================================-
04/11/2018 11:41:13 AM - DEBUG: Return ACK[SELECT] to 10.5.1.25 on 68
04/11/2018 11:41:13 AM - DEBUG: --> TFTP path: /nbi/netrestore-of-macos-sierra.nbi/i386/booter
-->Boot image URI: http://10.5.0.30/nbi/netrestore-of-macos-sierra.nbi/NetInstall.dmg
04/11/2018 11:54:11 AM - DEBUG: Found $DOCKER_BSDPY_IP - using custom external IP 10.5.0.30
04/11/2018 11:54:11 AM - DEBUG: Resolving hostname to IP - netboot01 -> 10.5.0.30
04/11/2018 11:54:11 AM - DEBUG: Found DOCKER_BSDPY_NBI_URL - using basedmgpath http://10.5.0.30/nbi/
04/11/2018 11:54:11 AM - DEBUG: Server IP: 10.5.0.30

However, laptops don't boot. Any advice?

MScottBlake commented 6 years ago

Are you seeing any debug logging when a computer is viewing the netboot options or when an nbi is selected?

ghost commented 6 years ago

Here's the entire log from bsdpserver.py (/var/log/bsdpserver.log). I've only pruned out parts that repeat.

The session that's logged here is

  1. Start bsdpserver.py
  2. Turn on laptop and try to netboot
  3. Choose a netboot image from the laptop and press enter
==> /var/log/bsdpserver.log <==
04/11/2018 01:20:49 PM - DEBUG:

-=- Starting new BSDP server session -=-

04/11/2018 01:20:49 PM - DEBUG: Considering NBI source at /nbi/netinstall-of-macos-sierra-not-configured.nbi
04/11/2018 01:20:49 PM - DEBUG: Considering NBI source at /nbi/netrestore-of-macos-sierra.nbi
04/11/2018 01:20:49 PM - DEBUG: [========= Using the following boot images =========]
04/11/2018 01:20:49 PM - DEBUG: /nbi/netinstall-of-macos-sierra-not-configured.nbi
04/11/2018 01:20:49 PM - DEBUG: /nbi/netrestore-of-macos-sierra.nbi
04/11/2018 01:20:49 PM - DEBUG: [=========     End boot image listing      =========]
04/11/2018 02:21:46 PM - DEBUG: -=========================================-
04/11/2018 02:21:46 PM - DEBUG: Got BSDP INFORM[LIST] packet:
04/11/2018 02:21:46 PM - DEBUG: Determining image list for system ID MacBookPro11,2
04/11/2018 02:21:46 PM - DEBUG: Found enabled system ID MacBookPro11,2 - adding "NetInstall of Unconfigured macOS 10.12.6 (16G29) Install (12.39 GB)." to list
04/11/2018 02:21:46 PM - DEBUG: Found enabled system ID MacBookPro11,2 - adding "NetRestore of macOS 10.12.6 (16G1114) Volume (18.69 GB)." to list
04/11/2018 02:21:46 PM - DEBUG: -=========================================-
04/11/2018 02:21:46 PM - DEBUG: Return ACK[LIST] to 10.5.1.34 on 68
04/11/2018 02:21:47 PM - DEBUG: -=========================================-

(...repeats until I select an image...)

04/11/2018 02:21:51 PM - DEBUG: -=========================================-
04/11/2018 02:21:51 PM - DEBUG: Got BSDP INFORM[SELECT] packet:
04/11/2018 02:21:51 PM - DEBUG: Determining image list for system ID MacBookPro11,2
04/11/2018 02:21:51 PM - DEBUG: Found enabled system ID MacBookPro11,2 - adding "NetInstall of Unconfigured macOS 10.12.6 (16G29) Install (12.39 GB)." to list
04/11/2018 02:21:51 PM - DEBUG: Found enabled system ID MacBookPro11,2 - adding "NetRestore of macOS 10.12.6 (16G1114) Volume (18.69 GB)." to list
04/11/2018 02:21:51 PM - DEBUG: *********
Refreshing basedmgpath because DOCKER_BSDPY_NBI_URL uses hostname, not IP
04/11/2018 02:21:51 PM - DEBUG: Resolving hostname to IP - netboot01 -> 10.5.0.30
04/11/2018 02:21:51 PM - DEBUG: Found DOCKER_BSDPY_NBI_URL - using basedmgpath http://10.5.0.30/nbi/
*********

04/11/2018 02:21:51 PM - DEBUG: -=========================================-
04/11/2018 02:21:51 PM - DEBUG: Return ACK[SELECT] to 10.5.1.34 on 68
04/11/2018 02:21:51 PM - DEBUG: --> TFTP path: /nbi/netrestore-of-macos-sierra.nbi/i386/booter
-->Boot image URI: http://10.5.0.30/nbi/netrestore-of-macos-sierra.nbi/NetInstall.dmg

And in syslog, I can see the tftpd server responding:

==> /var/log/syslog <==
Apr 11 14:21:52 netboot01 tftpd[550]: tftpd: trying to get file: /nbi/netrestore-of-macos-sierra.nbi/i386/booter
Apr 11 14:21:52 netboot01 tftpd[550]: tftpd: serving file from /
Apr 11 14:21:52 netboot01 tftpd[552]: tftpd: trying to get file: /nbi/netrestore-of-macos-sierra.nbi/i386/booter
Apr 11 14:21:52 netboot01 tftpd[552]: tftpd: serving file from /

I don't see any activity logged in the HTTP server.

I tested both URIs listed in the bsdpyserver.log output -- both work, both HTTP and TFTP.

Thanks!

bruienne commented 6 years ago

Hi there! I'd take a look at what a verbose Netboot on the client shows you when it actually tries to hit the TFTP service. The general flow is TFTP -> HTTP -> OS but it's easy to have something between server and client interfering even if the DHCP/BSDP phase was successful and the client gets the URLs for TFTP and HTTP phases.

The way I do a verbose Netboot is holding down option at reboot, holding down cmd-V and then clicking on the Netboot image you're testing. Things can scroll by pretty fast in verbose boot so pointing a phone to take video can be very helpful. If it's a VM you can also just screen cap the VM window and review that way. I imagine you might see some kind of TFTP or HTTP error that is not floated up in regular GUI boot. Let us know what you find this way.

ghost commented 6 years ago

This is a great idea, thanks.

Unfortunately, I don't see anything that seems like a problem. In fact, the verbose boot doesn't even seem to indicate that it even tries to use the Netboot image at all. It jumps right away to mentioning local devices during the boot.

What does verbose Netboot look like? i.e. what am I looking for?

Thanks!

ghost commented 6 years ago

Also, the only error is at the very end, and is something like

error caller not allowed to perform action launchctl operation not permitted system integrity enabled

(paraphrasing a bit here)

bruienne commented 6 years ago

If it doesn't even seem to hit the first phase of TFTP there's a good chance it's getting bounced by the TFTP server, or is unable to reach it somehow. I would see what logging the TFTP server has to offer to see why TFTP is immediately failing.