raspberrypi / firmware

This repository contains pre-compiled binaries of the current Raspberry Pi kernel and modules, userspace libraries, and bootloader/GPU firmware.
5.15k stars 1.68k forks source link

Memory reported incorrectly during PXE Boot on RPi 2B #1041

Closed michalsc closed 5 years ago

michalsc commented 5 years ago

On RaspberryPi 2B the memory detected and reported by VC (through ATAGS as well as through mbox interface) is wrong when device boots over network. VC Says the ARM memory starts at 0x00000000 and has a size of 0x08000000, reported VC memory starts at 0x08000000 and has either a size of 0x08000000.

When booted directly from SD card correct amount of memory is reported (ARM: base 0x00000000, size 0x3b400000, VC: base 0x3b400000, Size 04c00000).

Steps to reproduce:

  1. Put bootcode.bin on SD card, bootcode.bin, start.elf and fixup.dat as well as config.txt on tftp server.
  2. Boot any bare metal code reading ATAGS or querying memory through VC mbox. The memory reported will be wrong.
  3. Put the same files directly on SD card and boot raspberry. Correct memory amounts will be reported.

Please note I have tested this behaviour on RaspberryPi 2B only.

The firmware from "next" branch is not affected and operates properly in either boot mode.

pelwell commented 5 years ago

You will be left with a 128MB system if the fixup.dat file doesn't match the start.elf - the Pi must have a matched pair. I suggest using md5sum to confirm that the correct files are on the server, and check in the logs on the PXE server that they are being sent.

Assuming that the above doesn't lead to a working system, you can get some boot-time diagnostics from the UART (on header pins 6, 8 and 10) using a modified (and recent) bootcode.bin:

$ sudo sed -i -e "s/BOOT_UART=0/BOOT_UART=1/" /boot/bootcode.bin
michalsc commented 5 years ago

If fixup.dat would not match start.elf then the same behavior would be observed in case of booting from SD card directly. As I wrote, the very same set of files works when booted from SD card, but results in 128M ram when PXE booted over network.

I will provide bootlogs this evening, have plenty of them.

PS. All the files I was using (bootcode.bin, start.elf, fixup.dat) come from official github repository, hash 3221a3d, downloaded yesterday.

michalsc commented 5 years ago

Please find bootlogs attached. For clearance I'Ve removed entries regarding EDID (no monitor attached)

  1. Boot from SD card, Kernel says it does not support device tree
Raspberry Pi Bootcode

Found SD card, config.txt = 1, start.elf = 1, recovery.elf = 0, timeout = 0
Read File: config.txt, 67 (bytes)

Raspberry Pi Bootcode
Read File: config.txt, 67
Read File: start.elf, 2846852 (bytes)
Read File: fixup.dat, 6660 (bytes)
MESS:00:00:02.167670:0: brfs: File read: /mfs/sd/config.txt
MESS:00:00:02.171822:0: brfs: File read: 67 bytes
[...]
MESS:00:00:02.265215:0: brfs: File read: /mfs/sd/config.txt
[...]
MESS:00:00:02.294951:0: gpioman: gpioman_get_pin_num: pin LEDS_PWR_OK not defined
MESS:00:00:02.311171:0: gpioman: gpioman_get_pin_num: pin WL_LPO_CLK not defined
MESS:00:00:02.316871:0: gpioman: gpioman_get_pin_num: pin BT_ON not defined
MESS:00:00:02.323552:0: gpioman: gpioman_get_pin_num: pin WL_ON not defined
MESS:00:00:02.504637:0: gpioman: gpioman_get_pin_num: pin DISPLAY_DSI_PORT not defined
MESS:00:00:02.512086:0: gpioman: gpioman_get_pin_num: pin LEDS_PWR_OK not defined
[...]
MESS:00:00:02.729056:0: *** Restart logging
MESS:00:00:02.732305:0: brfs: File read: 67 bytes
MESS:00:00:02.737855:0: Failed to open command line file 'cmdline.txt'
MESS:00:00:02.787617:0: brfs: File read: /mfs/sd/aros-armeb-raspi.img
MESS:00:00:02.792360:0: Loading 'aros-armeb-raspi.img' to 0x8000 size 0xaba60
MESS:00:00:02.799232:0: brfs: File read: 703072 bytes
MESS:00:00:02.812434:0: brfs: File read: /mfs/sd/aros-armeb-bsp.rom
MESS:00:00:02.817008:0: Loading 'aros-armeb-bsp.rom' to 0xb3a60 size 0x1b460
MESS:00:00:02.840937:0: Kernel trailer DTOK property says no
MESS:00:00:02.844921:0: brfs: File read: 111712 bytes
MESS:00:00:02.853185:0: brfs: File read: /mfs/sd/bcm2709-rpi-2-b.dtb
MESS:00:00:02.857847:0: Loading 'bcm2709-rpi-2-b.dtb' to 0xb3a60 size 0x5e33
MESS:00:00:03.017983:0: brfs: File read: 24115 bytes
MESS:00:00:03.021956:0: brfs: File read: /mfs/sd/config.txt
MESS:00:00:04.096129:0: gpioman: gpioman_get_pin_num: pin EMMC_ENABLE not defined
MESS:00:00:04.211054:0: uart: Set PL011 baud rate to 103448.300000 Hz
MESS:00:00:04.217344:0: uart: Baud rate change done...
MESS:00:00:04.220776:0: uart: Baud rate change done...
MESS:00:00:04.226475:0: gpioman: gpioman_get_pin_num: pin SDCARD_CONTROL_POWER not defined
[BOOT] MMU map 40000000:400fffff->40000000:400fffff, b=0, c=0, ap=3, tex=0
[BOOT] MMU map 3f000000:3fffffff->3f000000:3fffffff, b=1, c=0, ap=3, tex=0
[BOOT] Big-Endian AROS Bootstrap/ARM BCM2708
[BOOT] Arguments: 00000000, 00000c42, 00000000, bff7bfb9
[BOOT] UART clock speed: 48000000
[BOOT] using 26.2 divisor for 115200 baud
[BOOT] control register init:00c50878, now:02c5187c
[BOOT] main id register: 410fc075
[BOOT] Parsing ATAGS
[BOOT]   54410001: ATAG_CORE - Ignored
[BOOT]   54410002: ATAG_MEM (00000000-3b3fffff)
[BOOT] MMU map 00000000:3b3fffff->00000000:3b3fffff, b=1, c=1, ap=3, tex=1
[BOOT]   54420005: ATAG_INITRD2 (00800000-0081b45f)
[BOOT]   54410009: ATAG_CMDLINE "bcm2708_fb.fbwidth=656 bcm2708_fb.fbheight=416 bcm2708_fb.fbswap=1 vc_mem.mem_base=0x3ec00000 vc_mem.mem_size=0x40000000  console=ttyAMA0,115200 kgdboc=ttyAMA0,115200 console=tty1 root=/dev/mmcblk0p2 rootfstype=ext4 rootwait"
[BOOT] Query VC memory
[BOOT] Base = 3b400000, Size = 04c00000
[BOOT] MMU map 3b400000:3fffffff->3b400000:3fffffff, b=1, c=0, ap=3, tex=0
  1. Boot from PXE, Exactly the same files as before. Kernel says it does not support device tree:
Raspberry Pi Bootcode

Found SD card, config.txt = 0, start.elf = 0, recovery.elf = 0, timeout = 0
Trying USB
Hub device found at addr 4, enumerating HUB
Initialise hub
Found 5 ports, multi_tt = 1
Setting interface 0
Enabling PORT POWER on port 1
Enabling PORT POWER on port 2
Enabling PORT POWER on port 3
Enabling PORT POWER on port 4
Enabling PORT POWER on port 5
Waiting for devices to respond to reset
Found device on port 1
Found highspeed device
Device found: type = Ethernet adapter, addr = 5
Trying booting from Ethernet device addr 5
Initialise ethernet with MAC b8:27:eb:85:c0:f2
Wait for Link up
Link up
Sending DHCP request
Waiting for dhcp_reply
Done ARP for 112.2.168.192 got 50:e5:49:c5:c9:24
Read File: config.txt, 67 (bytes)

Raspberry Pi Bootcode
Read File: config.txt, 67
Read File: start.elf, 2846852 (bytes)
MESS:00:00:08.545407:0: brfs: File read: /mfs/sd/config.txt
MESS:00:00:08.549443:0: brfs: File read: 67 bytes
[...]
MESS:00:00:08.642813:0: brfs: File read: /mfs/sd/config.txt
MESS:00:00:08.672548:0: gpioman: gpioman_get_pin_num: pin LEDS_PWR_OK not defined
MESS:00:00:08.688773:0: gpioman: gpioman_get_pin_num: pin WL_LPO_CLK not defined
MESS:00:00:08.694473:0: gpioman: gpioman_get_pin_num: pin BT_ON not defined
MESS:00:00:08.701152:0: gpioman: gpioman_get_pin_num: pin WL_ON not defined
MESS:00:00:08.732247:0: gpioman: gpioman_get_pin_num: pin DISPLAY_DSI_PORT not defined
MESS:00:00:08.739696:0: gpioman: gpioman_get_pin_num: pin LEDS_PWR_OK not defined
[...]
MESS:00:00:08.956666:0: *** Restart logging
MESS:00:00:08.959915:0: brfs: File read: 67 bytes
MESS:00:00:08.965807:0: Failed to open command line file 'cmdline.txt'
MESS:00:00:09.413875:0: brfs: File read: /mfs/sd/aros-armeb-raspi.img
MESS:00:00:09.418616:0: Loading 'aros-armeb-raspi.img' to 0x8000 size 0xaba60
MESS:00:00:09.425485:0: brfs: File read: 703072 bytes
MESS:00:00:09.498135:0: brfs: File read: /mfs/sd/aros-armeb-bsp.rom
MESS:00:00:09.502708:0: Loading 'aros-armeb-bsp.rom' to 0xb3a60 size 0x1b460
MESS:00:00:09.526640:0: Kernel trailer DTOK property says no
MESS:00:00:09.530624:0: brfs: File read: 111712 bytes
MESS:00:00:09.550641:0: brfs: File read: /mfs/sd/bcm2709-rpi-2-b.dtb
MESS:00:00:09.555301:0: Loading 'bcm2709-rpi-2-b.dtb' to 0xb3a60 size 0x5e33
MESS:00:00:09.715371:0: brfs: File read: 24115 bytes
MESS:00:00:09.719330:0: brfs: File read: /mfs/sd/config.txt
MESS:00:00:10.797191:0: gpioman: gpioman_get_pin_num: pin EMMC_ENABLE not defined
MESS:00:00:10.912241:0: uart: Set PL011 baud rate to 103448.300000 Hz
MESS:00:00:10.918529:0: uart: Baud rate change done...
MESS:00:00:10.921962:0: uart: Baud rate change done...
[BOOT] MMU map 40000000:400fffff->40000000:400fffff, b=0, c=0, ap=3, tex=0
[BOOT] MMU map 3f000000:3fffffff->3f000000:3fffffff, b=1, c=0, ap=3, tex=0
[BOOT] Big-Endian AROS Bootstrap/ARM BCM2708
[BOOT] Arguments: 00000000, 00000c42, 00000000, fbc77bd1
[BOOT] UART clock speed: 48000000
[BOOT] using 26.2 divisor for 115200 baud
[BOOT] control register init:00c50878, now:02c5187c
[BOOT] main id register: 410fc075
[BOOT] Parsing ATAGS
[BOOT]   54410001: ATAG_CORE - Ignored
[BOOT]   54410002: ATAG_MEM (00000000-07ffffff)
[BOOT] MMU map 00000000:07ffffff->00000000:07ffffff, b=1, c=1, ap=3, tex=1
[BOOT]   54420005: ATAG_INITRD2 (00800000-0081b45f)
[BOOT]   54410009: ATAG_CMDLINE "bcm2708_fb.fbwidth=656 bcm2708_fb.fbheight=416 bcm2708_fb.fbswap=1 vc_mem.mem_base=0xec00000 vc_mem.mem_size=0x10000000  console=ttyAMA0,115200 kgdboc=ttyAMA0,115200 console=tty1 root=/dev/mmcblk0p2 rootfstype=ext4 rootwait"
[BOOT] Query VC memory
[BOOT] Base = 08000000, Size = 08000000
[BOOT] MMU map 08000000:0fffffff->08000000:0fffffff, b=1, c=0, ap=3, tex=0
  1. Booting from SD card, the very same files, kernel says it does support device tree:
Raspberry Pi Bootcode

Found SD card, config.txt = 1, start.elf = 1, recovery.elf = 0, timeout = 0
Read File: config.txt, 67 (bytes)

Raspberry Pi Bootcode
Read File: config.txt, 67
Read File: start.elf, 2846852 (bytes)
Read File: fixup.dat, 6660 (bytes)
MESS:00:00:02.168557:0: brfs: File read: /mfs/sd/config.txt
MESS:00:00:02.172709:0: brfs: File read: 67 bytes
[...]
MESS:00:00:02.266111:0: brfs: File read: /mfs/sd/config.txt
[...]
MESS:00:00:02.295846:0: gpioman: gpioman_get_pin_num: pin LEDS_PWR_OK not defined
MESS:00:00:02.312067:0: gpioman: gpioman_get_pin_num: pin WL_LPO_CLK not defined
MESS:00:00:02.317768:0: gpioman: gpioman_get_pin_num: pin BT_ON not defined
MESS:00:00:02.324448:0: gpioman: gpioman_get_pin_num: pin WL_ON not defined
MESS:00:00:02.505534:0: gpioman: gpioman_get_pin_num: pin DISPLAY_DSI_PORT not defined
MESS:00:00:02.512983:0: gpioman: gpioman_get_pin_num: pin LEDS_PWR_OK not defined
[...]
MESS:00:00:02.729954:0: *** Restart logging
MESS:00:00:02.733202:0: brfs: File read: 67 bytes
MESS:00:00:02.738751:0: Failed to open command line file 'cmdline.txt'
MESS:00:00:02.788520:0: brfs: File read: /mfs/sd/aros-armeb-raspi.img
MESS:00:00:02.793264:0: Loading 'aros-armeb-raspi.img' to 0x8000 size 0xaba60
MESS:00:00:02.800136:0: brfs: File read: 703072 bytes
MESS:00:00:02.814262:0: brfs: File read: /mfs/sd/aros-armeb-bsp.rom
MESS:00:00:02.818836:0: Loading 'aros-armeb-bsp.rom' to 0xb3a60 size 0x1b460
MESS:00:00:02.842765:0: Kernel trailer DTOK property says yes
MESS:00:00:02.846836:0: brfs: File read: 111712 bytes
MESS:00:00:02.855103:0: brfs: File read: /mfs/sd/bcm2709-rpi-2-b.dtb
MESS:00:00:02.859758:0: Loading 'bcm2709-rpi-2-b.dtb' to 0xb3a60 size 0x5e33
MESS:00:00:03.019784:0: brfs: File read: 24115 bytes
MESS:00:00:03.023753:0: brfs: File read: /mfs/sd/config.txt
MESS:00:00:04.101701:0: gpioman: gpioman_get_pin_num: pin EMMC_ENABLE not defined
MESS:00:00:04.218895:0: Device tree loaded to 0x2eff9d00 (size 0x622a)
MESS:00:00:04.224537:0: uart: Set PL011 baud rate to 103448.300000 Hz
MESS:00:00:04.231435:0: uart: Baud rate change done...
MESS:00:00:04.234864:0: uart: Baud rate change done...
MESS:00:00:04.240560:0: gpioman: gpioman_get_pin_num: pin SDCARD_CONTROL_POWER not defined
[BOOT] MMU map 40000000:400fffff->40000000:400fffff, b=0, c=0, ap=3, tex=0
[BOOT] MMU map 3f000000:3fffffff->3f000000:3fffffff, b=1, c=0, ap=3, tex=0
[BOOT] Big-Endian AROS Bootstrap/ARM BCM2708
[BOOT] Arguments: 00000000, 00000c42, 2eff9d00, fbc77bd1
[BOOT] UART clock speed: 48000000
[BOOT] using 26.2 divisor for 115200 baud
[BOOT] control register init:00c50878, now:02c5187c
[BOOT] main id register: 410fc075
[BOOT] Parsing ATAGS
[BOOT]   2a620000: (UNKNOWN)...
[BOOT] Query VC memory
[BOOT] Base = 3b400000, Size = 04c00000
[BOOT] MMU map 3b400000:3fffffff->3b400000:3fffffff, b=1, c=0, ap=3, tex=0
  1. Same as above, same files, PXE boot with device tree enabled. Please not how the "Device tree loaded" changes, probably suggesting an issue in start.elf
Raspberry Pi Bootcode

Found SD card, config.txt = 0, start.elf = 0, recovery.elf = 0, timeout = 0
Trying USB
Hub device found at addr 4, enumerating HUB
Initialise hub
Found 5 ports, multi_tt = 1
Setting interface 0
Enabling PORT POWER on port 1
Enabling PORT POWER on port 2
Enabling PORT POWER on port 3
Enabling PORT POWER on port 4
Enabling PORT POWER on port 5
Waiting for devices to respond to reset
Found device on port 1
Found highspeed device
Device found: type = Ethernet adapter, addr = 5
Trying booting from Ethernet device addr 5
Initialise ethernet with MAC b8:27:eb:85:c0:f2
Wait for Link up
Link up
Sending DHCP request
Waiting for dhcp_reply
Done ARP for 112.2.168.192 got 50:e5:49:c5:c9:24
Read File: config.txt, 67 (bytes)

Raspberry Pi Bootcode
Read File: config.txt, 67
Read File: start.elf, 2846852 (bytes)
MESS:00:00:07.869701:0: brfs: File read: /mfs/sd/config.txt
MESS:00:00:07.873738:0: brfs: File read: 67 bytes
[...]
MESS:00:00:07.967233:0: brfs: File read: /mfs/sd/config.txt
[...]
MESS:00:00:07.996967:0: gpioman: gpioman_get_pin_num: pin LEDS_PWR_OK not defined
MESS:00:00:08.013187:0: gpioman: gpioman_get_pin_num: pin WL_LPO_CLK not defined
MESS:00:00:08.018888:0: gpioman: gpioman_get_pin_num: pin BT_ON not defined
MESS:00:00:08.025567:0: gpioman: gpioman_get_pin_num: pin WL_ON not defined
MESS:00:00:08.056662:0: gpioman: gpioman_get_pin_num: pin DISPLAY_DSI_PORT not defined
MESS:00:00:08.064111:0: gpioman: gpioman_get_pin_num: pin LEDS_PWR_OK not defined
[...]
MESS:00:00:08.281081:0: *** Restart logging
MESS:00:00:08.284331:0: brfs: File read: 67 bytes
MESS:00:00:08.290226:0: Failed to open command line file 'cmdline.txt'
MESS:00:00:08.748796:0: brfs: File read: /mfs/sd/aros-armeb-raspi.img
MESS:00:00:08.753538:0: Loading 'aros-armeb-raspi.img' to 0x8000 size 0xaba60
MESS:00:00:08.760406:0: brfs: File read: 703072 bytes
MESS:00:00:08.833931:0: brfs: File read: /mfs/sd/aros-armeb-bsp.rom
MESS:00:00:08.838504:0: Loading 'aros-armeb-bsp.rom' to 0xb3a60 size 0x1b460
MESS:00:00:08.862445:0: Kernel trailer DTOK property says yes
MESS:00:00:08.866517:0: brfs: File read: 111712 bytes
MESS:00:00:08.886061:0: brfs: File read: /mfs/sd/bcm2709-rpi-2-b.dtb
MESS:00:00:08.890725:0: Loading 'bcm2709-rpi-2-b.dtb' to 0xb3a60 size 0x5e33
MESS:00:00:09.050754:0: brfs: File read: 24115 bytes
MESS:00:00:09.054751:0: brfs: File read: /mfs/sd/config.txt
MESS:00:00:10.135150:0: gpioman: gpioman_get_pin_num: pin EMMC_ENABLE not defined
MESS:00:00:10.252304:0: Device tree loaded to 0x7fe9d00 (size 0x6226)
MESS:00:00:10.257857:0: uart: Set PL011 baud rate to 103448.300000 Hz
MESS:00:00:10.264754:0: uart: Baud rate change done...
MESS:00:00:10.268186:0: uart: Baud rate change done...
[BOOT] MMU map 40000000:400fffff->40000000:400fffff, b=0, c=0, ap=3, tex=0
[BOOT] MMU map 3f000000:3fffffff->3f000000:3fffffff, b=1, c=0, ap=3, tex=0
[BOOT] Big-Endian AROS Bootstrap/ARM BCM2708
[BOOT] Arguments: 00000000, 00000c42, 07fe9d00, 8cfdede7
[BOOT] UART clock speed: 48000000
[BOOT] using 26.2 divisor for 115200 baud
[BOOT] control register init:00c50878, now:02c5187c
[BOOT] main id register: 410fc075
[BOOT] Parsing ATAGS
[BOOT]   26620000: (UNKNOWN)...
[BOOT] Query VC memory
[BOOT] Base = 08000000, Size = 08000000
[BOOT] MMU map 08000000:0fffffff->08000000:0fffffff, b=1, c=0, ap=3, tex=0

All four test cases done with freshly loaded files from main repository, same hardware, same files.

pelwell commented 5 years ago

Interesting. Notice that in the two boots from SD card the log includes the line:

Read File: fixup.dat, 6660 (bytes)

whereas in the failing PXE boot cases it is absent. This is the immediate cause of the missing memory, but we still need to determine the root cause.

Before I dive into the source of bootcode.bin, can you get logs from your PXE DHCP/TFTP server showing the transactions in the success and failure cases? Just one of the pairs, e.g. DTOK=yes, would be sufficient.

michalsc commented 5 years ago

I need to adjust settings of tftp server, gimme an hour (I'm afk now). But in the mean time I have changed start.elf and fixup.dat to versions from "next" branch, leaving bootcode.bin unchanged. And there it comes:

Raspberry Pi Bootcode Read File: config.txt, 67 Read File: start.elf, 2821476 (bytes) Read File: fixup.dat, 6567 (bytes)

So, it must be something in start.elf right? How else the same bootcode.bin would fail in one PXE boot case (with up to date start.elf and fixup.dat) and success in another PXE boot case (with start and fixup from older branch)?

So, as promised, logs from tftp in about one hour...

michalsc commented 5 years ago

Here's the log from tftp server:

Current version of bootcode.bin start.elf and fixup.dat:

Sep 20 20:00:13 dnsmasq-dhcp[21076]: 653460281 available DHCP subnet: 192.168.2.255/255.255.255.0 Sep 20 20:00:13 dnsmasq-dhcp[21076]: 653460281 vendor class: PXEClient:Arch:00000:UNDI:002001 Sep 20 20:00:13 dnsmasq-dhcp[21076]: 653460281 PXE(en0) b8:27:eb:85:c0:f2 proxy Sep 20 20:00:13 dnsmasq-dhcp[21076]: 653460281 tags: en0 Sep 20 20:00:13 dnsmasq-dhcp[21076]: 653460281 broadcast response Sep 20 20:00:13 dnsmasq-dhcp[21076]: 653460281 sent size: 1 option: 53 message-type 2 Sep 20 20:00:13 dnsmasq-dhcp[21076]: 653460281 sent size: 4 option: 54 server-identifier 192.168.2.112 Sep 20 20:00:13 dnsmasq-dhcp[21076]: 653460281 sent size: 9 option: 60 vendor-class 50:58:45:43:6c:69:65:6e:74 Sep 20 20:00:13 dnsmasq-dhcp[21076]: 653460281 sent size: 17 option: 97 client-machine-id 00:f2:c0:85:72:f2:c0:85:72:f2:c0:85:72:f2... Sep 20 20:00:13 dnsmasq-dhcp[21076]: 653460281 sent size: 32 option: 43 vendor-encap 06:01:03:0a:04:00:50:58:45:09:14:00:00:11... Sep 20 20:00:13 dnsmasq-tftp[21076]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/7285c0f2/start.elf not found Sep 20 20:00:13 dnsmasq-tftp[21076]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/autoboot.txt not found Sep 20 20:00:13 dnsmasq-tftp[21076]: sent /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/config.txt to 192.168.2.126 Sep 20 20:00:13 dnsmasq-tftp[21076]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/recovery.elf not found Sep 20 20:00:15 dnsmasq-tftp[21076]: sent /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/start.elf to 192.168.2.126 Sep 20 20:00:16 dnsmasq-tftp[21076]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/recovery.elf not found Sep 20 20:00:16 dnsmasq-tftp[21076]: sent /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/config.txt to 192.168.2.126 Sep 20 20:00:16 dnsmasq-tftp[21076]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/dt-blob.bin not found Sep 20 20:00:16 dnsmasq-tftp[21076]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/recovery.elf not found Sep 20 20:00:16 dnsmasq-tftp[21076]: sent /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/config.txt to 192.168.2.126 Sep 20 20:00:16 dnsmasq-tftp[21076]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/bootcfg.txt not found Sep 20 20:00:16 dnsmasq-tftp[21076]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/cmdline.txt not found Sep 20 20:00:16 dnsmasq-tftp[21076]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/armstub7.bin not found

Current version of bootcode.bin, but start.elf and fixup.dat from "next" branch:

Sep 20 20:03:04 dnsmasq-dhcp[21076]: 653460281 available DHCP subnet: 192.168.2.255/255.255.255.0 Sep 20 20:03:04 dnsmasq-dhcp[21076]: 653460281 vendor class: PXEClient:Arch:00000:UNDI:002001 Sep 20 20:03:04 dnsmasq-dhcp[21076]: 653460281 PXE(en0) b8:27:eb:85:c0:f2 proxy Sep 20 20:03:04 dnsmasq-dhcp[21076]: 653460281 tags: en0 Sep 20 20:03:04 dnsmasq-dhcp[21076]: 653460281 broadcast response Sep 20 20:03:04 dnsmasq-dhcp[21076]: 653460281 sent size: 1 option: 53 message-type 2 Sep 20 20:03:04 dnsmasq-dhcp[21076]: 653460281 sent size: 4 option: 54 server-identifier 192.168.2.112 Sep 20 20:03:04 dnsmasq-dhcp[21076]: 653460281 sent size: 9 option: 60 vendor-class 50:58:45:43:6c:69:65:6e:74 Sep 20 20:03:04 dnsmasq-dhcp[21076]: 653460281 sent size: 17 option: 97 client-machine-id 00:f2:c0:85:72:f2:c0:85:72:f2:c0:85:72:f2... Sep 20 20:03:04 dnsmasq-dhcp[21076]: 653460281 sent size: 32 option: 43 vendor-encap 06:01:03:0a:04:00:50:58:45:09:14:00:00:11... Sep 20 20:03:04 dnsmasq-tftp[21076]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/7285c0f2/start.elf not found Sep 20 20:03:04 dnsmasq-tftp[21076]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/autoboot.txt not found Sep 20 20:03:04 dnsmasq-tftp[21076]: sent /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/config.txt to 192.168.2.126 Sep 20 20:03:04 dnsmasq-tftp[21076]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/recovery.elf not found Sep 20 20:03:06 dnsmasq-tftp[21076]: sent /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/start.elf to 192.168.2.126 Sep 20 20:03:06 dnsmasq-tftp[21076]: sent /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/fixup.dat to 192.168.2.126 Sep 20 20:03:06 dnsmasq-tftp[21076]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/recovery.elf not found Sep 20 20:03:06 dnsmasq-tftp[21076]: sent /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/config.txt to 192.168.2.126 Sep 20 20:03:06 dnsmasq-tftp[21076]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/dt-blob.bin not found Sep 20 20:03:06 dnsmasq-tftp[21076]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/recovery.elf not found Sep 20 20:03:06 dnsmasq-tftp[21076]: sent /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/config.txt to 192.168.2.126 Sep 20 20:03:06 dnsmasq-tftp[21076]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/bootcfg.txt not found Sep 20 20:03:06 dnsmasq-tftp[21076]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/cmdline.txt not found Sep 20 20:03:06 dnsmasq-tftp[21076]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/armstub7.bin not found

As you may see, in current start.elf the fixup.dat is not even asked for...

pelwell commented 5 years ago

That doesn't make much sense (yet), because bootcode.bin loads both start.elf and fixup.dat (which is essentially relocation and resizing information which must be applied before execution passes to start.elf).

Down the rabbit hole it is, then...

michalsc commented 5 years ago

Let me know if I can help somehow apart from testing.

michalsc commented 5 years ago

Another observation. The same behaviour (current start.elf loads, fixup.dat does not load) occurs with PXE boot of both versions of bootcode.bin, the current one as well as the one from "next" branch. SO it really looks as if it would be related to start.elf only.

pelwell commented 5 years ago

From an initial read of the high level boot flow it is not possible to get to executing "start.elf" without first trying (to some extent) to load "fixup.dat", so that attempt must be failing. Deeper we go.

michalsc commented 5 years ago

Maybe some garbage in return value used as conditional further in the code? Something like: if (loadElfFile("start.elf")) { loadFile("fixup.dat"); }, where for some reason the return value is interpreted differently in case of the newest start.elf? Just guessing, can't tell anything without the code ;)

PS. I've ordered raspberry pi 3b+ and will test the PXE boot there (without bootcode.bin), either today or tomorrow.

pelwell commented 5 years ago

Although the name of the fixup file is conditional, trying to load it isn't.

I can't reproduce the problem here. This is a Pi 2B network booting the latest firmware:

Raspberry Pi Bootcode

Found SD card, config.txt = 0, start.elf = 0, recovery.elf = 0, timeout = 0
Trying USB
Hub device found at addr 4, enumerating HUB
Initialise hub
Found 5 ports, multi_tt = 1
Setting interface 0
Enabling PORT POWER on port 1
Enabling PORT POWER on port 2
Enabling PORT POWER on port 3
Enabling PORT POWER on port 4
Enabling PORT POWER on port 5
Waiting for devices to respond to reset
Found device on port 1
Found highspeed device
Device found: type = Ethernet adapter, addr = 5
Trying booting from Ethernet device addr 5
Initialise ethernet with MAC b8:27:eb:02:91:6c
Wait for Link up
Link up
Sending DHCP request
Waiting for dhcp_reply
Done ARP for 18.14.3.10 got 8a:5c:73:69:f3:58
Read File: config.txt, 30 (bytes)

Raspberry Pi Bootcode
Read File: config.txt, 30
Read File: start.elf, 2846852 (bytes)
Read File: fixup.dat, 6660 (bytes)
MESS:00:00:07.253196:0: brfs: File read: /mfs/sd/config.txt
MESS:00:00:07.257209:0: brfs: File read: 30 bytes
MESS:00:00:07.282940:0: HDMI:EDID version 1.3, 1 extensions, screen size 53x30 cm
MESS:00:00:07.288769:0: HDMI:EDID features - videodef 0x80 !standby !suspend active off; colour encoding:RGB...

What is in your config.txt?

michalsc commented 5 years ago

This is my config.txt:

kernel=aros-armeb-raspi.img initramfs aros-armeb-bsp.rom 0x00800000

pelwell commented 5 years ago

Those options are ignored by bootcode.bin, so it's effectively empty.

Can you run the following commands? You'll have to amend the paths (and where you run the commands) for your system, since you aren't running Raspbian. I've included the responses from my system (with the SD card automatically mounted to /media/pi/boot and the TFTP directory for this Pi mounted on /boot) with the latest firmware:

pi@raspberrypi:~ $ md5sum /media/pi/boot/bootcode.bin 
608fb17e087cc7c2866d02cf26f4dba5  /media/pi/boot/bootcode.bin
pi@raspberrypi:~ $ md5sum /boot/start.elf
924bef72224dd8590c9ce372f4444fe6  /boot/start.elf
pi@raspberrypi:~ $ md5sum /boot/fixup.dat 
715117d0e44ee1c659d85f95a5028e1a  /boot/fixup.dat
michalsc commented 5 years ago

I will when I get home where the system is (in about 6 hours). In the mean time I've checksumed the files on my local build (the same build system, theoretically the very same files):

bash-3.2$ md5 bootcode.bin fixup.dat start.elf MD5 (bootcode.bin) = 608fb17e087cc7c2866d02cf26f4dba5 MD5 (fixup.dat) = 715117d0e44ee1c659d85f95a5028e1a MD5 (start.elf) = 924bef72224dd8590c9ce372f4444fe6

the md5 for bootcode.bin is with BOOT_UART=1. In 6 hours md5 sum of the files as they are on sdcard and/or tftp server...

PS. Could it be that memory reservation for my initramfs corrupts somethig? or is it irrelevant?

pelwell commented 5 years ago

bootcode.bin only cares about start.elf, recovery.elf, fixup.dat, config.txt and autoboot.txt, and a very small subset of config.txt settings (which doesn't include kernel= or initramfs) - everything else is handled by start*.elf.

michalsc commented 5 years ago

I've just checked the files on tftp server. They do have exactly the same md5 sums as the ones you have. Can it be my raspberry is broken?

pelwell commented 5 years ago

It seems improbable that there would be hardware fault that would only show up with a few images, and to do so as consistently as this.

I'll put together a new bootcode with extended diagnostic checks and report back with a download link.

michalsc commented 5 years ago

Another observation: From time to time rpi 2b can load more or less files from tftp server. Once it managed to load fixup.dat and start.elf (the new ones) but failed to load config.txt and thus started to blink the green led constantly. After another reset cycle the old behaviour came back - the start.elf and config.txt are loaded, but fixup.dat not.

In all doubt I have also tried another tftp server. Until now I was using dnsmasq, now for test purposes I've switched to apples tftp server. Here, start.elf and fixup.dat were loaded only once, after another try (and now actually all the time) only start.elf gets loaded. Neither fixup.dat nor config.txt are received.

So, maybe there is some issue with tftp implementation in bootcode.bin? Or maybe some timing issue?

michalsc commented 5 years ago

I have just received raspberry3b+ and have similar issue. I'm now really sure it has something to do with the timing, as I do not have any other issues with my local network:

serial port log from raspberry (again, with hdmi logs removed):

Raspberry Pi Bootcode

USB ethernet boot Done ARP for 192.168.2.112 got 50:e5:49:c5:c9:24 Read File: config.txt, 70 (bytes)

Raspberry Pi Bootcode Read File: config.txt, 70 Read File: start.elf, 2847044 (bytes) MESS:00:00:14.938592:0: brfs: File read: /mfs/sd/config.txt MESS:00:00:14.942646:0: brfs: File read: 70 bytes MESS:00:00:14.956659:0: HDMI:EDID error reading EDID block 0 attempt 0 [...] MESS:00:00:15.036874:0: brfs: File read: /mfs/sd/config.txt MESS:00:00:15.040903:0: HDMI:Setting property pixel encoding to Default [...] MESS:00:00:15.285242:0: gpioman: gpioman_get_pin_num: pin DISPLAY_DSI_PORT not defined MESS:00:00:15.503288:0: *** Restart logging MESS:00:00:15.505777:0: brfs: File read: 70 bytes MESS:00:00:15.511259:0: Failed to open command line file 'cmdline.txt' MESS:00:00:15.863692:0: brfs: File read: /mfs/sd/aros-armeb-raspi.img MESS:00:00:15.868434:0: Loading 'aros-armeb-raspi.img' to 0x8000 size 0xaba60 MESS:00:00:15.875303:0: brfs: File read: 703072 bytes MESS:00:00:15.935201:0: brfs: File read: /mfs/sd/aros-armeb-bsp.rom MESS:00:00:15.939773:0: Loading 'aros-armeb-bsp.rom' to 0xb3a60 size 0x1b460 MESS:00:00:15.971828:0: Kernel trailer DTOK property says no MESS:00:00:15.975813:0: brfs: File read: 111712 bytes MESS:00:00:15.993978:0: brfs: File read: /mfs/sd/bcm2710-rpi-3-b-plus.dtb MESS:00:00:15.999073:0: Loading 'bcm2710-rpi-3-b-plus.dtb' to 0xb3a60 size 0x63e6 MESS:00:00:16.154085:0: brfs: File read: 25574 bytes MESS:00:00:16.158022:0: brfs: File read: /mfs/sd/config.txt MESS:00:00:17.192569:0: gpioman: gpioman_get_pin_num: pin EMMC_ENABLE not defined MESS:00:00:17.320115:0: uart: Set PL011 baud rate to 103448.300000 Hz MESS:00:00:17.326407:0: uart: Baud rate change done... MESS:00:00:17.329840:0: uart: Baud rate change done...

Interesting is that raspberry said it is loading my kernel files, whereas tftp says something different now (please note the "failed sending" messages):

Sep 22 12:12:48 dnsmasq-dhcp[39969]: 653460281 available DHCP subnet: 192.168.2.255/255.255.255.0 Sep 22 12:12:48 dnsmasq-dhcp[39969]: 653460281 vendor class: PXEClient:Arch:00000:UNDI:002001 Sep 22 12:12:48 dnsmasq-dhcp[39969]: 653460281 PXE(en0) b8:27:eb:ff:da:08 proxy Sep 22 12:12:48 dnsmasq-dhcp[39969]: 653460281 tags: en0 Sep 22 12:12:48 dnsmasq-dhcp[39969]: 653460281 broadcast response Sep 22 12:12:48 dnsmasq-dhcp[39969]: 653460281 sent size: 1 option: 53 message-type 2 Sep 22 12:12:48 dnsmasq-dhcp[39969]: 653460281 sent size: 4 option: 54 server-identifier 192.168.2.112 Sep 22 12:12:48 dnsmasq-dhcp[39969]: 653460281 sent size: 9 option: 60 vendor-class 50:58:45:43:6c:69:65:6e:74 Sep 22 12:12:48 dnsmasq-dhcp[39969]: 653460281 sent size: 17 option: 97 client-machine-id 00:08:da:ff:e1:08:da:ff:e1:08:da:ff:e1:08... Sep 22 12:12:48 dnsmasq-dhcp[39969]: 653460281 sent size: 32 option: 43 vendor-encap 06:01:03:0a:04:00:50:58:45:09:14:00:00:11... Sep 22 12:12:48 dnsmasq-tftp[39969]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/bootsig.bin not found Sep 22 12:12:48 dnsmasq-tftp[39969]: sent /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/bootcode.bin to 192.168.2.127 Sep 22 12:12:48 dnsmasq-tftp[39969]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/e1ffda08/start.elf not found Sep 22 12:12:48 dnsmasq-tftp[39969]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/autoboot.txt not found Sep 22 12:12:48 dnsmasq-tftp[39969]: sent /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/config.txt to 192.168.2.127 Sep 22 12:12:48 dnsmasq-tftp[39969]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/recovery.elf not found Sep 22 12:12:49 dnsmasq-tftp[39969]: sent /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/start.elf to 192.168.2.127 Sep 22 12:12:50 dnsmasq-tftp[39969]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/recovery.elf not found Sep 22 12:12:50 dnsmasq-tftp[39969]: sent /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/config.txt to 192.168.2.127 Sep 22 12:12:50 dnsmasq-tftp[39969]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/dt-blob.bin not found Sep 22 12:12:50 dnsmasq-tftp[39969]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/recovery.elf not found Sep 22 12:12:50 dnsmasq-tftp[39969]: sent /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/config.txt to 192.168.2.127 Sep 22 12:12:51 dnsmasq-tftp[39969]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/bootcfg.txt not found Sep 22 12:12:51 dnsmasq-tftp[39969]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/cmdline.txt not found Sep 22 12:12:51 dnsmasq-tftp[39969]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/armstub8.bin not found Sep 22 12:12:51 dnsmasq-tftp[39969]: error 0 Early terminate received from 192.168.2.127 Sep 22 12:12:51 dnsmasq-tftp[39969]: failed sending /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/aros-armeb-raspi.img to 192.168.2.127 Sep 22 12:12:51 dnsmasq-tftp[39969]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/armstub8-32.bin not found Sep 22 12:12:51 dnsmasq-tftp[39969]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/armstub7.bin not found Sep 22 12:12:51 dnsmasq-tftp[39969]: file /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/armstub.bin not found Sep 22 12:12:51 dnsmasq-tftp[39969]: sent /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/aros-armeb-raspi.img to 192.168.2.127 Sep 22 12:12:51 dnsmasq-tftp[39969]: sent /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/aros-armeb-bsp.rom to 192.168.2.127 Sep 22 12:12:51 dnsmasq-tftp[39969]: sent /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/bcm2710-rpi-3-b-plus.dtb to 192.168.2.127 Sep 22 12:12:52 dnsmasq-tftp[39969]: sent /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/config.txt to 192.168.2.127 Sep 22 12:13:08 dnsmasq-tftp[39969]: failed sending /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/fixup.dat to 192.168.2.127

The files are definitely there. I have just connected with another tftp client over network, set binary mode and fetched all the files. I have also verified with md5 that the transferred files are exactly the same as on tftp server...

pelwell commented 5 years ago

Can you install tcpdump on the server and sniff the traffic during booting:

$ sudo tcpdump -i eth0 port bootpc

(You may need to change the name of the interface from eth0 to whatever is appropriate.)

I'm hoping this will give answers faster than a special diagnostic build, or at least it will guide me as to where to insert the extra logging.

michalsc commented 5 years ago

tcpdump for raspi3b+:

tcpdump -i en0 -vv port bootpc tcpdump: listening on en0, link-type EN10MB (Ethernet), capture size 262144 bytes 22:19:20.515451 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 350) 0.0.0.0.bootpc > broadcasthost.bootps: [no cksum] BOOTP/DHCP, Request from b8:27:eb:ff:da:08 (oui Unknown), length 322, xid 0x26f30339, Flags [none] (0x0000) Client-Ethernet-Address b8:27:eb:ff:da:08 (oui Unknown) Vendor-rfc1048 Extensions Magic Cookie 0x63825363 DHCP-Message Option 53, length 1: Discover Parameter-Request Option 55, length 14: Vendor-Option, Vendor-Class, BF, Option 128 Option 129, Option 130, Option 131, Option 132 Option 133, Option 134, Option 135, TFTP Subnet-Mask, Default-Gateway ARCH Option 93, length 2: 0 NDI Option 94, length 3: 1.2.1 GUID Option 97, length 17: 0.8.218.255.225.8.218.255.225.8.218.255.225.8.218.255.225 Vendor-Class Option 60, length 32: "PXEClient:Arch:00000:UNDI:002001" 22:19:20.516176 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 342) michals-imac-2.bootps > broadcasthost.bootpc: [udp sum ok] BOOTP/DHCP, Reply, length 314, xid 0x26f30339, Flags [Broadcast] (0x8000) Client-Ethernet-Address b8:27:eb:ff:da:08 (oui Unknown) Vendor-rfc1048 Extensions Magic Cookie 0x63825363 DHCP-Message Option 53, length 1: Offer Server-ID Option 54, length 4: michals-imac-2 Vendor-Class Option 60, length 9: "PXEClient" GUID Option 97, length 17: 0.8.218.255.225.8.218.255.225.8.218.255.225.8.218.255.225 Vendor-Option Option 43, length 32: 6.1.3.10.4.0.80.88.69.9.20.0.0.17.82.97.115.112.98.101.114.114.121.32.80.105.32.66.111.111.116.255

tcpdump for raspi2b:

tcpdump -i en0 -vv port bootpc tcpdump: listening on en0, link-type EN10MB (Ethernet), capture size 262144 bytes 22:20:58.080937 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 350) 0.0.0.0.bootpc > broadcasthost.bootps: [no cksum] BOOTP/DHCP, Request from b8:27:eb:85:c0:f2 (oui Unknown), length 322, xid 0x26f30339, Flags [none] (0x0000) Client-Ethernet-Address b8:27:eb:85:c0:f2 (oui Unknown) Vendor-rfc1048 Extensions Magic Cookie 0x63825363 DHCP-Message Option 53, length 1: Discover Parameter-Request Option 55, length 14: Vendor-Option, Vendor-Class, BF, Option 128 Option 129, Option 130, Option 131, Option 132 Option 133, Option 134, Option 135, TFTP Subnet-Mask, Default-Gateway ARCH Option 93, length 2: 0 NDI Option 94, length 3: 1.2.1 GUID Option 97, length 17: 0.242.192.133.114.242.192.133.114.242.192.133.114.242.192.133.114 Vendor-Class Option 60, length 32: "PXEClient:Arch:00000:UNDI:002001" 22:20:58.083643 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 342) michals-imac-2.bootps > broadcasthost.bootpc: [udp sum ok] BOOTP/DHCP, Reply, length 314, xid 0x26f30339, Flags [Broadcast] (0x8000) Client-Ethernet-Address b8:27:eb:85:c0:f2 (oui Unknown) Vendor-rfc1048 Extensions Magic Cookie 0x63825363 DHCP-Message Option 53, length 1: Offer Server-ID Option 54, length 4: michals-imac-2 Vendor-Class Option 60, length 9: "PXEClient" GUID Option 97, length 17: 0.242.192.133.114.242.192.133.114.242.192.133.114.242.192.133.114 Vendor-Option Option 43, length 32: 6.1.3.10.4.0.80.88.69.9.20.0.0.17.82.97.115.112.98.101.114.114.121.32.80.105.32.66.111.111.116.255

I have cloned the firmware from github to my local machine. Tomorrow I will try to find the commit which is the first one where fixup.dat is not loaded on my machine. Will let you know asap.

pelwell commented 5 years ago

Sorry, can you do that again with port tftp instead of bootpc ?

michalsc commented 5 years ago

some tcpdump data for raspi 3b+:

actual start.elf (not loading):

sh-3.2# tcpdump -i en0 -vvv port tftp tcpdump: listening on en0, link-type EN10MB (Ethernet), capture size 262144 bytes 09:50:55.458992 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 49) 192.168.2.127.49153 > michals-imac-2.tftp: [no cksum] 21 RRQ "bootcode.bin" octet 09:50:55.495644 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 48) 192.168.2.127.49154 > michals-imac-2.tftp: [no cksum] 20 RRQ "bootsig.bin" octet 09:50:55.606256 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 63) 192.168.2.127.49153 > michals-imac-2.tftp: [no cksum] 35 RRQ "e1ffda08/start.elf" octet tsize 0 09:50:55.612055 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 57) 192.168.2.127.49154 > michals-imac-2.tftp: [no cksum] 29 RRQ "autoboot.txt" octet tsize 0 09:50:55.612495 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 55) 192.168.2.127.49155 > michals-imac-2.tftp: [no cksum] 27 RRQ "config.txt" octet tsize 0 09:50:55.619844 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 57) 192.168.2.127.49156 > michals-imac-2.tftp: [no cksum] 29 RRQ "recovery.elf" octet tsize 0 09:50:55.620248 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 54) 192.168.2.127.49157 > michals-imac-2.tftp: [no cksum] 26 RRQ "start.elf" octet tsize 0 09:50:57.055484 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 54) 192.168.2.127.49158 > michals-imac-2.tftp: [no cksum] 26 RRQ "fixup.dat" octet tsize 0 09:50:58.197067 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 57) 192.168.2.127.49153 > michals-imac-2.tftp: [no cksum] 29 RRQ "recovery.elf" octet tsize 0 09:50:58.197458 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 55) 192.168.2.127.49154 > michals-imac-2.tftp: [no cksum] 27 RRQ "config.txt" octet tsize 0 09:50:58.206404 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 56) 192.168.2.127.49155 > michals-imac-2.tftp: [no cksum] 28 RRQ "dt-blob.bin" octet tsize 0 09:50:58.295396 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 57) 192.168.2.127.49156 > michals-imac-2.tftp: [no cksum] 29 RRQ "recovery.elf" octet tsize 0 09:50:58.295845 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 55) 192.168.2.127.49157 > michals-imac-2.tftp: [no cksum] 27 RRQ "config.txt" octet tsize 0 09:50:58.769727 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 56) 192.168.2.127.49158 > michals-imac-2.tftp: [no cksum] 28 RRQ "bootcfg.txt" octet tsize 0 09:50:58.770492 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 56) 192.168.2.127.49159 > michals-imac-2.tftp: [no cksum] 28 RRQ "cmdline.txt" octet tsize 0 09:50:58.776206 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 65) 192.168.2.127.49160 > michals-imac-2.tftp: [no cksum] 37 RRQ "aros-armeb-raspi.img" octet tsize 0 09:50:58.776921 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 57) 192.168.2.127.49161 > michals-imac-2.tftp: [no cksum] 29 RRQ "armstub8.bin" octet tsize 0 09:50:58.777273 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 60) 192.168.2.127.49162 > michals-imac-2.tftp: [no cksum] 32 RRQ "armstub8-32.bin" octet tsize 0 09:50:58.777520 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 57) 192.168.2.127.49163 > michals-imac-2.tftp: [no cksum] 29 RRQ "armstub7.bin" octet tsize 0 09:50:58.777898 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 56) 192.168.2.127.49164 > michals-imac-2.tftp: [no cksum] 28 RRQ "armstub.bin" octet tsize 0 09:50:58.778299 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 65) 192.168.2.127.49165 > michals-imac-2.tftp: [no cksum] 37 RRQ "aros-armeb-raspi.img" octet tsize 0 09:50:59.139945 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 63) 192.168.2.127.49166 > michals-imac-2.tftp: [no cksum] 35 RRQ "aros-armeb-bsp.rom" octet tsize 0 09:50:59.240578 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 69) 192.168.2.127.49167 > michals-imac-2.tftp: [no cksum] 41 RRQ "bcm2710-rpi-3-b-plus.dtb" octet tsize 0 09:50:59.416636 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 55) 192.168.2.127.49168 > michals-imac-2.tftp: [no cksum] 27 RRQ "config.txt" octet tsize 0

additionally:

sh-3.2# tcpdump -i en0 -vvv port 49158 tcpdump: listening on en0, link-type EN10MB (Ethernet), capture size 262144 bytes 09:51:16.285518 IP (tos 0x0, ttl 64, id 27548, offset 0, flags [none], proto UDP (17), length 41, bad cksum 0 (->88e8)!) michals-imac-2.58276 > 192.168.2.127.49158: [bad udp cksum 0x8666 -> 0x267e!] UDP, length 13 09:51:30.367914 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 54) 192.168.2.127.49158 > michals-imac-2.tftp: [no cksum] 26 RRQ "fixup.dat" octet tsize 0 09:51:30.368090 IP (tos 0x0, ttl 64, id 54212, offset 0, flags [none], proto UDP (17), length 41, bad cksum 0 (->20c0)!) michals-imac-2.49683 > 192.168.2.127.49158: [bad udp cksum 0x8666 -> 0x480f!] UDP, length 13 09:51:32.082007 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 56) 192.168.2.127.49158 > michals-imac-2.tftp: [no cksum] 28 RRQ "bootcfg.txt" octet tsize 0 09:51:32.082231 IP (tos 0x0, ttl 64, id 11563, offset 0, flags [none], proto UDP (17), length 120, bad cksum 0 (->c70a)!) michals-imac-2.52972 > 192.168.2.127.49158: [bad udp cksum 0x86b5 -> 0x051a!] UDP, length 92 09:51:32.082266 IP (tos 0x0, ttl 64, id 17323, offset 0, flags [none], proto UDP (17), length 41, bad cksum 0 (->b0d9)!) michals-imac-2.49683 > 192.168.2.127.49158: [bad udp cksum 0x8666 -> 0x480f!] UDP, length 13 09:51:33.271993 IP (tos 0x0, ttl 64, id 43260, offset 0, flags [none], proto UDP (17), length 41, bad cksum 0 (->4b88)!) michals-imac-2.58276 > 192.168.2.127.49158: [bad udp cksum 0x8666 -> 0x267e!] UDP, length 13 09:51:35.088242 IP (tos 0x0, ttl 64, id 42305, offset 0, flags [none], proto UDP (17), length 41, bad cksum 0 (->4f43)!) michals-imac-2.49683 > 192.168.2.127.49158: [bad udp cksum 0x8666 -> 0x480f!] UDP, length 13

old start.elf (with successful loading of fixup.dat):

sh-3.2# tcpdump -i en0 -vvv port tftp tcpdump: listening on en0, link-type EN10MB (Ethernet), capture size 262144 bytes 09:53:37.894385 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 49) 192.168.2.127.49153 > michals-imac-2.tftp: [no cksum] 21 RRQ "bootcode.bin" octet 09:53:37.931535 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 48) 192.168.2.127.49154 > michals-imac-2.tftp: [no cksum] 20 RRQ "bootsig.bin" octet 09:53:38.042132 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 63) 192.168.2.127.49153 > michals-imac-2.tftp: [no cksum] 35 RRQ "e1ffda08/start.elf" octet tsize 0 09:53:38.048069 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 57) 192.168.2.127.49154 > michals-imac-2.tftp: [no cksum] 29 RRQ "autoboot.txt" octet tsize 0 09:53:38.048389 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 55) 192.168.2.127.49155 > michals-imac-2.tftp: [no cksum] 27 RRQ "config.txt" octet tsize 0 09:53:38.055735 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 57) 192.168.2.127.49156 > michals-imac-2.tftp: [no cksum] 29 RRQ "recovery.elf" octet tsize 0 09:53:38.056135 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 54) 192.168.2.127.49157 > michals-imac-2.tftp: [no cksum] 26 RRQ "start.elf" octet tsize 0 09:53:39.467999 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 54) 192.168.2.127.49158 > michals-imac-2.tftp: [no cksum] 26 RRQ "fixup.dat" octet tsize 0 09:53:39.616838 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 57) 192.168.2.127.49153 > michals-imac-2.tftp: [no cksum] 29 RRQ "recovery.elf" octet tsize 0 09:53:39.617341 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 55) 192.168.2.127.49154 > michals-imac-2.tftp: [no cksum] 27 RRQ "config.txt" octet tsize 0 09:53:39.618202 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 56) 192.168.2.127.49155 > michals-imac-2.tftp: [no cksum] 28 RRQ "dt-blob.bin" octet tsize 0 09:53:39.652275 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 57) 192.168.2.127.49156 > michals-imac-2.tftp: [no cksum] 29 RRQ "recovery.elf" octet tsize 0 09:53:39.652844 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 55) 192.168.2.127.49157 > michals-imac-2.tftp: [no cksum] 27 RRQ "config.txt" octet tsize 0 09:53:39.903003 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 56) 192.168.2.127.49158 > michals-imac-2.tftp: [no cksum] 28 RRQ "bootcfg.txt" octet tsize 0 09:53:39.903779 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 56) 192.168.2.127.49159 > michals-imac-2.tftp: [no cksum] 28 RRQ "cmdline.txt" octet tsize 0 09:53:39.904367 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 65) 192.168.2.127.49160 > michals-imac-2.tftp: [no cksum] 37 RRQ "aros-armeb-raspi.img" octet tsize 0 09:53:39.905018 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 57) 192.168.2.127.49161 > michals-imac-2.tftp: [no cksum] 29 RRQ "armstub8.bin" octet tsize 0 09:53:39.905369 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 60) 192.168.2.127.49162 > michals-imac-2.tftp: [no cksum] 32 RRQ "armstub8-32.bin" octet tsize 0 09:53:39.905746 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 57) 192.168.2.127.49163 > michals-imac-2.tftp: [no cksum] 29 RRQ "armstub7.bin" octet tsize 0 09:53:39.906121 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 56) 192.168.2.127.49164 > michals-imac-2.tftp: [no cksum] 28 RRQ "armstub.bin" octet tsize 0 09:53:39.906520 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 65) 192.168.2.127.49165 > michals-imac-2.tftp: [no cksum] 37 RRQ "aros-armeb-raspi.img" octet tsize 0 09:53:40.250867 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 63) 192.168.2.127.49166 > michals-imac-2.tftp: [no cksum] 35 RRQ "aros-armeb-bsp.rom" octet tsize 0 09:53:40.330702 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 69) 192.168.2.127.49167 > michals-imac-2.tftp: [no cksum] 41 RRQ "bcm2710-rpi-3-b-plus.dtb" octet tsize 0 09:53:40.491490 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 55) 192.168.2.127.49168 > michals-imac-2.tftp: [no cksum] 27 RRQ "config.txt" octet tsize 0

Additionally:

sh-3.2# tcpdump -i en0 -vvv port 49158 tcpdump: listening on en0, link-type EN10MB (Ethernet), capture size 262144 bytes 09:55:03.386577 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 54) 192.168.2.127.49158 > michals-imac-2.tftp: [no cksum] 26 RRQ "fixup.dat" octet tsize 0 09:55:03.386733 IP (tos 0x0, ttl 64, id 47563, offset 0, flags [none], proto UDP (17), length 41, bad cksum 0 (->3ab9)!) michals-imac-2.62114 > 192.168.2.127.49158: [bad udp cksum 0x8666 -> 0x177a!] UDP, length 13 09:55:03.386868 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 32) 192.168.2.127.49158 > michals-imac-2.62114: [no cksum] UDP, length 4 09:55:03.386916 IP (tos 0x0, ttl 64, id 40961, offset 0, flags [none], proto UDP (17), length 544, bad cksum 0 (->528c)!) michals-imac-2.62114 > 192.168.2.127.49158: [bad udp cksum 0x885d -> 0x703a!] UDP, length 516 09:55:03.387165 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 32) 192.168.2.127.49158 > michals-imac-2.62114: [no cksum] UDP, length 4 09:55:03.387212 IP (tos 0x0, ttl 64, id 47241, offset 0, flags [none], proto UDP (17), length 544, bad cksum 0 (->3a04)!) michals-imac-2.62114 > 192.168.2.127.49158: [bad udp cksum 0x885d -> 0x6576!] UDP, length 516 09:55:03.387414 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 32) 192.168.2.127.49158 > michals-imac-2.62114: [no cksum] UDP, length 4 09:55:03.387515 IP (tos 0x0, ttl 64, id 64860, offset 0, flags [none], proto UDP (17), length 544, bad cksum 0 (->f530)!) michals-imac-2.62114 > 192.168.2.127.49158: [bad udp cksum 0x885d -> 0xfe6d!] UDP, length 516 09:55:03.387789 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 32) 192.168.2.127.49158 > michals-imac-2.62114: [no cksum] UDP, length 4 09:55:03.387836 IP (tos 0x0, ttl 64, id 25957, offset 0, flags [none], proto UDP (17), length 544, bad cksum 0 (->8d28)!) michals-imac-2.62114 > 192.168.2.127.49158: [bad udp cksum 0x885d -> 0xee18!] UDP, length 516 09:55:03.388037 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 32) 192.168.2.127.49158 > michals-imac-2.62114: [no cksum] UDP, length 4 09:55:03.388083 IP (tos 0x0, ttl 64, id 53362, offset 0, flags [none], proto UDP (17), length 544, bad cksum 0 (->221b)!) michals-imac-2.62114 > 192.168.2.127.49158: [bad udp cksum 0x885d -> 0x2b5e!] UDP, length 516 09:55:03.388252 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 32) 192.168.2.127.49158 > michals-imac-2.62114: [no cksum] UDP, length 4 09:55:03.388348 IP (tos 0x0, ttl 64, id 61944, offset 0, flags [none], proto UDP (17), length 544, bad cksum 0 (->95)!) michals-imac-2.62114 > 192.168.2.127.49158: [bad udp cksum 0x885d -> 0xb395!] UDP, length 516 09:55:03.388532 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 32) 192.168.2.127.49158 > michals-imac-2.62114: [no cksum] UDP, length 4 09:55:03.388648 IP (tos 0x0, ttl 64, id 55506, offset 0, flags [none], proto UDP (17), length 544, bad cksum 0 (->19bb)!) michals-imac-2.62114 > 192.168.2.127.49158: [bad udp cksum 0x885d -> 0xfd19!] UDP, length 516 09:55:03.388914 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 32) 192.168.2.127.49158 > michals-imac-2.62114: [no cksum] UDP, length 4 09:55:03.388963 IP (tos 0x0, ttl 64, id 63098, offset 0, flags [none], proto UDP (17), length 544, bad cksum 0 (->fc12)!) michals-imac-2.62114 > 192.168.2.127.49158: [bad udp cksum 0x885d -> 0xa1a6!] UDP, length 516 09:55:03.389165 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 32) 192.168.2.127.49158 > michals-imac-2.62114: [no cksum] UDP, length 4 09:55:03.389280 IP (tos 0x0, ttl 64, id 7059, offset 0, flags [none], proto UDP (17), length 544, bad cksum 0 (->d6fa)!) michals-imac-2.62114 > 192.168.2.127.49158: [bad udp cksum 0x885d -> 0x30ab!] UDP, length 516 09:55:03.389542 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 32) 192.168.2.127.49158 > michals-imac-2.62114: [no cksum] UDP, length 4 09:55:03.389608 IP (tos 0x0, ttl 64, id 28616, offset 0, flags [none], proto UDP (17), length 544, bad cksum 0 (->82c5)!) michals-imac-2.62114 > 192.168.2.127.49158: [bad udp cksum 0x885d -> 0xc130!] UDP, length 516 09:55:03.389788 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 32) 192.168.2.127.49158 > michals-imac-2.62114: [no cksum] UDP, length 4 09:55:03.389843 IP (tos 0x0, ttl 64, id 14637, offset 0, flags [none], proto UDP (17), length 544, bad cksum 0 (->b960)!) michals-imac-2.62114 > 192.168.2.127.49158: [bad udp cksum 0x885d -> 0xfdca!] UDP, length 516 09:55:03.390036 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 32) 192.168.2.127.49158 > michals-imac-2.62114: [no cksum] UDP, length 4 09:55:03.390090 IP (tos 0x0, ttl 64, id 9872, offset 0, flags [none], proto UDP (17), length 544, bad cksum 0 (->cbfd)!) michals-imac-2.62114 > 192.168.2.127.49158: [bad udp cksum 0x885d -> 0x492e!] UDP, length 516 09:55:03.390292 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 32) 192.168.2.127.49158 > michals-imac-2.62114: [no cksum] UDP, length 4 09:55:03.390356 IP (tos 0x0, ttl 64, id 36820, offset 0, flags [none], proto UDP (17), length 455, bad cksum 0 (->6312)!) michals-imac-2.62114 > 192.168.2.127.49158: [bad udp cksum 0x8804 -> 0xc809!] UDP, length 427 09:55:03.390524 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 32) 192.168.2.127.49158 > michals-imac-2.62114: [no cksum] UDP, length 4 09:55:03.820461 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 56) 192.168.2.127.49158 > michals-imac-2.tftp: [no cksum] 28 RRQ "bootcfg.txt" octet tsize 0 09:55:03.820707 IP (tos 0x0, ttl 64, id 47916, offset 0, flags [none], proto UDP (17), length 120, bad cksum 0 (->3909)!) michals-imac-2.49180 > 192.168.2.127.49158: [bad udp cksum 0x86b5 -> 0x13ea!] UDP, length 92

pelwell commented 5 years ago

I think we're getting somewhere, but there isn't enough information in the traces to show that the server's responses are the same (or different) in the success and failure cases.

To save lots of back and forth, can you capture the traffic?:

# tcpdump -i en0 -w failure.pcap port 49158
# tcpdump -i en0 -w success.pcap port 49158

Afterwards you can view the capture to see that it looks correct with:

# tcpdump -r failure.pcap -vvv

If you can upload both capture files somewhere - DropBox, Google Drive etc. - I'll take a look ASAP.

michalsc commented 5 years ago

here you go: https://drive.google.com/open?id=1zq4L5u16nKh3lGYvsmZKAEOOEiT7-gvT

michalsc commented 5 years ago

And the same for raspberrypi 2b: https://drive.google.com/open?id=1XfWnOpbcnu3__5j-JMZp69loxOsKFknI

pelwell commented 5 years ago

Thanks - the captures are essentially the same for the two Pis, so you can stick to one from now on.

The data shows the Pi attempting to open fixup.dat, and the server sending a response containing the size of the file. In the success case the Pi then sends an ACK for block 0 which kick-starts the transmission, whereas in the failure case it goes on to the next file. From the logs you sent earlier I can see that there is no significant delay (400us) between requesting fixup.dat and recovery.elf, which suggests the response was received and rejected rather than going astray.

The only difference in the server responses is the file size - 6567 for success and 6660 for failure. Can you try a hybrid system with the latest start.elf but the older, smaller fixup.dat? Since the Pi can't know the content of the file at that stage, if that changes the behaviour then the problem is size-related.

If that doesn't change things, try the opposite pairing - old start.elf with new fixup.dat.

Note that the firmware should ignore a fixup.dat that doesn't match the .elf file, so you will still end up with only 128MB RAM.

pelwell commented 5 years ago

P.S. Further captures won't tell us any more at this stage - just look whether or not fixup.dat was fetched.

michalsc commented 5 years ago

old fixup.dat and new start.elf - fixup.dat is not loaded: Raspberry Pi Bootcode Read File: config.txt, 70 Read File: start.elf, 2847044 (bytes)

new fixup.dat and old start.elf - fixup.dat is loaded: Raspberry Pi Bootcode Read File: config.txt, 70 Read File: start.elf, 2821476 (bytes) Read File: fixup.dat, 6660 (bytes)

michalsc commented 5 years ago

BTW. Sometimes I see in the log of dnsmasq following:

Sep 23 12:06:29 dnsmasq-tftp[48750]: failed sending /Users/michal/Projects/AROS.raspi-armbe/bin/raspi-armeb/AROS/fixup.dat to 192.168.2.126

Maybe there is some ack missing?

pelwell commented 5 years ago

The pcap files show the server repeating the file size response, so it is treating the lack of a response about that file as an indication of packet loss, causing a retry. I presume that after some number of retries you get the error message above. So yes, it is the unsent ACK 0 from the Pi that causes the error, but that is a symptom rather than the underlying problem.

michalsc commented 5 years ago

I have done binary search on commits and noted where the combination fixup.dat + start.elf loads properly (+) or fails to load fixup.dat (-). After each change of files I have also restarted the tftp server since I have noted that without that it may fail to send data (bootcode.bin uses fixed communication port numbers which do not vary from boot to boot, so once fixup.dat stalls it will stall forever for given ip:port combination). Here the results:

pelwell commented 5 years ago

There is a debug build of bootcode.bin here that adds a bit of logging in a key place. What does it report in the failure case?

michalsc commented 5 years ago

Here's the log in failure case:

Raspberry Pi Bootcode

Found SD card, config.txt = 0, start.elf = 0, recovery.elf = 0, timeout = 0 Trying USB Hub device found at addr 4, enumerating HUB Initialise hub Found 5 ports, multi_tt = 1 Setting interface 0 Enabling PORT POWER on port 1 Enabling PORT POWER on port 2 Enabling PORT POWER on port 3 Enabling PORT POWER on port 4 Enabling PORT POWER on port 5 Waiting for devices to respond to reset Found device on port 1 Found highspeed device Device found: type = Ethernet adapter, addr = 5 Trying booting from Ethernet device addr 5 Initialise ethernet with MAC b8:27:eb:85:c0:f2 Wait for Link up Link up Sending DHCP request Waiting for dhcp_reply Done ARP for 192.168.2.112 got 50:e5:49:c5:c9:24 tftp_rrq 7285c0f2/start.elf - c0a8027e 35, 43, 63, 77 -> 0 tftp_rrq autoboot.txt - c0a8027e 29, 37, 57, 71 -> 0 tftp_rrq config.txt - c0a8027e 27, 35, 55, 69 -> 0 Read File: config.txt, 70 (bytes)

Raspberry Pi Bootcode (Sep 24 2018 04:13:39) Read File: config.txt, 70 tftp_rrq recovery.elf - c0a8027e 29, 37, 57, 71 -> 0 tftp_rrq start.elf - c0a8027e 26, 34, 54, 68 -> 0 Read File: start.elf, 2847044 (bytes) tftp_rrq fixup.dat - c0a8027e 26, 34, 54, 68 -> 0 MESS:00:00:07.976631:0: brfs: File read: /mfs/sd/config.txt MESS:00:00:07.980684:0: brfs: File read: 70 bytes MESS:00:00:07.994693:0: HDMI:EDID error reading EDID block 0 attempt 0 MESS:00:00:08.000759:0: HDMI:EDID error reading EDID block 0 attempt 1 MESS:00:00:08.007008:0: HDMI:EDID error reading EDID block 0 attempt 2 MESS:00:00:08.013258:0: HDMI:EDID error reading EDID block 0 attempt 3 MESS:00:00:08.019508:0: HDMI:EDID error reading EDID block 0 attempt 4 MESS:00:00:08.025758:0: HDMI:EDID error reading EDID block 0 attempt 5 MESS:00:00:08.032008:0: HDMI:EDID error reading EDID block 0 attempt 6 MESS:00:00:08.038258:0: HDMI:EDID error reading EDID block 0 attempt 7 MESS:00:00:08.044508:0: HDMI:EDID error reading EDID block 0 attempt 8 MESS:00:00:08.050758:0: HDMI:EDID error reading EDID block 0 attempt 9 MESS:00:00:08.056771:0: HDMI:EDID giving up on reading EDID block 0 MESS:00:00:08.074287:0: brfs: File read: /mfs/sd/config.txt MESS:00:00:08.078316:0: HDMI:Setting property pixel encoding to Default MESS:00:00:08.084502:0: HDMI:Setting property pixel clock type to PAL MESS:00:00:08.090665:0: HDMI:Setting property content type flag to No data MESS:00:00:08.097262:0: HDMI:Setting property fuzzy format match to enabled MESS:00:00:08.104027:0: gpioman: gpioman_get_pin_num: pin LEDS_PWR_OK not defined MESS:00:00:08.120284:0: gpioman: gpioman_get_pin_num: pin WL_LPO_CLK not defined MESS:00:00:08.125984:0: gpioman: gpioman_get_pin_num: pin BT_ON not defined MESS:00:00:08.132664:0: gpioman: gpioman_get_pin_num: pin WL_ON not defined MESS:00:00:08.163732:0: gpioman: gpioman_get_pin_num: pin DISPLAY_DSI_PORT not defined MESS:00:00:08.171181:0: gpioman: gpioman_get_pin_num: pin LEDS_PWR_OK not defined MESS:00:00:08.177193:0: hdmi: HDMI:hdmi_get_state is deprecated, use hdmi_get_display_state instead MESS:00:00:08.185932:0: hdmi: HDMI:>>>>>>>>>>>>>Rx sensed, reading EDID<<<<<<<<<<<<< MESS:00:00:08.193679:0: hdmi: HDMI:EDID error reading EDID block 0 attempt 0 MESS:00:00:08.201409:0: hdmi: HDMI:EDID error reading EDID block 0 attempt 1 MESS:00:00:08.208176:0: hdmi: HDMI:EDID error reading EDID block 0 attempt 2 MESS:00:00:08.214947:0: hdmi: HDMI:EDID error reading EDID block 0 attempt 3 MESS:00:00:08.221718:0: hdmi: HDMI:EDID error reading EDID block 0 attempt 4 MESS:00:00:08.228488:0: hdmi: HDMI:EDID error reading EDID block 0 attempt 5 MESS:00:00:08.235260:0: hdmi: HDMI:EDID error reading EDID block 0 attempt 6 MESS:00:00:08.242030:0: hdmi: HDMI:EDID error reading EDID block 0 attempt 7 MESS:00:00:08.248801:0: hdmi: HDMI:EDID error reading EDID block 0 attempt 8 MESS:00:00:08.255573:0: hdmi: HDMI:EDID error reading EDID block 0 attempt 9 MESS:00:00:08.262105:0: hdmi: HDMI:EDID giving up on reading EDID block 0 MESS:00:00:08.267620:0: hdmi: HDMI: No lookup table for resolution group 0 MESS:00:00:08.274206:0: hdmi: HDMI: hotplug attached with DVI support MESS:00:00:08.280383:0: hdmi: HDMI:hdmi_get_state is deprecated, use hdmi_get_display_state instead MESS:00:00:08.289419:0: hdmi: HDMI:EDID error reading EDID block 0 attempt 0 MESS:00:00:08.297152:0: hdmi: HDMI:EDID error reading EDID block 0 attempt 1 MESS:00:00:08.303922:0: hdmi: HDMI:EDID error reading EDID block 0 attempt 2 MESS:00:00:08.310693:0: hdmi: HDMI:EDID error reading EDID block 0 attempt 3 MESS:00:00:08.317465:0: hdmi: HDMI:EDID error reading EDID block 0 attempt 4 MESS:00:00:08.324235:0: hdmi: HDMI:EDID error reading EDID block 0 attempt 5 MESS:00:00:08.331006:0: hdmi: HDMI:EDID error reading EDID block 0 attempt 6 MESS:00:00:08.337777:0: hdmi: HDMI:EDID error reading EDID block 0 attempt 7 MESS:00:00:08.344548:0: hdmi: HDMI:EDID error reading EDID block 0 attempt 8 MESS:00:00:08.351318:0: hdmi: HDMI:EDID error reading EDID block 0 attempt 9 MESS:00:00:08.357853:0: hdmi: HDMI:EDID giving up on reading EDID block 0 MESS:00:00:08.363372:0: hdmi: HDMI: hotplug deassert MESS:00:00:08.368040:0: hdmi: HDMI: HDMI is currently off MESS:00:00:08.373161:0: hdmi: HDMI: changing mode to unplugged MESS:00:00:08.378726:0: hdmi: HDMI:hdmi_get_state is deprecated, use hdmi_get_display_state instead MESS:00:00:08.388170:0: *** Restart logging MESS:00:00:08.391400:0: brfs: File read: 70 bytes MESS:00:00:08.396905:0: Failed to open command line file 'cmdline.txt' MESS:00:00:08.563438:0: brfs: File read: /mfs/sd/aros-armeb-raspi.img MESS:00:00:08.568180:0: Loading 'aros-armeb-raspi.img' to 0x8000 size 0x3baac MESS:00:00:08.575049:0: brfs: File read: 244396 bytes MESS:00:00:08.649695:0: brfs: File read: /mfs/sd/aros-armeb-bsp.rom MESS:00:00:08.654268:0: Loading 'aros-armeb-bsp.rom' to 0x43aac size 0x1b460 MESS:00:00:08.686420:0: Kernel trailer DTOK property says yes MESS:00:00:08.690492:0: brfs: File read: 111712 bytes MESS:00:00:08.708202:0: brfs: File read: /mfs/sd/bcm2709-rpi-2-b.dtb MESS:00:00:08.712855:0: Loading 'bcm2709-rpi-2-b.dtb' to 0x43aac size 0x5e33 MESS:00:00:08.872913:0: brfs: File read: 24115 bytes MESS:00:00:08.876892:0: brfs: File read: /mfs/sd/config.txt MESS:00:00:09.957305:0: gpioman: gpioman_get_pin_num: pin EMMC_ENABLE not defined MESS:00:00:10.074601:0: Device tree loaded to 0x7fcdd00 (size 0x6226) MESS:00:00:10.080193:0: uart: Set PL011 baud rate to 103448.300000 Hz MESS:00:00:10.087050:0: uart: Baud rate change done... MESS:00:00:10.090484:0: uart: Baud rate change done...

pelwell commented 5 years ago

An updated debug bootcode.bin here pays more attention to the per-file transactions. Can you try again?

michalsc commented 5 years ago

I think the first part of log will be most interesting for you:

Raspberry Pi Bootcode

Found SD card, config.txt = 0, start.elf = 0, recovery.elf = 0, timeout = 0 Trying USB Hub device found at addr 4, enumerating HUB Initialise hub Found 5 ports, multi_tt = 1 Setting interface 0 Enabling PORT POWER on port 1 Enabling PORT POWER on port 2 Enabling PORT POWER on port 3 Enabling PORT POWER on port 4 Enabling PORT POWER on port 5 Waiting for devices to respond to reset Found device on port 1 Found highspeed device Device found: type = Ethernet adapter, addr = 5 Trying booting from Ethernet device addr 5 Initialise ethernet with MAC b8:27:eb:85:c0:f2 Wait for Link up Link up Sending DHCP request Waiting for dhcp_reply Done ARP for 192.168.2.112 got 50:e5:49:c5:c9:24 eth_open 7285c0f2/start.elf: file not found eth_open autoboot.txt: file not found eth_open config.txt: open ack (tsize 70) first data (port 49795) EOF (data_len 70) Read File: config.txt, 70 (bytes)

Raspberry Pi Bootcode (Sep 25 2018 00:55:51) Read File: config.txt, 70 eth_open recovery.elf: file not found eth_open start.elf: open ack (tsize 2847044) first data (port 51841) EOF (data_len 2847044) Read File: start.elf, 2847044 (bytes) eth_open fixup.dat: timeout MESS:00:00:08.070833:0: brfs: File read: /mfs/sd/config.txt MESS:00:00:08.074886:0: brfs: File read: 70 bytes

pelwell commented 5 years ago

That timeout message is a big clue, eliminating a lot of possibilities. I'll refine the logging further tomorrow.

maxnet commented 5 years ago

You seem to be running Mac OS on non-Apple hardware (judging by Gigabyte OUI) Did you also try using Linux as operating system on the server, to rule out any Hackintosh specific problems?

==

timeout

@pelwell It would be nice if timeout in bootcode could be raised to say 10 seconds, to be able to deal with any possible packet loss. tftp server retransmits packets automatically if not acknowledged. But right now the bootcode only seems to wait a second or so before timing out, meaning it does not wait for the retransmits and a single packet being lost is fatal.

Related: https://github.com/raspberrypi/firmware/issues/1025 https://github.com/raspberrypi/firmware/issues/991

michalsc commented 5 years ago

Did you also try using Linux as operating system on the server, to rule out any Hackintosh specific problems?

No. Instead I have tried two different tftp servers on the machine and both misbehave with raspberry, but both misbehave differently. In order to verify the tftp itself is not an issue here I have tried connecting to the tftp server from other machines and other tftp clients within same network. All tftp clients but bootcode.bin worked properly. This alone is an argument for me that the issue is not related to the machine I'm using.

maxnet commented 5 years ago

All tftp clients but bootcode.bin worked properly.

Problem is that the bootcode is less tolerant than your typical tftp client.

If there is any packet loss, it is fatal to the bootcode, as indicated in the issues linked. A normal tftp client will only fail if also the 3 packet retransmits are lost.

Also a normal tftp client may not request the file size (it's an optional tftp extension).

michalsc commented 5 years ago

Problem is that the bootcode is less tolerant than your typical tftp client.

If there is any packet loss, it is fatal to the bootcode, as indicated in the issues linked. A normal tftp client will only fail if also the 3 packet retransmits are lost.

Therefore I hope that my opened issue will help to make bootcode more tftp rfc compatible and failure resistant and thus will also help people who reported the two issues you've mentioned.

The problem is, actually, that you do now know how many people are using network booting on raspberry. Until now all tutorials I found suggest using the outdated "next" branch which do not have any issues and I could just happily use it too, unless I try to boot pi 3b+. Once such people will change to the main branch and will upgrade boot files, many of them can be frustrated by finding out that their pxe-boot raspberry configuration suddenly stops working.

maxnet commented 5 years ago

The problem is, actually, that you do now know how many people are using network booting on raspberry. Until now all tutorials I found suggest using the outdated "next" branch which do not have any issues and I could just happily use it too, unless I try to boot pi 3b+. Once such people will change to the main branch and will upgrade boot files, many of them can be frustrated by finding out that their pxe- boot raspberry configuration suddenly stops working.

Do note that piserver uses main branch firmware files (same versions that ship with Raspbian), and does is able to boot Pi 3+.

While I do agree with you that it would be nice if someone would fix the firmware's shortcomings. I do also think there may be other contributing factors that it is not working in your case, of which your choice of operating system (and its community contributed NIC drivers of varying quality) could be one of them.

pelwell commented 5 years ago

I'm open to reviewing timeouts and retries - everything seems to be one second - once I've understood this problem.

There's another debug bootcode.bin in the usual place - it gets more chatty once it gets to the fixup file.

michalsc commented 5 years ago

And there's debug log. I have attached two files there. raspi-50.cap is the non working one, whereas raspi-51.cap is with older fixup.dat + startup.elf combination.

bootlog

Sorry I can help only this way. Let me know if I can do anything else helping you :)

pelwell commented 5 years ago

Thanks. Those results are not at all what I expected, but that's no bad thing. At first glance I can't work out what's going on...

I can only apologise about the slow progress on this - just keep on running the tests and posting the results.

pelwell commented 5 years ago

Try this one for size.

michalsc commented 5 years ago

Here it goes

pelwell commented 5 years ago

Ping!

michalsc commented 5 years ago

Pong!