netbootxyz / netboot.xyz

Your favorite operating systems in one place. A network-based bootable operating system installer based on iPXE.
https://netboot.xyz
Apache License 2.0

USB image - Network unreachable #283

Open abitrolly opened 5 years ago

abitrolly commented 5 years ago

The netboot.xyz.usb USB image from https://boot.netboot.xyz/ipxe/netboot.xyz.usb fails to reach the network, both on real hardware connected to the LAN and in QEMU on this laptop. The error message is the same in both cases.

$ md5sum netboot.xyz.usb
10f28984666d1e836f7b650615088708  netboot.xyz.usb

$ sudo dd if=netboot.xyz.usb of=/dev/sdc
2752+0 records in
2752+0 records out
1409024 bytes (1.4 MB, 1.3 MiB) copied, 2.20912 s, 638 kB/s

$ sudo qemu-system-x86_64 -hda /dev/sdc

image

abitrolly commented 5 years ago

While debugging, I found that nslookup resolves names to IPv6 addresses.

iPXE> nslookup address ipxe.org
iPXE> echo ${address}
2001:ba8:0:1d4::6950:5845

http://ipxe.org/cmd/nslookup

But my network doesn't support IPv6.

abitrolly commented 5 years ago

Route command shows two net0 addresses too.

iPXE> route
net0: 192.168 ...
net0: fe80:: ...

My router seems to be IPv6-capable, which is probably why I get an IPv6 address and names resolve to IPv6, but it seems that my upstream provider is not. How can I make iPXE fall back to IPv4 if IPv6 fails?

pali commented 5 years ago

Same problem here. It is because iPXE itself has broken IPv6 support.

I compiled my own iPXE version with IPv6 support disabled. After that, networking in this iPXE build started working correctly.

But because https://boot.netboot.xyz/menu.ipxe does "Attempting to chain to latest version...", it is not possible to use my own working version of iPXE with netboot.xyz. netboot.xyz just replaces the working iPXE with its own iPXE version, which has broken IPv6 support, and therefore netboot.xyz is unusable.

antonym commented 5 years ago

Hmmm... yeah, I think I see the problem. I'm just using the dhcp command, so it tries all interfaces and all types of networking regardless of whether the upstream network can support them.

I could try doing:

ifconf -c dhcp || ifconf -c ipv6 || goto netconfig

https://github.com/antonym/netboot.xyz/blob/master/ipxe/disks/netboot.xyz#L18

That may force IPv4 first, then fall back to IPv6, and then go to manual config. (http://ipxe.org/cmd/ifconf)

antonym commented 5 years ago

Just dumping notes here, but I could also do some validation to ensure we can hit boot.netboot.xyz:

ping --count 1 boot.netboot.xyz && goto ipv4_good || goto ipv4_bad

abitrolly commented 5 years ago

@antonym ifconf -c dhcp still configures IPv6. It seems there is no way to disable it. I asked at http://lists.ipxe.org/pipermail/ipxe-devel/2018-November/006337.html and am still waiting for replies.

ping: command not found.

netboot.xyz 1.04

NiKiZe commented 5 years ago

ping could be blocked by firewalls. I would suggest hitting an HTTP/HTTPS endpoint to determine whether the connection is good and then falling back, maybe even checking the error code.

@abitrolly I remember there being a few other discussions on the list about ipv6 recently, maybe that could help.

Just to have it in text: the error in the image is "Network unreachable", http://ipxe.org/280a6011
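
A minimal sketch of the HTTP-probe idea, run from a Linux host on the same LAN rather than inside iPXE (iPXE has no curl). It assumes curl is installed; the function names `probe`, `classify` and `check_url` are made up for illustration. It only shows which address family can actually reach an endpoint, which may help confirm whether the network hands out IPv6 without routing it.

```shell
#!/bin/sh
# Sketch only: probes an HTTP(S) URL separately over IPv4 and IPv6 from a
# Linux host on the same LAN. It does not run inside iPXE.

# probe URL FAMILY -> exit status 0 if the URL answers. FAMILY is 4 or 6.
probe() {
    curl "-$2" --silent --fail --head --max-time 5 --output /dev/null "$1"
}

# classify V4_STATUS V6_STATUS -> prints a one-word verdict.
classify() {
    if [ "$1" -eq 0 ] && [ "$2" -eq 0 ]; then
        echo dual-stack
    elif [ "$1" -eq 0 ]; then
        echo ipv4-only
    elif [ "$2" -eq 0 ]; then
        echo ipv6-only
    else
        echo unreachable
    fi
}

# check_url URL -> probes both families and prints the verdict.
check_url() {
    if probe "$1" 4; then v4=0; else v4=1; fi
    if probe "$1" 6; then v6=0; else v6=1; fi
    classify "$v4" "$v6"
}
```

Usage would be something like `check_url http://boot.netboot.xyz/menu.ipxe`. A verdict of `ipv4-only` on a LAN that still assigns IPv6 addresses would be consistent with the situation described in this thread, though it does not prove iPXE fails for the same reason.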

hugalafutro commented 3 years ago

Hi, I am hitting this exact same issue with netboot.xyz running inside a Docker container. Is there any way to make this work? I use an OpenWRT router. I tried disabling the IPv6 interfaces, but that just killed my internet, even though I don't use IPv6 and my ISP doesn't provide it.

github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

abitrolly commented 3 years ago

@hugalafutro if your ISP doesn't support ipv6, how could disabling it kill the internet? Can you post what you do?

pali commented 3 years ago

@abitrolly: I have not tested iPXE again, but at the time I wrote my last comment it was because iPXE preferred an IPv6 connection, and if the ISP provided none, that obviously killed the internet. Note that the local network can be running a DHCPv6 server or sending RA packets with only non-routable ULA addresses; it is perfectly fine to have a ULA or link-local address assigned even when you do not have an IPv6 internet connection.

abitrolly commented 3 years ago

@pali do you mean iPXE device could access the network, but could not access the internet?

pali commented 3 years ago

@abitrolly: basically yes.

And there was also another issue in iPXE. On a stateful DHCPv6-only network (where the ISP or another router provides real IPv6 internet), iPXE detected that it was IPv6-capable (probably via RA) and preferred to use IPv6. But iPXE was not able to obtain an IPv6 address from the stateful DHCPv6 server (maybe it did not implement a stateful DHCPv6 client?) and just threw the "Network unreachable" error.

pali commented 3 years ago

So in both cases, completely disabling IPv6 in iPXE fixed those issues. iPXE then never tried to use IPv6, so it used IPv4, which worked.

razamatan commented 3 years ago

I'm hitting this issue chainloading netboot.xyz from my working ipxe with no ipv6 support...

lukyrys commented 3 years ago

Same problem via PXE. I tried many versions; both DHCP and DHCP-undionly were tested, with the same result.

image

When I go to "Manual network configuration" and just step through it without changing any value, it works (https still gives the same error, but http works). I think the eth0 link comes up late, after the DHCP request is already in progress. Maybe an extra check that the link is up, or a sleep, would help?

romanchyla commented 3 years ago

was facing the same issue (v2.0.33), on OpenWrt disabling IPv6 solved it:


sysctl -w net.ipv6.conf.all.disable_ipv6=1
echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6
sysctl -w net.ipv6.conf.default.disable_ipv6=1

woeisme commented 2 years ago

Came across the same issue today, afaik ipv6 already disabled on my network (openwrt / lede)

xeijin commented 2 years ago

@woeisme the option in OpenWRT is buried pretty deep, but it worked for me after unticking the box:

Screenshot 2021-10-31 at 16 33 43 Screenshot 2021-10-31 at 16 27 00

github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

razamatan commented 2 years ago

still a problem... still needs some love.

foxt commented 2 years ago

Can confirm, on all 3 machines I've tried netboot on, all of them required me to go into manual configuration and press enter a few times to get it to work

github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

hello-smile6 commented 2 years ago

STILL BROKEN, BOT! STOP CLOSING IT!

Openanonwriter commented 2 years ago

Having the same issue. We do not have IPv6 on the iPXE machine. I did a capture using Wireshark: I can see that it looks up AAAA records and fails, as it should, but it does not try IPv4 afterwards; it just hard fails. Setting IPv4 manually works, but DHCP does not.

donhector commented 2 years ago

I was also getting the network unreachable error message when trying to spin up a Libvirt VM that was configured to boot from its NIC interface (virtio). This VM uses the default BIOS firmware (not UEFI)

The NIC was configured to be attached to the host network (i.e. not attached to libvirt's default network, which uses NAT via its own virbr0 device, but to a custom libvirt network I had created that uses bridge mode via the br0 host device).

This allows the libvirt VM to receive an IP in the same range as the Libvirt host and pretty much any other devices in my home LAN (ie 192.168.1.0/24)

My home LAN router is OpenWRT, which provides DNS, DHCP (i.e. it is the one leasing those 192.168.1.0/24 IPs) and TFTP, amongst other stuff.

In the OpenWRT UI I had configured TFTP to serve directly from the URL http://boot.netboot.xyz/ipxe/netboot.xyz.kpxe (since my VM is not configured for UEFI)

image

On boot, the VM would receive an IP in the desired 192.168.1.0/24 range, and would also receive the desired TFTP URL provided by OpenWRT, but then it would fail trying to pull the actual netboot.xyz.kpxe file from the web with the dreaded "Network unreachable" error message.

Thanks to the pointers here around IPv6 being the culprit and following @xeijin suggestion, totally disabling DHCPv6 in OpenWRT for my LAN fixed the problem for me:

image

Hope that helps other people as well, as a temporary workaround until the ipxe file itself handles falling back to IPv4 when needed.

NOTE: In my setup, using https in the url did not work, I had to settle using http.

NOTE2: Your mileage might vary, but it was no big deal for me to disable IPv6 inside my Lan.
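
For routers managed from the command line, the same change can probably be made with uci instead of the UI. The sketch below only prints the commands so they can be reviewed first. It assumes the DHCP section is named `lan` and that odhcpd is the DHCPv6 server (both are OpenWrt defaults but may differ on a given router), and the function name `disable_v6_cmds` is made up for illustration.

```shell
#!/bin/sh
# Sketch only: prints the uci commands that would turn off DHCPv6 (and,
# optionally, router advertisements) for one OpenWrt DHCP section.
# The section name "lan" is an assumption; check /etc/config/dhcp first.

# disable_v6_cmds [SECTION] [with-ra] -> prints commands, changes nothing.
disable_v6_cmds() {
    section=${1:-lan}
    echo "uci set dhcp.${section}.dhcpv6='disabled'"
    if [ "${2:-}" = "with-ra" ]; then
        echo "uci set dhcp.${section}.ra='disabled'"
    fi
    echo "uci commit dhcp"
    echo "/etc/init.d/odhcpd restart"
}
```

After checking the output, it could be run on the router with something like `disable_v6_cmds lan | ssh root@192.168.1.1 sh`, where the router address is only an example. Whether router advertisements also need to be turned off is not settled in this thread, which is why that part is optional.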

steve6375 commented 1 year ago

My ISP does not support ipv6. Any fix for netboot.xyz.iso ? Seems to be magically working now! Maybe my ISP changed something?

SimonMcN commented 1 year ago

okay. I'm stumped. If I do m at boot and choose all the options it presents it works. but it doesn't work automatically ?

joshenders commented 1 year ago

This was my experience as well.

I'm using:

# netboot.xyz bootloaders generated at Thu Feb 16 00:54:02 UTC 2023
# iPXE Commit: https://github.com/ipxe/ipxe/commit/cff857461be443339aa39d614635d9a4eae8f8b2
...
f9432f382b8c507d15f7ad17a4f432e91a89d0e680fb95fa80d29cedd96d959f *netboot.xyz.iso

HP Microserver Gen10+ with Intel i350-based 1GbE directly connected to the ONT of an IPv4-only fiber ISP.

dedene commented 1 year ago

I'm observing the same with the latest version, regardless of whether IPv6 is configured on the network.

Was it working for you with that version?

joshenders commented 1 year ago

I ended up trying the .iso version written to a usb flash drive using balena etcher.

gema-arta commented 1 year ago

in

QEMU emulator version 8.0.0 (v8.0.0-12024-gd6b71850be-dirty)
Copyright (c) 2003-2022 Fabrice Bellard and the QEMU Project developers
qemu-system-x86_64 -M q35 -cpu qemu64 -m 4G -L . -nic user,model=virtio -cdrom netboot.xyz-multiarch.iso  -boot menu=on -usb

or use

qemu-system-x86_64 -M q35 -cpu qemu64 -m 4G -L . -nic user,model=virtio-net-pci-transitional -hda netboot.xyz-multiarch.img  -boot menu=on -usb

or basically

qemu-system-x86_64 -nic user,tftp=.,bootfile=http://boot.netboot.xyz/ipxe/netboot.xyz.lkrn -nographic

result

iPXE 1.21.1+ (gb00935) -- Open Source Network Boot Firmware -- https://ipxe.org
Features: DNS HTTP HTTPS iSCSI TFTP SRP VLAN AoE ELF MBOOT PXE bzImage COMBOOT Menu PXEXT
netboot.xyz - v2.x

Configuring (net0 52:54:00:12:34:56)...... ok
https://boot.netboot.xyz/menu.ipxe... Network unreachable (https://ipxe.org/280a6011)
HTTPS appears to have failed... attempting HTTP
http://boot.netboot.xyz/menu.ipxe... Network unreachable (https://ipxe.org/280a6011)
HTTP has failed, localbooting...
No bootable device.
QEMU: Terminated

show ip6 returns a link-local address that was added automatically:

image

Is this a QEMU problem?

razamatan commented 1 year ago

it's not specific to qemu. i'm still having this issue on physical/standalone pc's.

gema-arta commented 1 year ago

@razamatan

This is a general iPXE problem with IPv6/DNS name resolution and dual v4/v6 stacks. But the ipv6=off QEMU workaround works for me:

qemu-system-x86_64 -nic user,ipv6=off,tftp=.,bootfile=http://boot.netboot.xyz/ipxe/netboot.xyz.lkrn -nographic
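
To compare the two cases side by side, the `-nic` value from the command above can be built by a small helper so the same boot is repeated with QEMU's user-mode IPv6 on and off. This is only a convenience sketch; the function name `nic_arg` is made up, and the bootfile URL is the one already used in this comment.

```shell
#!/bin/sh
# Sketch only: builds the -nic argument from the command above so the same
# boot can be repeated with QEMU's user-mode IPv6 on or off.

# nic_arg on|off -> prints the value to pass to -nic.
nic_arg() {
    case "$1" in
        on|off) ;;
        *) echo "usage: nic_arg on|off" >&2; return 2 ;;
    esac
    echo "user,ipv6=$1,tftp=.,bootfile=http://boot.netboot.xyz/ipxe/netboot.xyz.lkrn"
}
```

For example, `qemu-system-x86_64 -nic "$(nic_arg off)" -nographic` reproduces the workaround, and swapping in `on` should reproduce the failure if IPv6 really is the cause.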

antonym commented 1 year ago

The issue is really upstream with iPXE, I could probably build images that were separate single stack to see if that would work better but ideally I wouldn't have to create more images.

If v6 is enabled on the network, it prefers the v6 DNS regardless of whether or not it'll work, and I haven't found a good way to shut off the stack and prefer v4 DNS if v6 is present. Trying to disable or work around it doesn't work in the script.

dedene commented 1 year ago

Is there any pointer you could give on how to build these single stack (i.e. ipv4 only) images? I’d be happy to play around with it and see if it solves the issue.

I agree ideally it should be solved in iPXE but in the meanwhile this could be a good workaround.

antonym commented 12 months ago

I made a few changes to the boot-loader in development to see if it can more gracefully handle v4/v6 hybrid scenarios so the end result is that you end up in the menu and I can hopefully avoid having to split the disks up into separate stacks. Can anyone experiencing the issue try these boot-loaders out and see if it helps any in the scenario that usually doesn't work for you? I'm not currently utilizing v6 in my environment so it's difficult to simulate.

https://s3.amazonaws.com/dev.boot.netboot.xyz/9a04a0741e078c770c54c99bce5fdfb960507e98/index.html

If it sees netX/dns6 set from DHCP, it'll attempt v6 connectivity first. If that fails, it'll clear out the netX/dns6 variable and attempt another reload. If netX/dns6 is not set, it should abandon trying to use the v6 stack and fall back to booting over v4.

Actual changes are:

https://github.com/netbootxyz/netboot.xyz/commit/5e9d77fefa22931acef808280ca48c16d040e3e5
https://github.com/netbootxyz/netboot.xyz/commit/9a04a0741e078c770c54c99bce5fdfb960507e98
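
The order of decisions in the fallback described in this comment can be written down as a toy model. This is not the netboot.xyz iPXE script from the commits above, just a shell sketch of the described flow; the function name `next_step` and the action labels are made up for illustration.

```shell
#!/bin/sh
# Toy model only, not the netboot.xyz iPXE script: it mirrors the order of
# decisions described in the comment so the fallback can be reasoned about.

# next_step DNS6_SET V6_OK -> prints the action the boot-loader would take.
#   DNS6_SET: yes if DHCP populated netX/dns6, anything else otherwise
#   V6_OK:    yes if the IPv6 attempt succeeded, anything else otherwise
next_step() {
    if [ "$1" = yes ] && [ "$2" = yes ]; then
        echo boot-over-v6
    elif [ "$1" = yes ]; then
        echo clear-dns6-and-reload
    else
        echo boot-over-v4
    fi
}
```

In this model a network that advertises IPv6 DNS but cannot route IPv6 goes through `clear-dns6-and-reload` once, and the reload, now without netX/dns6, ends at `boot-over-v4`.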

dedene commented 12 months ago

@antonym Awesome, thx for having a look at this! I can confirm it is an improvement: with the new images it automatically boots into the Netboot.xyz menu without manually setting the network config.

However after selecting an image (for installation or live cd) I still get the same Network Unreachable error. I made a small screen recording to show: https://cln.sh/97kZdn2WNTqcM50ZCkMp You can see when booting in the menu it says "attempting IPv4..." but then after selecting a distro it fails again probably due to ipv6?

antonym commented 12 months ago

It looks like it's failing on https but passing on http. The menu can be pulled from either, but anything pulled from Github is going to be https so it may be worth digging into why it's unable to load https. Date/time on machine might be one issue.

dedene commented 11 months ago

@antonym Sorry for the late reply. Indeed it's failing on https. But when pressing 'm' and just hitting enter for manual network config, https seems to work just fine. Also, NTP is running on the host, so I'm quite sure the date/time is configured correctly for the VM.

Any ideas why the https could be failing for the automatic flow, but working using the failsafe menu and manual (ipv4) network configuration?

ManuLinares commented 1 month ago

There hasn't been any news on this. I have all IPv6/DHCPv6 disabled on my network, but when I start my machine I always have to press "m" to enter failsafe mode and accept the defaults, and then it works. Automatically it doesn't.

Any fix for this? Any workaround?

nrvale0 commented 4 days ago

I guess I'm not adding much here except to say I'm seeing the exact same behavior as @ManuLinares et al.