SerenityOS / serenity

The Serenity Operating System 🐞
https://serenityos.org
BSD 2-Clause "Simplified" License
30.38k stars 3.18k forks source link

DHCPClient: Doesn't work for class C networks #4412

Open danboid opened 3 years ago

danboid commented 3 years ago

I had my first successful bare metal Serenity boot in some time today and one of the main things I wanted to test was to see if I could get networking to work. No such luck unfortunately.

Yes, the machine I was testing on had a Realtek 8139 NIC in it and I could see in the boot log that it had detected it OK but DHCPClient hadn't fetched / assigned it a valid address according to ifconfig so networking didn't work. The NIC was attached to a class C network but serenity assigned it a class A address.

Also, if you run DHCPClient from the terminal as root you get the error in the attached screenshot.

dhcp

I presume that means it is only intended to be run at startup?

linusg commented 3 years ago

Also, if you run DHCPClient from the terminal as root you get the error in the attached screenshot.

That doesn't look like a root shell in your screenshot :thinking:

I presume that means it is only intended to be run at startup?

It's just non-existent error handling. Basically: something fails, and it doesn't like that and decides to crash :smile: Not sure what the error is though.


cc @alimpfard

danboid commented 3 years ago

Ah yes, you're right @linusg but you get the exact same error if you run it as root.

danboid commented 3 years ago

Its also worth mentioning that Browser freeze crashed when I clicked on a link - prob because networking wasn't actually working but serenity thought it was?

alimpfard commented 3 years ago

Not sure what the error is though

It's trying to bind to udp port 68 to listen to the incoming DHCP responses, and that's somehow failing. as anon, this is expected (68 is a privileged port) but it shouldn't fail like this for root.

DHCPClient should log in the debug console if it gets an IP (or it fails to negotiate one), is there anything like that happening?

danboid commented 3 years ago

Debug console on a bare metal boot?

alimpfard commented 3 years ago

serial port? if that's not accessible...ouch. what's the "assigned" IP? 10.0.2.15?

danboid commented 3 years ago

Unfortunately I won't be able to check anything for you until I'm next back in work which could be a few weeks off now. If I go into work again this week I'll do whatever testing is required.

alimpfard commented 3 years ago

Time is perhaps the only thing we have a lot of :)

danboid commented 3 years ago

@awesomekling does all the debug console stuff get output to the serial port by default, after booting?

danboid commented 3 years ago

@deoxxa - you've had DHCP working bare metal have you? On a class C network? Are there any catches?

danboid commented 3 years ago

DHCP works on the same machine / network under both Linux and NetBSD.

danboid commented 3 years ago

Here's the output when booting with serial debug:

[Kernel]: CPU[0]: Supported features: nx pae pge sse tsc sep mmx sse2 sse3
[Kernel]: CPU[0]: No RDRAND support detected, randomness will be poor
[Kernel]: Installing Unhandled Handlers
[Kernel]: MM: boot_pdpt @ P0x00109000
[Kernel]: MM: boot_pd0 @ P0x0010a000
[Kernel]: MM: boot_pd3 @ P0x0010b000
[Kernel]: MM: kernel range: P0x00100000 - P0x00251000
[Kernel]: MM: kmalloc range: P0x00251000 - P0x00652000
[Kernel]: MM: Multiboot mmap: base_addr = 0x00000000, length = 0x0009f800, type = 0x1
[Kernel]: MM: Multiboot mmap: base_addr = 0x0009f800, length = 0x00000800, type = 0x2
[Kernel]: MM: Multiboot mmap: base_addr = 0x000f0000, length = 0x00010000, type = 0x2
[Kernel]: MM: Multiboot mmap: base_addr = 0x00100000, length = 0x3eef0000, type = 0x1
[Kernel]: MM: Multiboot mmap: base_addr = 0x3eff0000, length = 0x00003000, type = 0x4
[Kernel]: MM: Multiboot mmap: base_addr = 0x3eff3000, length = 0x0000d000, type = 0x3
[Kernel]: MM: Multiboot mmap: base_addr = 0xfec00000, length = 0x01400000, type = 0x2
[Kernel]: Super physical region: P0x00700000 - P0x007ff000
[Kernel]: User physical region: P0x00800000 - P0x3eff0000
[Kernel]: Early access to ACPI tables for interrupt setup
[Kernel]: Interrupts: Switch to Legacy PIC mode
[Kernel]: PIC(i8259): cascading mode, vectors 0x50-0x5f
[Kernel]: Interrupts: Detected Dual i8259
[Kernel]: ACPI: Using RSDP @ P0x000f6cf0
[Kernel]: ACPI: Main Description Table valid? true
[Kernel]: ACPI: Using RSDT, Enumerating tables @ P0x3eff3040
[Kernel]: ACPI: RSDT Revision 1, Total length - 44
[Kernel]: ACPI: Initializing Fixed ACPI data
[Kernel]: ACPI: Searching for the Fixed ACPI Data Table
[Kernel]: ACPI: Fixed ACPI data, Revision 1, Length 116 bytes
[Kernel]: ACPI: DSDT P0x3eff3180
[Kernel]: ACPI: Dynamic Parsing Enabled, Can parse AML
[Kernel]: I8042: Dual channel controller
[Kernel]: PS2MouseDevice: Mouse wheel enabled!
[Kernel]: Starting SerenityOS...
[Kernel]: RTC: Year: 2020, month: 1, day: 2, hour: 2, minute: 16, second: 40
[Kernel]: ACPI: CMOS RTC Present
[Kernel]: PIT: 1000 Hz, square wave (4a9)
[Kernel]: RTC: Set rate to 6
[Kernel]: RTC: Set frequency to 1024 Hz
[Kernel]: VC: Switch to 0 (0xc0397570)
[colonel(0:0)]: Scheduler[0]: idle loop running
[init_stage2(1:1)]: Testing PCI via manual probing... 
[init_stage2(1:1)]: PCI IO Supported!
[init_stage2(1:1)]: PCI: Using I/O instructions for PCI configuration space access
[init_stage2(1:1)]: PCI [0000:00:00.00] (8086:2570)
[init_stage2(1:1)]: PCI [0000:00:02.00] (8086:2572)
[init_stage2(1:1)]: PCI [0000:00:1d.00] (8086:24d2)
[init_stage2(1:1)]: PCI [0000:00:1d.01] (8086:24d4)
[init_stage2(1:1)]: PCI [0000:00:1d.02] (8086:24d7)
[init_stage2(1:1)]: PCI [0000:00:1d.03] (8086:24de)
[init_stage2(1:1)]: PCI [0000:00:1d.07] (8086:24dd)
[SyncTask(2:2)]: SyncTask is running
[init_stage2(1:1)]: PCI [0000:00:1e.00] (8086:244e)
[init_stage2(1:1)]: PCI [0000:01:00.00] (1106:3044)
[init_stage2(1:1)]: PCI [0000:01:03.00] (10ec:8139)
[init_stage2(1:1)]: PCI [0000:00:1f.00] (8086:24d0)
[init_stage2(1:1)]: PCI [0000:00:1f.02] (8086:24d1)
[init_stage2(1:1)]: PCI [0000:00:1f.03] (8086:24d3)
[init_stage2(1:1)]: PCI [0000:00:1f.05] (8086:24d5)
[init_stage2(1:1)]: MBVGADevice address=P0xf0000000, pitch=5120, width=1280, height=1024
[init_stage2(1:1)]: RTL8139: Found @ PCI [0000:01:03.00]
[init_stage2(1:1)]: RTL8139: port base: IO a400
[init_stage2(1:1)]: RTL8139: Interrupt line: 5
[init_stage2(1:1)]: RTL8139: RX buffer: P0x00700000
[init_stage2(1:1)]: RTL8139: TX buffer 0: P0x0070a000
[init_stage2(1:1)]: RTL8139: TX buffer 1: P0x0070b000
[init_stage2(1:1)]: RTL8139: TX buffer 2: P0x0070c000
[init_stage2(1:1)]: RTL8139: TX buffer 3: P0x0070d000
[init_stage2(1:1)]: RTL8139: MAC address: 00:e0:4c:dc:be:25
[init_stage2(1:1)]: Syscall: int 0x82 handler installed
[init_stage2(1:1)]: SB16: sb not ready
[init_stage2(1:1)]: PATAChannel: PATA Controller found, ID (8086:24d1)
[init_stage2(1:1)]: PATAChannel: Bus master IDE: IO f000
[init_stage2(1:1)]: PATAChannel: Name=HDS728080PLA380, C/H/Spt=255/16/63
[init_stage2(1:1)]: MBRPartitionTable::initialize: mbr_signature=0xaa55
[init_stage2(1:1)]: MBR Partitioned Storage Detected!
[init_stage2(1:1)]: MBRPartitionTable::partition: status=0x80 offset=0x3e
[init_stage2(1:1)]: MBRPartitionTable::partition: found partition index=1 type=83
[init_stage2(1:1)]: VFS: mounted root from Ext2FS (device:100,0 (DiskPartition))
[init_stage2(1:1)]: Loading kernel symbol table...
[init_stage2(1:1)]: exec(/bin/SystemServer): Using program interpreter /usr/lib/Loader.so
[NetworkTask(5:5)]: NetworkTask: RTL8139NetworkAdapter network adapter found: hw=00:e0:4c:dc:be:25 address=10.0.2.15 netmask=255.255.255.0 gateway=10.0.2.2
[NetworkTask(5:5)]: NetworkTask: LoopbackAdapter network adapter found: hw=13:55:02:09:55:aa address=127.0.0.1 netmask=255.0.0.0 gateway=0.0.0.0
[NetworkTask(5:5)]: NetworkTask: Enter main loop.
[SystemServer(6:6)]: exec(/bin/mount): Using program interpreter /usr/lib/Loader.so
[mount(6:6)]: mount bind: source fd 4 @ /dev
[mount(6:6)]: VFS: Bind-mounting /dev at /dev
[mount(6:6)]: mount bind: source fd 5 @ /bin
[mount(6:6)]: VFS: Bind-mounting /bin at /bin
[mount(6:6)]: mount bind: source fd 6 @ /etc
[mount(6:6)]: VFS: Bind-mounting /etc at /etc
[mount(6:6)]: mount bind: source fd 7 @ /home
[mount(6:6)]: VFS: Bind-mounting /home at /home
[mount(6:6)]: mount bind: source fd 8 @ /root
[mount(6:6)]: VFS: Bind-mounting /root at /root
[mount(6:6)]: mount bind: source fd 9 @ /var
[mount(6:6)]: VFS: Bind-mounting /var at /var
[mount(6:6)]: mount proc @ /proc
[mount(6:6)]: VFS: Mounting ProcFS at /proc (inode: 1:25344) with flags 4
[mount(6:6)]: mount: successfully mounted /proc
[mount(6:6)]: mount devpts @ /dev/pts
[mount(6:6)]: VFS: Mounting DevPtsFS at /dev/pts (inode: 1:25351) with flags 22
[mount(6:6)]: mount: successfully mounted /dev/pts
[mount(6:6)]: mount tmp @ /tmp
[mount(6:6)]: VFS: Mounting TmpFS at /tmp (inode: 1:25346) with flags 5
[mount(6:6)]: mount: successfully mounted /tmp
[SystemServer(9:9)]: exec(/bin/ResourceGraph.MenuApplet): Using program interpreter /usr/lib/Loader.so
[SystemServer(8:8)]: exec(/bin/Clock.MenuApplet): Using program interpreter /usr/lib/Loader.so
[SystemServer(7:7)]: exec(/bin/AudioServer): Using program interpreter /usr/lib/Loader.so
[SystemServer(12:12)]: exec(/bin/FileManager): Using program interpreter /usr/lib/Loader.so
[SystemServer(18:18)]: exec(/bin/CrashDaemon): Using program interpreter /usr/lib/Loader.so
[SystemServer(11:11)]: exec(/bin/Taskbar): Using program interpreter /usr/lib/Loader.so
[SystemServer(16:16)]: exec(/bin/DHCPClient): Using program interpreter /usr/lib/Loader.so
[SystemServer(19:19)]: exec(/bin/SystemMenu): Using program interpreter /usr/lib/Loader.so
[SystemServer(21:21)]: exec(/bin/Terminal): Using program interpreter /usr/lib/Loader.so
[SystemServer(15:15)]: exec(/bin/WindowServer): Using program interpreter /usr/lib/Loader.so
[SystemServer(20:20)]: exec(/bin/Audio.MenuApplet): Using program interpreter /usr/lib/Loader.so
[SystemServer(10:10)]: exec(/bin/UserName.MenuApplet): Using program interpreter /usr/lib/Loader.so
[SystemServer(17:17)]: exec(/bin/Clipboard): Using program interpreter /usr/lib/Loader.so
[SystemServer(14:14)]: exec(/bin/ClipboardHistory.MenuApplet): Using program interpreter /usr/lib/Loader.so
[WindowServer(15:15)]: MBVGADevice: mmap with size 5242880 at V0x027c9000
[SystemServer(13:13)]: exec(/bin/ResourceGraph.MenuApplet): Using program interpreter /usr/lib/Loader.so
[Terminal(23:23)]: exec(/bin/sh): Using program interpreter /usr/lib/Loader.so
[Terminal(24:24)]: exec(/bin/utmpupdate): Using program interpreter /usr/lib/Loader.so
[SyncTask(2:2)]: Ext2FS: Flushed 4 blocks to disk
danboid commented 3 years ago

Is this actually a bug with NetworkTask rather than DHCPClient? @alimpfard ?

alimpfard commented 3 years ago

No further logs? that seems odd it should go something like this if it works:

[DHCPClient(16:16)]: sending as udp packet from 10.0.2.15:34537 to 255.255.255.255:67!
DHCPClient(16): Received 548 bytes
DHCPClient(16): We were offered 10.0.2.15 for 2152792320
DHCPClient(16): Leasing the IP 10.0.2.15 for adapter e1k0
[DHCPClient(16:16)]: sending as udp packet from 10.0.2.15:44768 to 255.255.255.255:67!
DHCPClient(16): Received 548 bytes
danboid commented 3 years ago

Yep, that's all I got after booting.

alimpfard commented 3 years ago

I do see that SystemServer starts DHCPClient, could you see what it says with this patch?

diff --git a/Services/DHCPClient/DHCPv4Client.cpp b/Services/DHCPClient/DHCPv4Client.cpp
index 855fe6846..702fe53ee 100644
--- a/Services/DHCPClient/DHCPv4Client.cpp
+++ b/Services/DHCPClient/DHCPv4Client.cpp
@@ -32,7 +32,7 @@
 #include <LibCore/Timer.h>
 #include <stdio.h>

-//#define DHCPV4CLIENT_DEBUG
+#define DHCPV4CLIENT_DEBUG

 static void send(const InterfaceDescriptor& iface, const DHCPv4Packet& packet, Core::Object*)
 {
diff --git a/Services/DHCPClient/main.cpp b/Services/DHCPClient/main.cpp
index 33e7d7e2e..34c8d9383 100644
--- a/Services/DHCPClient/main.cpp
+++ b/Services/DHCPClient/main.cpp
@@ -91,6 +91,10 @@ int main(int argc, char** argv)
         ifnames.append({ name, mac_from_string(mac) });
     });

+    dbgln("Interfaces:");
+    for (auto& ifname : ifnames)
+        dbgln("- {}", ifname.m_ifname);
+
     auto client = DHCPv4Client::construct(move(ifnames));

     if (pledge("stdio inet", nullptr) < 0) {
danboid commented 3 years ago

Will test this ASAP but my previous disclaimer still applies.

I need to get my hands on some compat hardware that I can keep at home.

danboid commented 3 years ago

I should get to test this again on Thursday.

danboid commented 3 years ago

I had another quick go at this today with the latest but the issue persists.

What is notable in this screenshot is the NetworkTask ARP table entry as it reads 146.87.x.x - this is an address in the range that the IP address should've been assigned from instead of 10.0.x.x. Also, note there is an extra 0 appended to the device name of the NIC so it reads rtl81390 instead of rtl8139.

@alimpfard @deoxxa

IMG_20210401_143904

alimpfard commented 3 years ago

I guess we should make the dhcp client log to stderr too, otherwise there's no reasonable way to debug it in cases like this 🤔

gunnarbeutner commented 3 years ago

Just a hunch, but does your DHCP server send responses to the network broadcast address (e.g. 192.168.3.255) rather than the global broadcast address (255.255.255.255)? Serenity doesn't seem to think that the former are directed at it:

[#0 NetworkTask(7:7)]: handle_udp: this packet is not for me, it's for 192.168.3.255

Maybe try running tcpdumpto see what your DHCP server sends.

gunnarbeutner commented 3 years ago

Actually, I suspect this will be fixed by #6669. :partying_face:

danboid commented 3 years ago

Thanks @gunnarbeutner , hopefully!

I will test this out as soon as I can.