Closed rkwesk closed 5 years ago
Richard, AFAIK we can't use network manager's "connection sharing" feature, as it starts a DHCP server that doesn't offer a "boot filename". That's why in LTSP5 we provided an /etc/network/if-up.d/sch-scripts file that set up IP forwarding.
So, AFAIK this is still necessary, right? Or did you find a way around it where the internal NIC clients actually netboot and also have Internet access?
Not yet, Alkis. The client not getting an IP is my stumbling block now. Richard
On Saturday, August 24, 2019, 6:31:09 PM UTC, Alkis Georgopoulos <notifications@github.com> wrote:
> Richard, AFAIK we can't use network manager's "connection sharing" feature, as it starts a DHCP server that doesn't offer a "boot filename". That's why in LTSP5 we provided an /etc/network/if-up.d/sch-scripts file that set up IP forwarding.
> So, AFAIK this is still necessary, right? Or did you find a way around it where the internal NIC clients actually netboot and also have Internet access?
I changed the LAN IP with network manager, but the client still cannot get an IP.
I even reran ltsp initrd and then rebooted the server, but the log still shows that the client cannot get an IP.
root@buster64dualnicltsp19:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp1s5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:1a:4d:25:fc:65 brd ff:ff:ff:ff:ff:ff
    inet 192.198.67.1/24 brd 192.198.67.255 scope global noprefixroute enp1s5
       valid_lft forever preferred_lft forever
    inet6 fe80::9a15:7128:7759:3581/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: enp1s2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 1000
    link/ether 00:01:02:05:e7:43 brd ff:ff:ff:ff:ff:ff
    inet 10.72.251.201/24 brd 10.72.251.255 scope global dynamic noprefixroute enp1s2
       valid_lft 1813991sec preferred_lft 1813991sec
    inet6 2a02:587:2d24:1900:81b7:1ea7:c531:bbf7/64 scope global dynamic noprefixroute
       valid_lft 58855sec preferred_lft 58855sec
    inet6 fe80::11bd:1d2f:bf11:d9cf/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
and
Aug 24 22:45:32 buster64dualnicltsp19 dnsmasq-dhcp[458]: no address range available for DHCP request via enp1s5
Aug 24 22:45:34 buster64dualnicltsp19 dnsmasq-dhcp[458]: no address range available for DHCP request via enp1s5
Aug 24 22:45:38 buster64dualnicltsp19 dnsmasq-dhcp[458]: no address range available for DHCP request via enp1s5
Aug 24 22:45:46 buster64dualnicltsp19 dnsmasq-dhcp[458]: no address range available for DHCP request via enp1s5
Richard
Sorry, I put this in the wrong issue; I have rewritten it in issue 12.
However, in answer to your remark here in issue 11, I want to say that with LTSP5 the two-NIC scenario works in Buster without using sch-scripts.
If you say that network manager cannot share a connection with other computers in LTSP19, what about manually setting IPForward in systemd and manually adding iptables rules? How else will fat clients get to the Internet from another LAN?
Richard
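For reference, the manual approach asked about above usually comes down to two steps: enable IPv4 forwarding in the kernel, then masquerade LAN traffic that leaves through the WAN interface. A minimal sketch, assuming enp1s2 is the WAN NIC (as in this thread) and that the internal subnet is 192.168.67.0/24. It is not what sch-scripts or LTSP ships, and nothing here persists across reboots:

```shell
#!/bin/sh
# Sketch only: enable forwarding and NAT for the internal LTSP subnet.
# LAN_NET and WAN_IF are assumptions; adjust them to your setup.
LAN_NET="192.168.67.0/24"
WAN_IF="enp1s2"

enable_nat() {
    # Let the kernel route packets between the two NICs.
    sysctl -w net.ipv4.ip_forward=1
    # Rewrite the source address of LAN traffic leaving via the WAN NIC.
    iptables -t nat -A POSTROUTING -s "$LAN_NET" -o "$WAN_IF" -j MASQUERADE
}

# Run as root: enable_nat
```

The function is only defined, not called, so sourcing the file changes nothing until `enable_nat` is invoked as root.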
Nothing changed from LTSP5 to LTSP19 wrt NAT etc. So if you think you had it working in LTSP5, it should work the same way in LTSP19.
What I'm saying is that AFAIK the method you describe for LTSP5 with network manager, without sch-scripts etc, shouldn't work, as network manager doesn't send a boot filename for the clients to boot with. So some combination of the things you did might have made it work, but not just network manager.
The relevant steps I followed specifically for the two nic scenario are:
Steps 1 through 5 and then 6(dual) through 9(dual) and finally 12 through 20 on https://wiki.debian.org/LTSP/Howto
I tested the server/client scenario 4 times and it worked 3 times. The one time it did not work, the client could ping an external numerical IP address but not a named URL, so DNS was not working. The workaround was to restart network manager, which allowed DNS to work. My guess is a race condition: that time systemd (which defaults to IP forwarding disabled) was somehow activated after network manager, whereas on the other runs network manager was activated afterwards.
The scenario is broken down into so many steps to ensure that a novice user can accomplish it. However, if you want to look, I think you will agree that nothing except network manager is dealing with IP forwarding and NAT iptables rules.
Yes, network manager does not send a boot filename for the clients to boot with. NBD and NFS do that.
Richard
Richard, I mean this:
In dual NIC mode there's no proxyDHCP. The LTSP server provides the clients with an IP address and a boot filename. The boot filename in LTSP5 was "pxelinux.0", now it's more complicated as it involves 3 different files for BIOS/UEFI/IPXE.
The boot filename is not related to NFS or NBD. It's the first thing that the clients get via TFTP when they boot. The LTSP file that sends that filename is /etc/dnsmasq.d/ltsp-server-dnsmasq.conf in LTSP5, and /etc/dnsmasq.d/ltsp-dnsmasq.conf in newer versions.
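To make the term concrete, here is a minimal sketch of a dnsmasq fragment that hands out both an IP lease and a boot filename. It is not the content of the LTSP-generated file. The lease range and the TFTP root path are assumptions chosen for illustration, and pxelinux.0 is the LTSP5-era filename mentioned above:

```shell
#!/bin/sh
# Sketch only: print a minimal dnsmasq fragment with a lease range and a
# boot filename. This is NOT the file that LTSP generates.
print_boot_conf() {
    cat <<'EOF'
# Real DHCP: lease addresses on the internal subnet (range is an assumption).
dhcp-range=192.168.67.20,192.168.67.250,8h
# The "boot filename" that the client then fetches via TFTP.
dhcp-boot=pxelinux.0
# Serve that file with dnsmasq's built-in TFTP server (path is an assumption).
enable-tftp
tftp-root=/srv/tftp
EOF
}

print_boot_conf
```

A DHCP server that sends only the equivalent of the `dhcp-range` line still leases addresses, but a PXE client then has no file to request.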
Now to the problem. When you enable connection sharing in network manager, you instruct network manager to run a child dnsmasq process of its own! This process then is yet another DHCP server, which gives IP addresses to the clients, but without any special dnsmasq.conf configuration to give them the pxelinux.0 etc filename. This is why the clients sometimes fail to boot in your tests.
So why is it working most of the time? Because that child dnsmasq fails to run, as the LTSP dnsmasq is already running on that interface.
That means that using network manager connection sharing isn't an appropriate solution for LTSP. It only works if LTSP dnsmasq is started before network-manager dnsmasq (i.e. race condition), and even then, it only works because network-manager has a bug and doesn't check if its child dnsmasq started successfully or not.
Network manager connection sharing could only be used for LTSP if it had an option to "not run dnsmasq as the user already runs dnsmasq there"; but I don't think it has such an option.
If we agree that we should avoid relying on network manager and its additional dnsmasq service for connection sharing, since we already have one dnsmasq running that may conflict with the other one, then I think we can move on and file a new issue and start the implementation of NAT inside the new LTSP code base?
I.e. ltsp dnsmasq --real-dhcp=1 (which is the default) would symlink a script in /etc/NetworkManager/dispatcher.d/ that would do NAT on 192.168.67.1, similar to what sch-scripts do.
Unfortunately this will only work when users use network manager and not e.g. ifupdown (/etc/network/interfaces), but ifupdown is getting deprecated and systemd-networkd doesn't (yet?) provide a method to run code when a network connection is established.
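A rough sketch of what such a dispatcher hook could look like, to show the mechanism only. NetworkManager calls scripts in dispatcher.d with the interface name as the first argument and the action (for example "up" or "down") as the second. The function names and the placeholder message are made up for illustration, and this is not the script that LTSP eventually shipped:

```shell
#!/bin/sh
# Hypothetical /etc/NetworkManager/dispatcher.d/ hook, not the LTSP one.
# NetworkManager passes the interface name as $1 and the action as $2.
LTSP_IP="192.168.67.1"

# has_ltsp_ip IFACE: succeed if IFACE currently holds LTSP_IP.
has_ltsp_ip() {
    ip -4 -o addr show dev "$1" | grep -qF " $LTSP_IP/"
}

on_event() {
    iface=$1
    action=$2
    # Only react when a connection comes up.
    [ "$action" = "up" ] || return 0
    # Ignore every interface except the internal LTSP one.
    has_ltsp_ip "$iface" || return 0
    echo "internal NIC $iface is up with $LTSP_IP; NAT setup would go here"
}

on_event "$@"
```

Keying on the address rather than on a fixed interface name means the hook does nothing on single-NIC setups.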
I filed issue #13 about this. Richard, I'll close this issue now as I think it's resolved, but we can still chat in it even if it's closed, and of course you can reopen it if you think it still needs resolution.
Just to clarify: using LTSP5 on a server with two NICs, dnsmasq.conf was running dhcp-proxy on the WAN but proper DHCP on the LAN. In contrast, using LTSP19 on a server with two NICs, dnsmasq does not run dhcp-proxy on the WAN?
AFAIK in LTSP5 we looked at /etc/NetworkManager/NetworkManager.conf to see whether there was a line with the dns= key. If there was, we commented it out, which was enough because the default was for network manager not to use dnsmasq.basic. Now I see you saying network manager will use dnsmasq.basic anyway.
> Just to clarify: using LTSP5 on a server with two NICs, dnsmasq.conf was running dhcp-proxy on the WAN but proper DHCP on the LAN. In contrast, using LTSP19 on a server with two NICs, dnsmasq does not run dhcp-proxy on the WAN?
From man ltsp dnsmasq:
-p, --proxy-dhcp=0|1
    Enable or disable the proxy DHCP service. Defaults to 1. Proxy DHCP means that the LTSP
    server sends the boot filename, but it leaves the IP leasing to an external DHCP
    server, for example a router or pfsense or a Windows DHCP server. It's the easiest way
    to set up LTSP, as it only requires a single NIC with no static IP, no need to rewire
    switches etc.
-r, --real-dhcp=0|1
    Enable or disable the real DHCP service. Defaults to 1. In dual NIC setups, you only
    need to configure the internal NIC to a static IP of 192.168.67.1; LTSP will try to
    autodetect everything else. The real DHCP service doesn't take effect if your IP isn't
    192.168.67.x, so there's no need to disable it in single NIC setups unless you want to
    run isc-dhcp-server on the LTSP server.
I.e. both of them are enabled by default, and there are options to disable them.
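Since the real DHCP service only takes effect on a 192.168.67.x address, a "no address range available" message is worth checking against the address actually set on the internal NIC. The `ip a` output posted earlier in this thread shows 192.198.67.1 on enp1s5, which is outside that range and may be what the log is complaining about. A small sketch of such a check; the function name is made up for illustration and it only does a prefix match on the address text:

```shell
#!/bin/sh
# in_ltsp_range ADDRESS: succeed only if ADDRESS starts with "192.168.67.".
in_ltsp_range() {
    case "$1" in
        192.168.67.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Compare the expected address with the one from the posted `ip a` output.
for addr in 192.168.67.1 192.198.67.1; do
    if in_ltsp_range "$addr"; then
        echo "$addr: inside 192.168.67.x"
    else
        echo "$addr: outside 192.168.67.x"
    fi
done
```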
> AFAIK in LTSP5 we looked at /etc/NetworkManager/NetworkManager.conf to see whether there was a line with the dns= key. If there was, we commented it out, which was enough because the default was for network manager not to use dnsmasq.basic. Now I see you saying network manager will use dnsmasq.basic anyway.
There are 3 dnsmasq instances involved:
I was talking about "2" conflicting with "1" there, which we don't want.
Just for clarity, these are the command lines of "1" and "2" as they show up in ps faux | grep dnsmasq. E.g. in the current Debian how-to, when "1" runs first, things are almost OK, but when "2" runs first, clients won't be able to boot. We don't care about "3", Ubuntu stopped shipping that.
This is the main dnsmasq that LTSP configures, "1":
dnsmasq 1340 0.0 0.0 18012 340 ? S 23:17 0:00 /usr/sbin/dnsmasq -x /run/dnsmasq/dnsmasq.pid -u dnsmasq -7 /etc/dnsmasq.d,.dpkg-dist,.dpkg-old,.dpkg-new --local-service --trust-anchor=.,20326,8,2,e06d44b80b8f1d39a95c0b0d7c65d08458e880409bbc683457104237c7f8ec8d
This is the auxiliary dnsmasq for connection sharing, "2", that conflicts with the main one:
nobody 1236 0.0 0.1 18124 3848 ? S 23:13 0:00 /usr/sbin/dnsmasq --conf-file=/dev/null --no-hosts --keep-in-foreground --bind-interfaces --except-interface=lo --clear-on-reload --strict-order --listen-address=192.168.67.1 --dhcp-range=192.168.67.10,192.168.67.254,60m --dhcp-lease-max=50 --dhcp-leasefile=/var/lib/NetworkManager/dnsmasq-enp0s3.leases --pid-file=/run/nm-dnsmasq-enp0s3.pid --conf-dir=/etc/NetworkManager/dnsmasq-shared.d
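The two command lines above differ in ways that are easy to test for. The sketch below flags a dnsmasq process as the connection-sharing instance when its arguments contain the NetworkManager paths visible in the second command line. The function names are made up, and the match relies only on those two paths, which may differ between NetworkManager versions:

```shell
#!/bin/sh
# is_nm_shared_dnsmasq CMDLINE: succeed if the command line looks like the
# dnsmasq that NetworkManager spawns for connection sharing, judged only by
# the paths visible in the ps output above.
is_nm_shared_dnsmasq() {
    case "$1" in
        *--conf-dir=/etc/NetworkManager/dnsmasq-shared.d*) return 0 ;;
        *--pid-file=/run/nm-dnsmasq-*) return 0 ;;
        *) return 1 ;;
    esac
}

# flag_dnsmasq: read "PID COMMAND..." lines on stdin and label each dnsmasq.
flag_dnsmasq() {
    while read -r pid args; do
        case "$args" in
            *dnsmasq*) ;;
            *) continue ;;
        esac
        if is_nm_shared_dnsmasq "$args"; then
            echo "$pid: NetworkManager connection-sharing dnsmasq"
        else
            echo "$pid: other dnsmasq"
        fi
    done
}

# Usage: ps -e -o pid= -o args= | flag_dnsmasq
```

If the connection-sharing instance shows up in that listing, two DHCP servers are competing for the internal interface.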
Older server with bios and two nics, enp1s2 for wan and enp1s5 for lan.
Clean install Debian Buster 64 bit. Debian installer expert install basic system only. Reboot and install xorg mate wget lightdm network-manager-gnome firefox-esr rsync. Comment out lines in /etc/network/interfaces of ethernet interfaces so that network manager controls both wired connections. Reboot again.
From previous ltsp5 experience I thought I needed to: Use nm-connection-editor to change enp1s5 to share with other computers and static ip 192.168.67.1
However, this caused an error later:
Follow strictly the install steps
Now to the issue:
As already discussed I edit line 33:
But this time I ran up against a new error:
I solved this by going back to nm-connection-editor and changing the lan connection to manual.
Then I reboot the server and dnsmasq.service starts without error.
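The same change can also be made from the command line instead of nm-connection-editor. A minimal sketch, assuming a NetworkManager connection profile for the LAN NIC; the connection name in the usage line is a placeholder and 192.168.67.1/24 is the address LTSP expects on the internal NIC:

```shell
#!/bin/sh
# Sketch only: give the internal NIC a static 192.168.67.1 without
# NetworkManager's "shared to other computers" mode.
# The connection name is a placeholder; list yours with `nmcli connection show`.
set_static_lan() {
    conn=$1
    # "manual" means a plain static address, so no NetworkManager-run dnsmasq.
    nmcli connection modify "$conn" ipv4.method manual ipv4.addresses 192.168.67.1/24
    # Re-activate the connection so the new settings take effect.
    nmcli connection up "$conn"
}

# Run as root, for example: set_static_lan "Wired connection 2"
```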
The rest of the steps for the install do not report any error.
Richard