GNS3 / gns3-server

GNS3 server
GNU General Public License v3.0

DHCP client in Docker VM fails if DHCP server doesn't respond during first 3 discovers #2357

Closed b-ehlers closed 4 months ago

b-ehlers commented 4 months ago

Describe the bug

Since PR #2355 was merged, the DHCP server must respond within the first 3 discovers in a Docker VM; otherwise the DHCP client fails. This was not the case before that PR was merged.

GNS3 version and operating system (please complete the following information):

To Reproduce

Steps to reproduce the behavior:

  1. Add a Docker VM and connect it to the NAT cloud
  2. Suspend the link to the NAT cloud
  3. Start the Docker VM and connect to the console
  4. The console shows that the udhcpc process fails after 3 discovers:
alpine-1 console is now available... Press RETURN to get started.
udhcpc: started, v1.35.0
udhcpc: broadcasting discover
udhcpc: broadcasting discover
udhcpc: broadcasting discover
udhcpc: no lease, failing
/ # ps
PID   USER     TIME  COMMAND
    1 root      0:00 /bin/sh
   19 root      0:00 /gns3/bin/busybox sh -c while true; do TERM=vt100 /gns3/bin/busybox sh; done
   34 root      0:00 /gns3/bin/busybox sh
   37 root      0:00 ps

Screenshots or videos project

Additional context

The reason is that PR #2355 doesn't copy the whole resources directory hierarchy; only the contents of the top-level directory are copied (and the busybox program is installed).

behlers@iMac:~$ tree ~/GNS3/venv/lib/python3.11/site-packages/gns3server/compute/docker/resources
/home/behlers/GNS3/venv/lib/python3.11/site-packages/gns3server/compute/docker/resources
├── bin
│   └── udhcpc
├── etc
│   └── udhcpc
│       └── default.script
├── init.sh
└── run-cmd.sh

4 directories, 4 files
behlers@iMac:~$ tree ~/.local/share/GNS3/docker/resources
/home/behlers/.local/share/GNS3/docker/resources
├── bin
│   └── busybox
├── etc
├── init.sh
└── run-cmd.sh

3 directories, 3 files

The files bin/udhcpc and etc/udhcpc/default.script are missing.
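
The difference between the two trees above can be illustrated with a minimal Python sketch. It is not the code from PR #2355 or from its fix; the function names `copy_top_level_only` and `copy_recursively` are hypothetical, and the shallow variant is only an assumption that reproduces the observed layout (top-level files copied, subdirectories left empty).

```python
import shutil
from pathlib import Path


def copy_top_level_only(src: Path, dst: Path) -> None:
    """Copy only the files directly inside src (reproduces the observed layout)."""
    dst.mkdir(parents=True, exist_ok=True)
    for entry in src.iterdir():
        if entry.is_file():
            shutil.copy2(entry, dst / entry.name)
        elif entry.is_dir():
            # The subdirectory is created, but nothing below it is copied.
            (dst / entry.name).mkdir(exist_ok=True)


def copy_recursively(src: Path, dst: Path) -> None:
    """Copy the whole hierarchy below src into dst.

    Extra files already present in dst (for example bin/busybox) are left
    in place; files with the same relative path are overwritten.
    """
    shutil.copytree(src, dst, dirs_exist_ok=True)
```

With the shallow variant, `bin/udhcpc` and `etc/udhcpc/default.script` never reach the destination, which matches the second tree listing. `shutil.copytree` with `dirs_exist_ok=True` requires Python 3.8 or later.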

After copying the whole directory hierarchy to the writable location (cp -a ~/GNS3/venv/lib/python3.11/site-packages/gns3server/compute/docker/resources ~/.local/share/GNS3/docker/), the DHCP client works as expected: with the link still suspended, it keeps running in the background instead of exiting.

alpine-1 console is now available... Press RETURN to get started.
udhcpc: started, v1.35.0
udhcpc: broadcasting discover
udhcpc: broadcasting discover
udhcpc: broadcasting discover
udhcpc failed to get a DHCP lease
udhcpc: no lease, forking to background
/ # ps
PID   USER     TIME  COMMAND
    1 root      0:00 /bin/sh
   32 root      0:00 /gns3/bin/busybox sh -c while true; do TERM=vt100 /gns3/bin/busybox sh; done
   38 root      0:00 /gns3/bin/busybox sh
   51 root      0:00 /tmp/gns3/bin/udhcpc -s /gns3/etc/udhcpc/default.script -t 3 -T 2 -A 1 -b -R -p /var/run/udhcpc.eth0.pid -i eth0 -x hostname:alpine-1
   55 root      0:00 ps
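
The udhcpc command line in the process list above can be read flag by flag. The sketch below rebuilds that argument list in Python; the function name `udhcpc_command` is hypothetical, and the flag descriptions in the comments follow the BusyBox udhcpc usage text as I understand it, so they should be checked against the BusyBox version in use.

```python
def udhcpc_command(interface: str, hostname: str) -> list[str]:
    """Rebuild the udhcpc argument list seen in the ps output (illustrative only)."""
    return [
        "/tmp/gns3/bin/udhcpc",
        "-s", "/gns3/etc/udhcpc/default.script",  # script run on DHCP events
        "-t", "3",  # send up to 3 discover packets per attempt
        "-T", "2",  # pause 2 seconds between packets
        "-A", "1",  # wait 1 second after a failed attempt before retrying
        "-b",       # go to the background if no lease is obtained
        "-R",       # release the IP address on exit
        "-p", f"/var/run/udhcpc.{interface}.pid",  # pidfile
        "-i", interface,
        "-x", f"hostname:{hostname}",  # send the hostname option to the server
    ]


if __name__ == "__main__":
    print(" ".join(udhcpc_command("eth0", "alpine-1")))
```

The `-b` flag corresponds to the "no lease, forking to background" message in the log above.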

b-ehlers commented 4 months ago

As PR #2355 is a backport from v3.0, version 3.0 may be affected as well. But I haven't tested this.

b-ehlers commented 4 months ago

Additionally, with PR #2355 the IP address obtained from DHCP is not assigned to the interface, very similar to issue #2159.

alpine-1 console is now available... Press RETURN to get started.
udhcpc: started, v1.35.0
udhcpc: broadcasting discover
udhcpc: broadcasting discover
udhcpc: broadcasting select for 172.30.0.167, server 172.30.0.254
udhcpc: lease of 172.30.0.167 obtained from 172.30.0.254, lease time 3600
/ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
7: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN qlen 1000
    link/ether f6:3e:46:1b:22:5c brd ff:ff:ff:ff:ff:ff
    inet6 fe80::f43e:46ff:fe1b:225c/64 scope link 
       valid_lft forever preferred_lft forever

This is also fixed when the missing files bin/udhcpc and etc/udhcpc/default.script are copied to the writable location.
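
As background on why a missing event script matters: udhcpc only negotiates the lease and leaves the interface configuration to the script it is given, passing the event name as the first argument and the lease details as environment variables. The sketch below is a Python illustration of what such a handler typically does. It is not the content of GNS3's default.script, which is not shown in this thread; the function name `commands_for_event` is hypothetical, and the 255.255.255.0 netmask in the usage note is an assumed value.

```python
import ipaddress


def commands_for_event(event: str, env: dict[str, str]) -> list[list[str]]:
    """Translate a udhcpc event into `ip` commands (illustrative only)."""
    iface = env["interface"]
    if event == "deconfig":
        # Drop any previous address before a new lease is requested.
        return [["ip", "addr", "flush", "dev", iface]]
    if event in ("bound", "renew"):
        # Convert the dotted netmask from the lease into a prefix length.
        netmask = env["subnet"]
        prefix = ipaddress.IPv4Network(f"0.0.0.0/{netmask}").prefixlen
        address = env["ip"]
        commands = [["ip", "addr", "add", f"{address}/{prefix}", "dev", iface]]
        routers = env.get("router", "").split()
        if routers:
            # Use the first router offered by the server as default gateway.
            commands.append(
                ["ip", "route", "add", "default", "via", routers[0], "dev", iface]
            )
        return commands
    # Other events (for example "leasefail") need no interface changes here.
    return []
```

For the lease in the log above, and assuming a netmask of 255.255.255.0 and a router of 172.30.0.254, the "bound" event would yield an `ip addr add 172.30.0.167/24 dev eth0` command followed by a default route command. Without a script, no such step runs, which is consistent with eth0 showing no IPv4 address.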

grossmj commented 4 months ago

Yes, the resources are not recursively copied to the writable location. I will push a fix soon. Thanks for catching that.

grossmj commented 4 months ago

@b-ehlers

The PR should fix the issue. Could you please check on your side? Thanks 👍

b-ehlers commented 4 months ago

Looks good. I merged this PR in my local installation and deleted the writable location (~/.local/share/GNS3/docker/). Then I tested both cases, DHCP with suspended link to NAT and DHCP with working link to NAT. Both tests were successful.

Furthermore, the writable location is now a full copy of the resources directory.
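
A check like this can also be scripted. The following is a minimal sketch, not part of GNS3; the function name `missing_files` is hypothetical. It lists every file that exists under the packaged resources directory but not under the writable location.

```python
from pathlib import Path


def missing_files(source: Path, copy: Path) -> list[str]:
    """Return relative paths of files under source that are absent under copy."""
    missing = []
    for path in sorted(source.rglob("*")):
        if path.is_file():
            relative = path.relative_to(source)
            if not (copy / relative).is_file():
                missing.append(relative.as_posix())
    return missing
```

Called with the two directories from this issue (the resources directory inside site-packages and ~/.local/share/GNS3/docker/resources), an empty list would indicate that nothing from the packaged hierarchy is missing. Extra files in the writable location, such as bin/busybox, are ignored.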

Good job.