canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0

Unable to Connect to LXD REST API - Port :8443 Not Listening #13004

Closed: whywaita closed this issue 6 months ago

whywaita commented 7 months ago

Required information

Name    Version       Rev    Tracking       Publisher   Notes
core20  20231123      2105   latest/stable  canonical✓  base,disabled
core20  20240111      2182   latest/stable  canonical✓  base
core22  20231123      1033   latest/stable  canonical✓  base,disabled
core22  20240111      1122   latest/stable  canonical✓  base
lxd     5.20-a8d6c52  26955  latest/stable  canonical✓  disabled
lxd     5.20-f3dd836  27049  latest/stable  canonical✓  -
snapd   2.60.4        20290  latest/stable  canonical✓  snapd,disabled
snapd   2.61.1        20671  latest/stable  canonical✓  snapd

Issue description

We use the LXD REST API on port 8443. We have encountered an issue where the LXD daemon stops listening on port 8443 even though the daemon process keeps running.
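
For reference, the HTTPS listener is controlled by the core.https_address server setting; a quick way to confirm what is configured (assuming the default snap install and the lxc client on the host) is:

$ lxc config get core.https_address
-> prints the configured listen address, e.g. [::]:8443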

Expected Behavior

$ sudo ss -antp | grep LISTEN | grep 8443
LISTEN    0      16384                        *:8443                          *:*            users:(("lxd",pid=340433,fd=22))
$ ps auxfww | grep [3]40433
root      340433 10.3  0.1 8434504 245884 ?      Sl   Feb29 174:35  \_ lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd

Current Behavior

$ sudo ss -antp | grep LISTEN | grep 8443
$
-> Port 8443 is not listening
$ ps auxfww | grep "[l]xd --logfile"
root      978614  9.9  0.1 8271916 283820 ?      Sl   Feb10 2952:47  \_ lxd --logfile /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
-> However, the LXD daemon process is still running
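
One check that might help narrow this down is whether the daemon still answers over its local Unix socket while the TCP listener is gone (the socket path below assumes the snap installation):

$ curl -s --unix-socket /var/snap/lxd/common/lxd/unix.socket lxd/1.0
-> if this returns the usual /1.0 JSON, only the HTTPS listener on :8443 is affected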

Context

We automatically create LXD containers for GitHub Actions runners. Occasionally, our daemon (implemented in Go) is unable to connect to LXD, which is how we discovered this situation. We have found that restarting snap.lxd.daemon seems to resolve the issue, as shown in the following logs:

Mar 01 16:49:46 myshoes-lxd-010 systemd[1]: Stopping Service for snap application lxd.daemon...
Mar 01 16:49:46 myshoes-lxd-010 lxd.daemon[2101994]: => Stop reason is: host shutdown
Mar 01 16:49:46 myshoes-lxd-010 lxd.daemon[2101994]: => Stopping LXD (with container shutdown)
Mar 01 16:50:16 myshoes-lxd-010 lxd.daemon[978614]: time="2024-03-01T16:50:16+09:00" level=warning msg="Failed shutting down instance, forcefully stopping" err="Failed shutting>
Mar 01 16:50:25 myshoes-lxd-010 lxd.daemon[978614]: time="2024-03-01T16:50:25+09:00" level=error msg="Failed to cleanly shutdown daemon" err="Shutdown endpoints: close tcp [::]>
Mar 01 16:50:25 myshoes-lxd-010 lxd.daemon[978614]: Error: Shutdown endpoints: close tcp [::]:8443: use of closed network connection
Mar 01 16:50:25 myshoes-lxd-010 lxd.daemon[978441]: => LXD failed with return code 1
Mar 01 16:50:25 myshoes-lxd-010 systemd[1]: snap.lxd.daemon.service: Main process exited, code=exited, status=1/FAILURE
Mar 01 16:50:25 myshoes-lxd-010 lxd.daemon[2101994]: ==> Stopped LXD
Mar 01 16:50:25 myshoes-lxd-010 lxd.daemon[2101994]: => Stopping LXCFS
Mar 01 16:50:25 myshoes-lxd-010 lxd.daemon[1230]: Running destructor lxcfs_exit
Mar 01 16:50:26 myshoes-lxd-010 lxd.daemon[2101994]: ==> Stopped LXCFS
Mar 01 16:50:26 myshoes-lxd-010 lxd.daemon[2101994]: => Cleaning up PID files
Mar 01 16:50:26 myshoes-lxd-010 lxd.daemon[2101994]: => Cleaning up namespaces
Mar 01 16:50:26 myshoes-lxd-010 lxd.daemon[2101994]: => All done
Mar 01 16:50:26 myshoes-lxd-010 systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
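
For completeness, the workaround we apply today is a plain restart of the snap service named in the journal above, after which the listener returns:

$ sudo systemctl restart snap.lxd.daemon
$ sudo ss -antp | grep LISTEN | grep 8443
-> :8443 is listening again after the restart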

Any advice or guidance on how to resolve this issue would be greatly appreciated.

Steps to reproduce

We have not been able to find reliable reproduction steps yet.
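
Until we have a reproducer, a simple external check along these lines can at least flag when the listener disappears (a rough sketch; the interval is arbitrary):

$ while true; do ss -ltn | grep -q ':8443 ' || echo "$(date) LXD not listening on :8443"; sleep 60; done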

Information to attach

tomponline commented 7 months ago

We normally encourage support requests to be posted over at https://discourse.ubuntu.com/c/lxd/support/149

In this case it does not appear you have been able to identify reproducer steps to make the issue occur. Is this correct?

tomponline commented 6 months ago

Closing due to lack of response and because this would be better posted on the forum.