canonical / lxd


Can't lxc list, no unix.socket connection refused #5423

Closed: 19wolf closed this issue 5 years ago

19wolf commented 5 years ago

Required information

$ uname -a
 Linux nephele 4.18.0-041800-generic #201808122131 SMP Sun Aug 12 21:33:20 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ sudo lsb_release -a
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.1 LTS
Release:    18.04
Codename:   bionic

$ sudo snap info lxd | grep installed
installed:       3.9                    (9919) 54MB -

$ sudo lxd info
WARN[01-18|14:46:58] CGroup memory swap accounting is disabled, swap limits will be ignored. 
EROR[01-18|14:46:58] Failed to start the daemon: Listen to cluster address: listen tcp 0.0.0.0:8443: bind: address already in use 
Error: Listen to cluster address: listen tcp 0.0.0.0:8443: bind: address already in use

Issue description

Most lxc commands fail with Error: Get http://unix.socket/1.0: dial unix /var/snap/lxd/common/lxd/unix.socket: connect: connection refused. The one exception is right after restarting the socket unit, when the error changes:

$ sudo systemctl restart snap.lxd.daemon.unix.socket && sudo lxc list
Error: Get http://unix.socket/1.0: read unix @->/var/snap/lxd/common/lxd/unix.socket: read: connection reset by peer

...and then it goes back to the other error.

Steps to reproduce

I haven't really been playing with my server a lot recently. The only thing I can think of that may have caused this is that I tried, unsuccessfully, to connect a new computer to make a cluster. I didn't change anything on this side, though. Before I ran through various steps from other threads, lxc list and the like would just hang indefinitely, if that's relevant.

stgraber commented 5 years ago

So you seem to be doing a few weird things here. You're running lxd info rather than lxc info, which caused some confusion. You're also restarting the unix socket unit without restarting the main unit, which may cause some pretty weird situations depending on what systemd thinks that means.

So a few things to get a baseline of what the system looks like:

Then with that, we should be able to figure out what's going on and get things back to normal.

19wolf commented 5 years ago

sudo lxc info gives the same error as any other lxc command: Error: Get http://unix.socket/1.0: dial unix /var/snap/lxd/common/lxd/unix.socket: connect: connection refused and sometimes Error: Get http://unix.socket/1.0: dial unix /var/snap/lxd/common/lxd/unix.socket: connect: no such file or directory

https://pastebin.com/uP0uLg6w

I should note I tried sudo snap revert lxd a bit after submitting the report. It failed.

$ sudo snap revert lxd
error: cannot perform the following tasks:
- Stop snap "lxd" services ([start snap.lxd.activate.service] failed with exit status 1: Job for snap.lxd.activate.service failed because the control process exited with error code.
See "systemctl status snap.lxd.activate.service" and "journalctl -xe" for details.
)
- Start snap "lxd" (9886) services ([start snap.lxd.activate.service] failed with exit status 1: Job for snap.lxd.activate.service failed because the control process exited with error code.
See "systemctl status snap.lxd.activate.service" and "journalctl -xe" for details.
)
stgraber commented 5 years ago

You have something listening on port 8443 which is preventing LXD from binding that port and starting.

Can you show sudo netstat -lnp?
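As a side note on what "address already in use" means in practice, here is a minimal Python sketch that probes whether a TCP port can be bound. It is only an illustration of the error class reported by the daemon, not a replacement for netstat, and `port_is_free` is a hypothetical helper name. The demo holds an ephemeral port rather than 8443 so it does not depend on the state of any real system.

```python
import errno
import socket


def port_is_free(host: str, port: int) -> bool:
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            # EADDRINUSE is the errno behind "bind: address already in use".
            if exc.errno == errno.EADDRINUSE:
                return False
            raise
        return True


if __name__ == "__main__":
    # Hold an ephemeral port (port 0 lets the kernel pick one) for the demo.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as holder:
        holder.bind(("127.0.0.1", 0))
        holder.listen()
        busy_port = holder.getsockname()[1]
        print("free while held?", port_is_free("127.0.0.1", busy_port))
```

Unlike netstat -lnp, a probe like this cannot say which process owns the port; it only confirms that a bind would fail.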

19wolf commented 5 years ago
$ sudo netstat -lnp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:548             0.0.0.0:*               LISTEN      57037/afpd          
tcp        0      0 0.0.0.0:139             0.0.0.0:*               LISTEN      58270/smbd          
tcp        0      0 127.0.0.1:6379          0.0.0.0:*               LISTEN      30145/redis-server  
tcp        0      0 0.0.0.0:9419            0.0.0.0:*               LISTEN      62879/mfsmaster     
tcp        0      0 0.0.0.0:9420            0.0.0.0:*               LISTEN      62879/mfsmaster     
tcp        0      0 0.0.0.0:9421            0.0.0.0:*               LISTEN      62879/mfsmaster     
tcp        0      0 0.0.0.0:5901            0.0.0.0:*               LISTEN      50551/Xtightvnc     
tcp        0      0 127.0.0.1:51981         0.0.0.0:*               LISTEN      20454/mfsmount      
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      40384/rpcbind       
tcp        0      0 0.0.0.0:9424            0.0.0.0:*               LISTEN      62879/mfsmaster     
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      10976/nginx: master 
tcp        0      0 0.0.0.0:6001            0.0.0.0:*               LISTEN      50551/Xtightvnc     
tcp        0      0 0.0.0.0:9425            0.0.0.0:*               LISTEN      29613/python        
tcp        0      0 127.0.0.53:53           0.0.0.0:*               LISTEN      18779/systemd-resol 
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      61861/sshd          
tcp        0      0 0.0.0.0:3000            0.0.0.0:*               LISTEN      56201/ntopng        
tcp        0      0 0.0.0.0:9500            0.0.0.0:*               LISTEN      61702/mfschunkserve 
tcp        0      0 0.0.0.0:445             0.0.0.0:*               LISTEN      58270/smbd          
tcp        0      0 127.0.0.1:8125          0.0.0.0:*               LISTEN      35695/netdata       
tcp        0      0 0.0.0.0:9501            0.0.0.0:*               LISTEN      61724/mfschunkserve 
tcp        0      0 0.0.0.0:9502            0.0.0.0:*               LISTEN      61746/mfschunkserve 
tcp        0      0 0.0.0.0:19999           0.0.0.0:*               LISTEN      35695/netdata       
tcp        0      0 0.0.0.0:9503            0.0.0.0:*               LISTEN      61768/mfschunkserve 
tcp        0      0 0.0.0.0:9504            0.0.0.0:*               LISTEN      33834/mfschunkserve 
tcp6       0      0 ::1:6379                :::*                    LISTEN      30145/redis-server  
tcp6       0      0 :::139                  :::*                    LISTEN      58270/smbd          
tcp6       0      0 :::111                  :::*                    LISTEN      40384/rpcbind       
tcp6       0      0 :::80                   :::*                    LISTEN      10976/nginx: master 
tcp6       0      0 :::4949                 :::*                    LISTEN      50007/perl          
tcp6       0      0 :::22                   :::*                    LISTEN      61861/sshd          
tcp6       0      0 ::1:4700                :::*                    LISTEN      57031/cnid_metad    
tcp6       0      0 ::1:8125                :::*                    LISTEN      35695/netdata       
tcp6       0      0 :::445                  :::*                    LISTEN      58270/smbd          
tcp6       0      0 :::19999                :::*                    LISTEN      35695/netdata       
udp        0      0 169.254.92.118:49013    0.0.0.0:*                           56201/ntopng        
udp        0      0 127.0.0.1:8125          0.0.0.0:*                           35695/netdata       
udp        0      0 127.0.0.53:53           0.0.0.0:*                           18779/systemd-resol 
udp        0      0 192.168.1.1:68          0.0.0.0:*                           18757/systemd-netwo 
udp        0      0 0.0.0.0:111             0.0.0.0:*                           40384/rpcbind       
udp        0      0 169.254.255.255:137     0.0.0.0:*                           58412/nmbd          
udp        0      0 169.254.122.203:137     0.0.0.0:*                           58412/nmbd          
udp        0      0 172.19.92.255:137       0.0.0.0:*                           58412/nmbd          
udp        0      0 172.19.92.1:137         0.0.0.0:*                           58412/nmbd          
udp        0      0 192.168.7.255:137       0.0.0.0:*                           58412/nmbd          
udp        0      0 192.168.1.1:137         0.0.0.0:*                           58412/nmbd          
udp        0      0 0.0.0.0:137             0.0.0.0:*                           58412/nmbd          
udp        0      0 169.254.255.255:138     0.0.0.0:*                           58412/nmbd          
udp        0      0 169.254.122.203:138     0.0.0.0:*                           58412/nmbd          
udp        0      0 172.19.92.255:138       0.0.0.0:*                           58412/nmbd          
udp        0      0 172.19.92.1:138         0.0.0.0:*                           58412/nmbd          
udp        0      0 192.168.7.255:138       0.0.0.0:*                           58412/nmbd          
udp        0      0 192.168.1.1:138         0.0.0.0:*                           58412/nmbd          
udp        0      0 0.0.0.0:138             0.0.0.0:*                           58412/nmbd          
udp        0      0 0.0.0.0:704             0.0.0.0:*                           40384/rpcbind       
udp        0      0 169.254.151.78:33689    0.0.0.0:*                           56201/ntopng        
udp        0      0 169.254.115.177:52247   0.0.0.0:*                           56201/ntopng        
udp        0      0 0.0.0.0:44124           0.0.0.0:*                           56983/avahi-daemon: 
udp        0      0 192.168.1.1:44469       0.0.0.0:*                           56201/ntopng        
udp        0      0 169.254.55.232:54341    0.0.0.0:*                           56201/ntopng        
udp        0      0 0.0.0.0:5353            0.0.0.0:*                           56983/avahi-daemon: 
udp        0      0 169.254.156.232:54662   0.0.0.0:*                           56201/ntopng        
udp        0      0 172.19.92.1:48023       0.0.0.0:*                           56201/ntopng        
udp        0      0 0.0.0.0:40026           0.0.0.0:*                           56201/ntopng        
udp6       0      0 ::1:8125                :::*                                35695/netdata       
udp6       0      0 :::111                  :::*                                40384/rpcbind       
udp6       0      0 fe80::9618:82ff:fe3:546 :::*                                18757/systemd-netwo 
udp6       0      0 :::704                  :::*                                40384/rpcbind       
udp6       0      0 :::5353                 :::*                                56983/avahi-daemon: 
udp6       0      0 :::46419                :::*                                56983/avahi-daemon: 
raw6       0      0 :::58                   :::*                    7           18757/systemd-netwo 
Active UNIX domain sockets (only servers)
Proto RefCnt Flags       Type       State         I-Node   PID/Program name     Path
unix  2      [ ACC ]     STREAM     LISTENING     56042022 40364/ssh-agent      /tmp/ssh-R8iOjR3gisr6/agent.40363
unix  2      [ ACC ]     STREAM     LISTENING     19463    1/systemd            /run/lvm/lvmetad.socket
unix  2      [ ACC ]     STREAM     LISTENING     56075155 45038/ssh-agent      /tmp/ssh-FkzqqqopoqiC/agent.45029
unix  2      [ ACC ]     STREAM     LISTENING     56107550 50610/menu-cached    /run/user/1000/menu-cached-:1
unix  2      [ ACC ]     SEQPACKET  LISTENING     19487    1/systemd            /run/udev/control
unix  2      [ ACC ]     STREAM     LISTENING     28759    1471/irqbalance      @irqbalance1471.sock
unix  2      [ ACC ]     STREAM     LISTENING     38513441 65504/php-fpm: mast  /run/php/php7.2-fpm.sock
unix  2      [ ACC ]     STREAM     LISTENING     19490    1/systemd            /run/systemd/journal/stdout
unix  2      [ ACC ]     STREAM     LISTENING     56109104 50644/pulseaudio     /run/user/1000/pulse/native
unix  2      [ ACC ]     STREAM     LISTENING     254547   38368/systemd        /run/user/1000/gnupg/S.gpg-agent.browser
unix  2      [ ACC ]     STREAM     LISTENING     254549   38368/systemd        /run/user/1000/gnupg/S.gpg-agent.ssh
unix  2      [ ACC ]     STREAM     LISTENING     254551   38368/systemd        /run/user/1000/gnupg/S.gpg-agent
unix  2      [ ACC ]     STREAM     LISTENING     254553   38368/systemd        /run/user/1000/gnupg/S.gpg-agent.extra
unix  2      [ ACC ]     STREAM     LISTENING     254555   37359/dbus-daemon    /run/user/1000/bus
unix  2      [ ACC ]     STREAM     LISTENING     254557   38368/systemd        /run/user/1000/gnupg/S.dirmngr
unix  2      [ ACC ]     STREAM     LISTENING     56107433 50551/Xtightvnc      /tmp/.X11-unix/X1
unix  2      [ ACC ]     STREAM     LISTENING     56107559 50616/ssh-agent      /tmp/ssh-MZ6rPjUc9mDk/agent.50585
unix  2      [ ACC ]     STREAM     LISTENING     25697    1/systemd            /run/snapd.socket
unix  2      [ ACC ]     STREAM     LISTENING     25700    1/systemd            /run/snapd-snap.socket
unix  2      [ ACC ]     STREAM     LISTENING     56030879 39073/ssh-agent      /tmp/ssh-Xc62oacMeIVv/agent.39072
unix  2      [ ACC ]     STREAM     LISTENING     25707    1/systemd            /run/acpid.socket
unix  2      [ ACC ]     STREAM     LISTENING     25710    1/systemd            /var/run/dbus/system_bus_socket
unix  2      [ ACC ]     STREAM     LISTENING     25714    1/systemd            /run/avahi-daemon/socket
unix  2      [ ACC ]     STREAM     LISTENING     25718    1/systemd            /var/run/libvirt/virtlogd-sock
unix  2      [ ACC ]     STREAM     LISTENING     25721    1/systemd            /run/uuidd/request
unix  2      [ ACC ]     STREAM     LISTENING     490792321 1/systemd            /var/run/libvirt/virtlockd-sock
unix  2      [ ACC ]     STREAM     LISTENING     56105881 50576/pcmanfm        /run/user/1000/pcmanfm-socket--1
unix  2      [ ACC ]     STREAM     LISTENING     261024   1/systemd            /run/rpcbind.sock
unix  2      [ ACC ]     STREAM     LISTENING     490858657 58412/nmbd           /var/run/samba/nmbd/unexpected
unix  2      [ ACC ]     STREAM     LISTENING     255909   38368/systemd        /run/user/1000/systemd/private
unix  2      [ ACC ]     STREAM     LISTENING     56065374 43719/ssh-agent      /tmp/ssh-jnhEHuSTBK2X/agent.43569
unix  2      [ ACC ]     STREAM     LISTENING     490709215 56496/iscsid         @ISCSIADM_ABSTRACT_NAMESPACE
unix  2      [ ACC ]     STREAM     LISTENING     489646555 1/systemd            /run/systemd/private
unix  2      [ ACC ]     STREAM     LISTENING     1501     1/systemd            /run/lvm/lvmpolld.socket
unix  2      [ ACC ]     STREAM     LISTENING     56027386 37410/dbus-daemon    @/tmp/dbus-peCBynn1YC
unix  2      [ ACC ]     STREAM     LISTENING     3264832  25259/tmux           /tmp/tmux-1000/default
unix  2      [ ACC ]     STREAM     LISTENING     490792952 57713/libvirtd       /var/run/libvirt/libvirt-sock
unix  2      [ ACC ]     STREAM     LISTENING     490792954 57713/libvirtd       /var/run/libvirt/libvirt-sock-ro
unix  2      [ ACC ]     STREAM     LISTENING     490792956 57713/libvirtd       /var/run/libvirt/libvirt-admin-sock
19wolf commented 5 years ago

The latest error is slightly different

$ lxc list
Error: Get http://unix.socket/1.0: dial unix /var/snap/lxd/common/lxd/unix.socket: connect: no such file or directory
19wolf commented 5 years ago

A system reboot did not solve this

stgraber commented 5 years ago

Ok, can you do:

Posting the output of all of those?

19wolf commented 5 years ago

https://pastebin.com/Tm4QGurM

19wolf commented 5 years ago

It looks a little like I don't even have the unix socket..?

$ sudo ls /var/snap/lxd/common/lxd/unix.socket
ls: cannot access '/var/snap/lxd/common/lxd/unix.socket': No such file or directory

Is there an easy way to get it back?

stgraber commented 5 years ago

That's odd, we keep getting back to port 8443 being busy on your system...

Can you repeat the instructions from above, then at the end also run:

stgraber commented 5 years ago

Oh, actually, it may be a weird port conflict internal to LXD.

Can you also show:

You may need to install the sqlite3 package for that.

stgraber commented 5 years ago

The error would be consistent with a bad configuration of cluster address vs https address, though I'm not sure why this wasn't caught during configuration. The above output should tell us more, hopefully.
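To illustrate how a single process can run into "address already in use" with nothing else on the port, the Python sketch below binds a listener to a specific address and then tries to bind a second one to the wildcard address on the same port. This only demonstrates the general socket behaviour on Linux; whether LXD 3.9 binds its two addresses in exactly this way is an assumption, and the function names are hypothetical.

```python
import errno
import socket


def bind_listener(host: str, port: int) -> socket.socket:
    """Bind and listen on host:port, closing the socket if that fails."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen()
    except OSError:
        sock.close()
        raise
    return sock


def wildcard_conflicts_with_specific() -> bool:
    """Return True if binding 0.0.0.0:P fails while 127.0.0.1:P is held."""
    first = bind_listener("127.0.0.1", 0)  # port 0: kernel picks a free port
    port = first.getsockname()[1]
    try:
        second = bind_listener("0.0.0.0", port)
    except OSError as exc:
        return exc.errno == errno.EADDRINUSE
    else:
        second.close()
        return False
    finally:
        first.close()


if __name__ == "__main__":
    print("wildcard bind rejected:", wildcard_conflicts_with_specific())
```

In this situation netstat would show no foreign process on the port once the daemon has exited, which fits the empty grep result reported later in the thread.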

19wolf commented 5 years ago

According to netstat, there is nothing on 8443 (and nothing when grepping for lxd or lxc either).

I notice INSERT INTO config VALUES(3,'cluster.https_address','[::]:8443'); and INSERT INTO config VALUES(4,'core.https_address','192.168.1.1:8443'); don't match, but the system's address is 192.168.1.1.

[15:07:19] nephele@nephele:~$ sudo netstat -lnp | grep 8443
[15:07:23] nephele@nephele:~$ sudo sqlite3 /var/snap/lxd/common/lxd/database/local.db .dump
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE schema (
    id         INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    version    INTEGER NOT NULL,
    updated_at DATETIME NOT NULL,
    UNIQUE (version)
);
INSERT INTO schema VALUES(1,37,1529720435);
INSERT INTO schema VALUES(2,38,1544680613);
CREATE TABLE config (
    id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    key VARCHAR(255) NOT NULL,
    value TEXT,
    UNIQUE (key)
);
INSERT INTO config VALUES(3,'cluster.https_address','[::]:8443');
INSERT INTO config VALUES(4,'core.https_address','192.168.1.1:8443');
CREATE TABLE patches (
    id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    name VARCHAR(255) NOT NULL,
    applied_at DATETIME NOT NULL,
    UNIQUE (name)
);
INSERT INTO patches VALUES(1,'invalid_profile_names',1529720435);
INSERT INTO patches VALUES(2,'leftover_profile_config',1529720435);
INSERT INTO patches VALUES(3,'network_permissions',1529720435);
INSERT INTO patches VALUES(4,'storage_api',1529720435);
INSERT INTO patches VALUES(5,'storage_api_v1',1529720435);
INSERT INTO patches VALUES(6,'storage_api_dir_cleanup',1529720435);
INSERT INTO patches VALUES(7,'storage_api_lvm_keys',1529720435);
INSERT INTO patches VALUES(8,'storage_api_keys',1529720435);
INSERT INTO patches VALUES(9,'storage_api_update_storage_configs',1529720435);
INSERT INTO patches VALUES(10,'storage_api_lxd_on_btrfs',1529720435);
INSERT INTO patches VALUES(11,'storage_api_lvm_detect_lv_size',1529720435);
INSERT INTO patches VALUES(12,'storage_api_insert_zfs_driver',1529720435);
INSERT INTO patches VALUES(13,'storage_zfs_noauto',1529720435);
INSERT INTO patches VALUES(14,'storage_zfs_volume_size',1529720435);
INSERT INTO patches VALUES(15,'network_dnsmasq_hosts',1529720435);
INSERT INTO patches VALUES(16,'storage_api_dir_bind_mount',1529720435);
INSERT INTO patches VALUES(17,'fix_uploaded_at',1529720435);
INSERT INTO patches VALUES(18,'storage_api_ceph_size_remove',1529720435);
INSERT INTO patches VALUES(19,'devices_new_naming_scheme',1529720435);
INSERT INTO patches VALUES(20,'storage_api_permissions',1529720435);
INSERT INTO patches VALUES(21,'container_config_regen',1531453517);
INSERT INTO patches VALUES(22,'lvm_node_specific_config_keys',1532726207);
INSERT INTO patches VALUES(23,'candid_rename_config_key',1534281880);
INSERT INTO patches VALUES(24,'move_backups',1536801812);
INSERT INTO patches VALUES(25,'storage_api_rename_container_snapshots_dir',1539293314);
INSERT INTO patches VALUES(26,'shrink_logs_db_file',1541499067);
INSERT INTO patches VALUES(27,'storage_api_rename_container_snapshots_links',1541747171);
CREATE TABLE raft_nodes (
    id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    address TEXT NOT NULL,
    UNIQUE (address)
);
INSERT INTO raft_nodes VALUES(1,'[::]:8443');
DELETE FROM sqlite_sequence;
INSERT INTO sqlite_sequence VALUES('schema',2);
INSERT INTO sqlite_sequence VALUES('patches',27);
INSERT INTO sqlite_sequence VALUES('config',4);
INSERT INTO sqlite_sequence VALUES('raft_nodes',4);
COMMIT;
stgraber commented 5 years ago
INSERT INTO config VALUES(3,'cluster.https_address','[::]:8443');
INSERT INTO config VALUES(4,'core.https_address','192.168.1.1:8443');

Ok, so that's indeed the problem. I wonder how you ended up with those values in there, as they shouldn't be possible based on the checks that @freeekanayaka put in place for this feature...

cluster.https_address should never be allowed to have a wildcard address; that's going to cause a whole bunch of problems...
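For illustration, a check of the kind described could look like the Python sketch below, which flags `host:port` strings whose host is the unspecified address (0.0.0.0 or ::). This is not LXD's actual validation code; `is_wildcard_address` is a hypothetical helper, and the sketch assumes the value always carries an explicit host.

```python
import ipaddress
from urllib.parse import urlsplit


def is_wildcard_address(address: str) -> bool:
    """Return True if the host part of 'host:port' is 0.0.0.0 or ::."""
    # Prefixing '//' lets urlsplit parse bracketed IPv6 hosts such as [::].
    host = urlsplit("//" + address).hostname
    if host is None:
        raise ValueError(f"no host in address: {address!r}")
    try:
        return ipaddress.ip_address(host).is_unspecified
    except ValueError:
        # Not a literal IP (for example a DNS name), so not a wildcard.
        return False


if __name__ == "__main__":
    for value in ("[::]:8443", "192.168.1.1:8443"):
        print(value, "wildcard" if is_wildcard_address(value) else "specific")
```

Applied to the two config rows quoted above, the first value is flagged as a wildcard and the second is not.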

Anyway, fixing this should just be a matter of doing:

That's assuming that 192.168.1.1 is a valid IP address for your system.
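A minimal sketch of the kind of change being described, assuming it amounts to rewriting the cluster.https_address row in the config table of local.db (the later dump in this thread shows that row holding 192.168.1.1:8443). It runs against an in-memory copy of the config schema from the dump above, not the live database, and `set_cluster_address` is a hypothetical helper. On a real system the daemon would need to be stopped first and the database backed up.

```python
import sqlite3

# Schema and rows copied from the local.db dump earlier in the thread.
CONFIG_DUMP = """
CREATE TABLE config (
    id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    key VARCHAR(255) NOT NULL,
    value TEXT,
    UNIQUE (key)
);
INSERT INTO config VALUES(3,'cluster.https_address','[::]:8443');
INSERT INTO config VALUES(4,'core.https_address','192.168.1.1:8443');
"""


def open_sample_config() -> sqlite3.Connection:
    """Build an in-memory database holding only the config table."""
    db = sqlite3.connect(":memory:")
    db.executescript(CONFIG_DUMP)
    return db


def set_cluster_address(db: sqlite3.Connection, address: str) -> int:
    """Point cluster.https_address at a specific address; return rows changed."""
    cur = db.execute(
        "UPDATE config SET value = ? WHERE key = 'cluster.https_address'",
        (address,),
    )
    db.commit()
    return cur.rowcount


if __name__ == "__main__":
    db = open_sample_config()
    set_cluster_address(db, "192.168.1.1:8443")
    for row in db.execute("SELECT key, value FROM config ORDER BY id"):
        print(row)
```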

19wolf commented 5 years ago

Alright that seems to have done the trick, except it's hanging at "sudo lxc info", which is actually the original problem I was trying to fix when I messed everything up... I'll search the other tickets and open a new one if need be.

Thank you for your help!

19wolf commented 5 years ago

So it seems I'm having the same issue as #5079 and #4608, but the fixes in those threads seem to be the same 'fix the https_address' that we had here, and it's not working for me. What could be wrong?

stgraber commented 5 years ago

Ok, let's just get another baseline of where things are at now. Can you do:

That should show more of what's happening during daemon startup.

19wolf commented 5 years ago

I threw in a database dump for good measure

$ sudo systemctl stop snap.lxd.daemon.service snap.lxd.daemon.unix.socket
[sudo] password for nephele: 
[12:32:43] nephele@nephele:~$ sudo pkill -9 lxd
[12:32:47] nephele@nephele:~$ sudo lxd --debug --group lxd
DBUG[01-25|12:32:51] Connecting to a local LXD over a Unix socket 
DBUG[01-25|12:32:51] Sending request to LXD                   method=GET url=http://unix.socket/1.0 etag=
INFO[01-25|12:32:51] LXD 3.9 is starting in normal mode       path=/var/snap/lxd/common/lxd
INFO[01-25|12:32:51] Kernel uid/gid map: 
INFO[01-25|12:32:51]  - u 0 0 4294967295 
INFO[01-25|12:32:51]  - g 0 0 4294967295 
INFO[01-25|12:32:51] Configured LXD uid/gid map: 
INFO[01-25|12:32:51]  - u 0 1000000 1000000000 
INFO[01-25|12:32:51]  - g 0 1000000 1000000000 
WARN[01-25|12:32:51] CGroup memory swap accounting is disabled, swap limits will be ignored. 
INFO[01-25|12:32:51] Kernel features: 
INFO[01-25|12:32:51]  - netnsid-based network retrieval: no 
INFO[01-25|12:32:51]  - uevent injection: yes 
INFO[01-25|12:32:51]  - unprivileged file capabilities: yes 
INFO[01-25|12:32:51] Initializing local database 
DBUG[01-25|12:32:51] Initializing database gateway 
DBUG[01-25|12:32:51] Connecting to a local LXD over a Unix socket 
DBUG[01-25|12:32:51] Sending request to LXD                   method=GET url=http://unix.socket/1.0 etag=
DBUG[01-25|12:32:51] Detected stale unix socket, deleting 
DBUG[01-25|12:32:51] Detected stale unix socket, deleting 
INFO[01-25|12:32:51] Starting /dev/lxd handler: 
INFO[01-25|12:32:51]  - binding devlxd socket                 socket=/var/snap/lxd/common/lxd/devlxd/sock
INFO[01-25|12:32:51] REST API daemon: 
INFO[01-25|12:32:51]  - binding Unix socket                   socket=/var/snap/lxd/common/lxd/unix.socket
INFO[01-25|12:32:51]  - binding TCP socket                    socket=192.168.1.1:8443
INFO[01-25|12:32:51] Initializing global database 
DBUG[01-25|12:32:51] Dqlite: server connection failed err=failed to establish network connection: Head https://[::]:8443/internal/database: dial tcp [::]:8443: connect: connection refused address=[::]:8443 attempt=0 
DBUG[01-25|12:32:51] Dqlite: connection failed err=no available dqlite leader server found attempt=0 
DBUG[01-25|12:32:52] Dqlite: server connection failed err=failed to establish network connection: Head https://[::]:8443/internal/database: dial tcp [::]:8443: connect: connection refused address=[::]:8443 attempt=1 
DBUG[01-25|12:32:52] Dqlite: connection failed err=no available dqlite leader server found attempt=1 
DBUG[01-25|12:32:52] Dqlite: server connection failed err=failed to establish network connection: Head https://[::]:8443/internal/database: dial tcp [::]:8443: connect: connection refused address=[::]:8443 attempt=2 
DBUG[01-25|12:32:52] Dqlite: connection failed err=no available dqlite leader server found attempt=2 
DBUG[01-25|12:32:52] Dqlite: server connection failed err=failed to establish network connection: Head https://[::]:8443/internal/database: dial tcp [::]:8443: connect: connection refused address=[::]:8443 attempt=3 
DBUG[01-25|12:32:52] Dqlite: connection failed err=no available dqlite leader server found attempt=3 
DBUG[01-25|12:32:53] Dqlite: server connection failed err=failed to establish network connection: Head https://[::]:8443/internal/database: dial tcp [::]:8443: connect: connection refused address=[::]:8443 attempt=4 
DBUG[01-25|12:32:53] Dqlite: connection failed err=no available dqlite leader server found attempt=4 
DBUG[01-25|12:32:54] Dqlite: server connection failed err=failed to establish network connection: Head https://[::]:8443/internal/database: dial tcp [::]:8443: connect: connection refused address=[::]:8443 attempt=5 
DBUG[01-25|12:32:54] Dqlite: connection failed err=no available dqlite leader server found attempt=5 
DBUG[01-25|12:32:55] Dqlite: server connection failed err=failed to establish network connection: Head https://[::]:8443/internal/database: dial tcp [::]:8443: connect: connection refused address=[::]:8443 attempt=6 
DBUG[01-25|12:32:55] Dqlite: connection failed err=no available dqlite leader server found attempt=6 
DBUG[01-25|12:32:56] Dqlite: server connection failed err=failed to establish network connection: Head https://[::]:8443/internal/database: dial tcp [::]:8443: connect: connection refused address=[::]:8443 attempt=7 
DBUG[01-25|12:32:56] Dqlite: connection failed err=no available dqlite leader server found attempt=7 
DBUG[01-25|12:32:57] Dqlite: server connection failed err=failed to establish network connection: Head https://[::]:8443/internal/database: dial tcp [::]:8443: connect: connection refused address=[::]:8443 attempt=8 
DBUG[01-25|12:32:57] Dqlite: connection failed err=no available dqlite leader server found attempt=8 
^C
[12:32:57] nephele@nephele:~$ sudo sqlite3 /var/snap/lxd/common/lxd/database/local.db .dump
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE schema (
    id         INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    version    INTEGER NOT NULL,
    updated_at DATETIME NOT NULL,
    UNIQUE (version)
);
INSERT INTO schema VALUES(1,37,1529720435);
INSERT INTO schema VALUES(2,38,1544680613);
CREATE TABLE config (
    id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    key VARCHAR(255) NOT NULL,
    value TEXT,
    UNIQUE (key)
);
INSERT INTO config VALUES(3,'cluster.https_address','192.168.1.1:8443');
INSERT INTO config VALUES(4,'core.https_address','192.168.1.1:8443');
CREATE TABLE patches (
    id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    name VARCHAR(255) NOT NULL,
    applied_at DATETIME NOT NULL,
    UNIQUE (name)
);
INSERT INTO patches VALUES(1,'invalid_profile_names',1529720435);
INSERT INTO patches VALUES(2,'leftover_profile_config',1529720435);
INSERT INTO patches VALUES(3,'network_permissions',1529720435);
INSERT INTO patches VALUES(4,'storage_api',1529720435);
INSERT INTO patches VALUES(5,'storage_api_v1',1529720435);
INSERT INTO patches VALUES(6,'storage_api_dir_cleanup',1529720435);
INSERT INTO patches VALUES(7,'storage_api_lvm_keys',1529720435);
INSERT INTO patches VALUES(8,'storage_api_keys',1529720435);
INSERT INTO patches VALUES(9,'storage_api_update_storage_configs',1529720435);
INSERT INTO patches VALUES(10,'storage_api_lxd_on_btrfs',1529720435);
INSERT INTO patches VALUES(11,'storage_api_lvm_detect_lv_size',1529720435);
INSERT INTO patches VALUES(12,'storage_api_insert_zfs_driver',1529720435);
INSERT INTO patches VALUES(13,'storage_zfs_noauto',1529720435);
INSERT INTO patches VALUES(14,'storage_zfs_volume_size',1529720435);
INSERT INTO patches VALUES(15,'network_dnsmasq_hosts',1529720435);
INSERT INTO patches VALUES(16,'storage_api_dir_bind_mount',1529720435);
INSERT INTO patches VALUES(17,'fix_uploaded_at',1529720435);
INSERT INTO patches VALUES(18,'storage_api_ceph_size_remove',1529720435);
INSERT INTO patches VALUES(19,'devices_new_naming_scheme',1529720435);
INSERT INTO patches VALUES(20,'storage_api_permissions',1529720435);
INSERT INTO patches VALUES(21,'container_config_regen',1531453517);
INSERT INTO patches VALUES(22,'lvm_node_specific_config_keys',1532726207);
INSERT INTO patches VALUES(23,'candid_rename_config_key',1534281880);
INSERT INTO patches VALUES(24,'move_backups',1536801812);
INSERT INTO patches VALUES(25,'storage_api_rename_container_snapshots_dir',1539293314);
INSERT INTO patches VALUES(26,'shrink_logs_db_file',1541499067);
INSERT INTO patches VALUES(27,'storage_api_rename_container_snapshots_links',1541747171);
CREATE TABLE raft_nodes (
    id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    address TEXT NOT NULL,
    UNIQUE (address)
);
INSERT INTO raft_nodes VALUES(1,'[::]:8443');
DELETE FROM sqlite_sequence;
INSERT INTO sqlite_sequence VALUES('schema',2);
INSERT INTO sqlite_sequence VALUES('patches',27);
INSERT INTO sqlite_sequence VALUES('config',4);
INSERT INTO sqlite_sequence VALUES('raft_nodes',4);
COMMIT;
stgraber commented 5 years ago

Yeah, so we indeed need to fix the raft_nodes table too.

Can you run:

See if that helps.
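As a hedged illustration of the mismatch visible in the dump above (config already updated, raft_nodes still holding `[::]:8443`), the Python sketch below finds raft_nodes rows whose address differs from cluster.https_address and aligns them. It assumes a single-node setup, which is why it refuses to touch more than one row; on a real cluster raft_nodes lists other members and must not be overwritten blindly. It works on an in-memory copy of the two tables, and both function names are hypothetical.

```python
import sqlite3

# The two relevant tables as they appear in the second dump in this thread.
DUMP = """
CREATE TABLE config (
    id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    key VARCHAR(255) NOT NULL,
    value TEXT,
    UNIQUE (key)
);
INSERT INTO config VALUES(3,'cluster.https_address','192.168.1.1:8443');
INSERT INTO config VALUES(4,'core.https_address','192.168.1.1:8443');
CREATE TABLE raft_nodes (
    id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    address TEXT NOT NULL,
    UNIQUE (address)
);
INSERT INTO raft_nodes VALUES(1,'[::]:8443');
"""


def mismatched_raft_nodes(db: sqlite3.Connection) -> list:
    """List (id, address) rows that differ from cluster.https_address."""
    return db.execute(
        "SELECT id, address FROM raft_nodes WHERE address NOT IN "
        "(SELECT value FROM config WHERE key = 'cluster.https_address') "
        "ORDER BY id"
    ).fetchall()


def align_single_node(db: sqlite3.Connection) -> int:
    """Rewrite the one stale raft_nodes row; return how many rows changed."""
    (cluster_address,) = db.execute(
        "SELECT value FROM config WHERE key = 'cluster.https_address'"
    ).fetchone()
    stale = mismatched_raft_nodes(db)
    if len(stale) > 1:
        raise RuntimeError("more than one raft node; not a single-node setup")
    for node_id, _old_address in stale:
        db.execute(
            "UPDATE raft_nodes SET address = ? WHERE id = ?",
            (cluster_address, node_id),
        )
    db.commit()
    return len(stale)


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.executescript(DUMP)
    print("before:", mismatched_raft_nodes(db))
    align_single_node(db)
    print("after:", db.execute("SELECT id, address FROM raft_nodes").fetchall())
```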

19wolf commented 5 years ago

Slightly different error now

$ sudo lxd --debug --group lxd
DBUG[01-25|13:04:30] Connecting to a local LXD over a Unix socket 
DBUG[01-25|13:04:30] Sending request to LXD                   method=GET url=http://unix.socket/1.0 etag=
INFO[01-25|13:04:30] LXD 3.9 is starting in normal mode       path=/var/snap/lxd/common/lxd
INFO[01-25|13:04:30] Kernel uid/gid map: 
INFO[01-25|13:04:30]  - u 0 0 4294967295 
INFO[01-25|13:04:30]  - g 0 0 4294967295 
INFO[01-25|13:04:30] Configured LXD uid/gid map: 
INFO[01-25|13:04:30]  - u 0 1000000 1000000000 
INFO[01-25|13:04:30]  - g 0 1000000 1000000000 
WARN[01-25|13:04:30] CGroup memory swap accounting is disabled, swap limits will be ignored. 
INFO[01-25|13:04:30] Kernel features: 
INFO[01-25|13:04:30]  - netnsid-based network retrieval: no 
INFO[01-25|13:04:30]  - uevent injection: yes 
INFO[01-25|13:04:30]  - unprivileged file capabilities: yes 
INFO[01-25|13:04:30] Initializing local database 
DBUG[01-25|13:04:30] Initializing database gateway 
DBUG[01-25|13:04:30] Start database node                      id=1 address=192.168.1.1:8443
DBUG[01-25|13:04:30] Raft: Restored from snapshot 24-1860501-1547537347352 
DBUG[01-25|13:04:30] Raft: Initial configuration (index=1): [{Suffrage:Voter ID:1 Address:0}] 
DBUG[01-25|13:04:30] Raft: Node at 192.168.1.1:8443 [Follower] entering Follower state (Leader: "") 
DBUG[01-25|13:04:30] Dqlite: starting event loop 
DBUG[01-25|13:04:30] Dqlite: accepting connections 
DBUG[01-25|13:04:30] Connecting to a local LXD over a Unix socket 
DBUG[01-25|13:04:30] Sending request to LXD                   method=GET url=http://unix.socket/1.0 etag=
DBUG[01-25|13:04:30] Detected stale unix socket, deleting 
DBUG[01-25|13:04:30] Detected stale unix socket, deleting 
INFO[01-25|13:04:30] Starting /dev/lxd handler: 
INFO[01-25|13:04:30]  - binding devlxd socket                 socket=/var/snap/lxd/common/lxd/devlxd/sock
INFO[01-25|13:04:30] REST API daemon: 
INFO[01-25|13:04:30]  - binding Unix socket                   socket=/var/snap/lxd/common/lxd/unix.socket
INFO[01-25|13:04:30]  - binding TCP socket                    socket=192.168.1.1:8443
INFO[01-25|13:04:30] Initializing global database 
DBUG[01-25|13:04:30] Found cert                               k=0
DBUG[01-25|13:04:30] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=192.168.1.1:8443 attempt=0 
DBUG[01-25|13:04:30] Dqlite: connection failed err=no available dqlite leader server found attempt=0 
DBUG[01-25|13:04:30] Found cert                               k=0
DBUG[01-25|13:04:30] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=192.168.1.1:8443 attempt=1 
DBUG[01-25|13:04:30] Dqlite: connection failed err=no available dqlite leader server found attempt=1 
DBUG[01-25|13:04:30] Found cert                               k=0
DBUG[01-25|13:04:30] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=192.168.1.1:8443 attempt=2 
DBUG[01-25|13:04:30] Dqlite: connection failed err=no available dqlite leader server found attempt=2 
DBUG[01-25|13:04:31] Found cert                               k=0
DBUG[01-25|13:04:31] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=192.168.1.1:8443 attempt=3 
DBUG[01-25|13:04:31] Dqlite: connection failed err=no available dqlite leader server found attempt=3 
DBUG[01-25|13:04:32] Found cert                               k=0
DBUG[01-25|13:04:32] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=192.168.1.1:8443 attempt=4 
DBUG[01-25|13:04:32] Dqlite: connection failed err=no available dqlite leader server found attempt=4 
DBUG[01-25|13:04:33] Found cert                               k=0
DBUG[01-25|13:04:33] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=192.168.1.1:8443 attempt=5 
DBUG[01-25|13:04:33] Dqlite: connection failed err=no available dqlite leader server found attempt=5 
DBUG[01-25|13:04:34] Found cert                               k=0
DBUG[01-25|13:04:34] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=192.168.1.1:8443 attempt=6 
DBUG[01-25|13:04:34] Dqlite: connection failed err=no available dqlite leader server found attempt=6 
DBUG[01-25|13:04:35] Found cert                               k=0
DBUG[01-25|13:04:35] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=192.168.1.1:8443 attempt=7 
DBUG[01-25|13:04:35] Dqlite: connection failed err=no available dqlite leader server found attempt=7 
WARN[01-25|13:04:36] Raft: Heartbeat timeout from "" reached, starting election 
DBUG[01-25|13:04:36] Raft: Node at 192.168.1.1:8443 [Candidate] entering Candidate state in term 40 
DBUG[01-25|13:04:36] Raft: Votes needed: 1 
DBUG[01-25|13:04:36] Raft: Vote granted from 1 in term 40. Tally: 1 
DBUG[01-25|13:04:36] Raft: Election won. Tally: 1 
DBUG[01-25|13:04:36] Raft: Node at 192.168.1.1:8443 [Leader] entering Leader state 
DBUG[01-25|13:04:36] Found cert                               k=0
DBUG[01-25|13:04:36] Found cert                               k=0
DBUG[01-25|13:04:36] Dqlite: handling new connection (fd=40) 
DBUG[01-25|13:04:36] Dqlite: connected address=192.168.1.1:8443 attempt=8 
DBUG[01-25|13:04:36] Database error: failed to update node version info: updated 0 rows instead of 1 
EROR[01-25|13:04:36] Failed to start the daemon: failed to open cluster database: failed to ensure schema: failed to update node version info: updated 0 rows instead of 1 
INFO[01-25|13:04:36] Starting shutdown sequence 
INFO[01-25|13:04:36] Stopping REST API handler: 
INFO[01-25|13:04:36]  - closing socket                        socket=192.168.1.1:8443
INFO[01-25|13:04:36]  - closing socket                        socket=/var/snap/lxd/common/lxd/unix.socket
INFO[01-25|13:04:36] Stopping /dev/lxd handler: 
INFO[01-25|13:04:36]  - closing socket                        socket=/var/snap/lxd/common/lxd/devlxd/sock
DBUG[01-25|13:04:36] Stop database gateway 
DBUG[01-25|13:04:36] Stop raft instance 
DBUG[01-25|13:04:36] Dqlite: stopping event loop 
EROR[01-25|13:04:36] Dqlite: aborting (fd=40 state=header msg=(null)) 
DBUG[01-25|13:04:36] Dqlite: event loop stopped 
DBUG[01-25|13:04:36] Not unmounting temporary filesystems (containers are still running) 
INFO[01-25|13:04:36] Saving simplestreams cache 
INFO[01-25|13:04:36] Saved simplestreams cache 
Error: failed to open cluster database: failed to ensure schema: failed to update node version info: updated 0 rows instead of 1
stgraber commented 5 years ago

Ok, so you're now hitting an issue because of an address mismatch in the global database's nodes table.
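
To make that failure mode concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. It assumes, as the error text suggests, that the daemon updates its own row by matching on its configured address and expects exactly one row to change. The table is a trimmed-down, hypothetical stand-in for the real nodes table, and the stale address and schema numbers are example values.

```python
import sqlite3

# Hypothetical, trimmed-down stand-in for the global database's nodes table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " address TEXT NOT NULL UNIQUE,"
    " schema INTEGER NOT NULL)"
)
# The stored row still carries an old address (example value).
conn.execute(
    "INSERT INTO nodes (name, address, schema) VALUES ('Wolf', '[::]:8443', 12)"
)

# The address the daemon is configured with, as seen in the debug log.
configured_address = "192.168.1.1:8443"

# Assumed shape of the version update: keyed on the node's own address.
cur = conn.execute(
    "UPDATE nodes SET schema = ? WHERE address = ?",
    (13, configured_address),
)
updated_rows = cur.rowcount

# No row matches the configured address, so nothing is updated.
if updated_rows != 1:
    message = f"updated {updated_rows} rows instead of 1"
else:
    message = "ok"
print(message)
```

Because the WHERE clause matches nothing, the statement itself succeeds but touches no rows, which is why this shows up as a row-count complaint rather than an SQL error.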

stgraber commented 5 years ago

That's slightly trickier to fix because we can't touch that one with sqlite3 directly; we'll need to sideload a .sql file that LXD runs at startup.

stgraber commented 5 years ago
19wolf commented 5 years ago

Looks like there's still a database error

$ echo "UPDATE nodes SET address='192.168.1.1:8443';" | sudo tee /var/snap/lxd/common/lxd/database/patch.global.sql
UPDATE nodes SET address='192.168.1.1:8443';
[13:09:34] nephele@nephele:~$ sudo lxd --debug --group lxd
INFO[01-25|13:09:40] LXD 3.9 is starting in normal mode       path=/var/snap/lxd/common/lxd
INFO[01-25|13:09:40] Kernel uid/gid map: 
INFO[01-25|13:09:40]  - u 0 0 4294967295 
INFO[01-25|13:09:40]  - g 0 0 4294967295 
INFO[01-25|13:09:40] Configured LXD uid/gid map: 
INFO[01-25|13:09:40]  - u 0 1000000 1000000000 
INFO[01-25|13:09:40]  - g 0 1000000 1000000000 
WARN[01-25|13:09:40] CGroup memory swap accounting is disabled, swap limits will be ignored. 
INFO[01-25|13:09:40] Kernel features: 
INFO[01-25|13:09:40]  - netnsid-based network retrieval: no 
INFO[01-25|13:09:40]  - uevent injection: yes 
INFO[01-25|13:09:40]  - unprivileged file capabilities: yes 
INFO[01-25|13:09:40] Initializing local database 
DBUG[01-25|13:09:40] Initializing database gateway 
DBUG[01-25|13:09:40] Start database node                      id=1 address=192.168.1.1:8443
DBUG[01-25|13:09:40] Raft: Restored from snapshot 24-1860501-1547537347352 
DBUG[01-25|13:09:40] Raft: Initial configuration (index=1): [{Suffrage:Voter ID:1 Address:0}] 
DBUG[01-25|13:09:40] Raft: Node at 192.168.1.1:8443 [Follower] entering Follower state (Leader: "") 
DBUG[01-25|13:09:40] Dqlite: starting event loop 
DBUG[01-25|13:09:40] Dqlite: accepting connections 
INFO[01-25|13:09:40] Starting /dev/lxd handler: 
INFO[01-25|13:09:40]  - binding devlxd socket                 socket=/var/snap/lxd/common/lxd/devlxd/sock
INFO[01-25|13:09:40] REST API daemon: 
INFO[01-25|13:09:40]  - binding Unix socket                   socket=/var/snap/lxd/common/lxd/unix.socket
INFO[01-25|13:09:40]  - binding TCP socket                    socket=192.168.1.1:8443
INFO[01-25|13:09:40] Initializing global database 
DBUG[01-25|13:09:40] Found cert                               k=0
DBUG[01-25|13:09:40] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=192.168.1.1:8443 attempt=0 
DBUG[01-25|13:09:40] Dqlite: connection failed err=no available dqlite leader server found attempt=0 
DBUG[01-25|13:09:40] Found cert                               k=0
DBUG[01-25|13:09:40] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=192.168.1.1:8443 attempt=1 
DBUG[01-25|13:09:40] Dqlite: connection failed err=no available dqlite leader server found attempt=1 
DBUG[01-25|13:09:40] Found cert                               k=0
DBUG[01-25|13:09:40] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=192.168.1.1:8443 attempt=2 
DBUG[01-25|13:09:40] Dqlite: connection failed err=no available dqlite leader server found attempt=2 
DBUG[01-25|13:09:41] Found cert                               k=0
DBUG[01-25|13:09:41] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=192.168.1.1:8443 attempt=3 
DBUG[01-25|13:09:41] Dqlite: connection failed err=no available dqlite leader server found attempt=3 
DBUG[01-25|13:09:42] Found cert                               k=0
DBUG[01-25|13:09:42] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=192.168.1.1:8443 attempt=4 
DBUG[01-25|13:09:42] Dqlite: connection failed err=no available dqlite leader server found attempt=4 
DBUG[01-25|13:09:43] Found cert                               k=0
DBUG[01-25|13:09:43] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=192.168.1.1:8443 attempt=5 
DBUG[01-25|13:09:43] Dqlite: connection failed err=no available dqlite leader server found attempt=5 
WARN[01-25|13:09:44] Raft: Heartbeat timeout from "" reached, starting election 
DBUG[01-25|13:09:44] Raft: Node at 192.168.1.1:8443 [Candidate] entering Candidate state in term 55 
DBUG[01-25|13:09:44] Raft: Votes needed: 1 
DBUG[01-25|13:09:44] Raft: Vote granted from 1 in term 55. Tally: 1 
DBUG[01-25|13:09:44] Raft: Election won. Tally: 1 
DBUG[01-25|13:09:44] Raft: Node at 192.168.1.1:8443 [Leader] entering Leader state 
DBUG[01-25|13:09:44] Found cert                               k=0
DBUG[01-25|13:09:44] Found cert                               k=0
DBUG[01-25|13:09:44] Dqlite: handling new connection (fd=36) 
DBUG[01-25|13:09:44] Dqlite: connected address=192.168.1.1:8443 attempt=6 
DBUG[01-25|13:09:44] Running pre-update queries from file for global DB schema 
DBUG[01-25|13:09:44] Database error: failed to execute queries from /var/snap/lxd/common/lxd/database/patch.global.sql: UNIQUE constraint failed: nodes.address 
EROR[01-25|13:09:44] Failed to start the daemon: failed to open cluster database: failed to ensure schema: failed to execute queries from /var/snap/lxd/common/lxd/database/patch.global.sql: UNIQUE constraint failed: nodes.address 
INFO[01-25|13:09:44] Starting shutdown sequence 
INFO[01-25|13:09:44] Stopping REST API handler: 
INFO[01-25|13:09:44]  - closing socket                        socket=192.168.1.1:8443
INFO[01-25|13:09:44]  - closing socket                        socket=/var/snap/lxd/common/lxd/unix.socket
INFO[01-25|13:09:44] Stopping /dev/lxd handler: 
INFO[01-25|13:09:44]  - closing socket                        socket=/var/snap/lxd/common/lxd/devlxd/sock
DBUG[01-25|13:09:44] Stop database gateway 
DBUG[01-25|13:09:44] Stop raft instance 
DBUG[01-25|13:09:44] Dqlite: stopping event loop 
EROR[01-25|13:09:44] Dqlite: aborting (fd=36 state=header msg=(null)) 
DBUG[01-25|13:09:44] Dqlite: event loop stopped 
DBUG[01-25|13:09:44] Not unmounting temporary filesystems (containers are still running) 
INFO[01-25|13:09:44] Saving simplestreams cache 
INFO[01-25|13:09:44] Saved simplestreams cache 
Error: failed to open cluster database: failed to ensure schema: failed to execute queries from /var/snap/lxd/common/lxd/database/patch.global.sql: UNIQUE constraint failed: nodes.address
stgraber commented 5 years ago

Oh, hmm, I didn't expect there to be more than one row in there, let me look at your dump again.
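
The UNIQUE failure can be reproduced outside LXD. This is a minimal sketch in Python with the standard-library sqlite3 module, assuming a two-row table: the table layout is a hypothetical, trimmed-down version of nodes, and the first row's stale address is an example value. Only the UPDATE statement is taken from the patch file above.

```python
import sqlite3

# Hypothetical two-row stand-in for the global nodes table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " address TEXT NOT NULL,"
    " UNIQUE (address))"
)
conn.executemany(
    "INSERT INTO nodes (id, name, address) VALUES (?, ?, ?)",
    [(1, "Wolf", "[::]:8443"), (4, "Talos", "192.168.1.5:8443")],
)
conn.commit()

try:
    # Same statement as the first patch file: with no WHERE clause it
    # tries to give every row the same address.
    conn.execute("UPDATE nodes SET address='192.168.1.1:8443'")
    error = None
except sqlite3.IntegrityError as exc:
    error = str(exc)

# SQLite backs out the failed statement, so the rows are unchanged.
addresses = [row[0] for row in conn.execute("SELECT address FROM nodes ORDER BY id")]
print(error)
print(addresses)
```

With a single row the same statement would have gone through, which matches the expectation that there was only one entry.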

19wolf commented 5 years ago

I'm not all too familiar with SQL, but it doesn't look like there is one?

[13:11:08] nephele@nephele:~$ sudo sqlite3 /var/snap/lxd/common/lxd/database/local.db .dump
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE schema (
    id         INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    version    INTEGER NOT NULL,
    updated_at DATETIME NOT NULL,
    UNIQUE (version)
);
INSERT INTO schema VALUES(1,37,1529720435);
INSERT INTO schema VALUES(2,38,1544680613);
CREATE TABLE config (
    id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    key VARCHAR(255) NOT NULL,
    value TEXT,
    UNIQUE (key)
);
INSERT INTO config VALUES(3,'cluster.https_address','192.168.1.1:8443');
INSERT INTO config VALUES(4,'core.https_address','192.168.1.1:8443');
CREATE TABLE patches (
    id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    name VARCHAR(255) NOT NULL,
    applied_at DATETIME NOT NULL,
    UNIQUE (name)
);
INSERT INTO patches VALUES(1,'invalid_profile_names',1529720435);
INSERT INTO patches VALUES(2,'leftover_profile_config',1529720435);
INSERT INTO patches VALUES(3,'network_permissions',1529720435);
INSERT INTO patches VALUES(4,'storage_api',1529720435);
INSERT INTO patches VALUES(5,'storage_api_v1',1529720435);
INSERT INTO patches VALUES(6,'storage_api_dir_cleanup',1529720435);
INSERT INTO patches VALUES(7,'storage_api_lvm_keys',1529720435);
INSERT INTO patches VALUES(8,'storage_api_keys',1529720435);
INSERT INTO patches VALUES(9,'storage_api_update_storage_configs',1529720435);
INSERT INTO patches VALUES(10,'storage_api_lxd_on_btrfs',1529720435);
INSERT INTO patches VALUES(11,'storage_api_lvm_detect_lv_size',1529720435);
INSERT INTO patches VALUES(12,'storage_api_insert_zfs_driver',1529720435);
INSERT INTO patches VALUES(13,'storage_zfs_noauto',1529720435);
INSERT INTO patches VALUES(14,'storage_zfs_volume_size',1529720435);
INSERT INTO patches VALUES(15,'network_dnsmasq_hosts',1529720435);
INSERT INTO patches VALUES(16,'storage_api_dir_bind_mount',1529720435);
INSERT INTO patches VALUES(17,'fix_uploaded_at',1529720435);
INSERT INTO patches VALUES(18,'storage_api_ceph_size_remove',1529720435);
INSERT INTO patches VALUES(19,'devices_new_naming_scheme',1529720435);
INSERT INTO patches VALUES(20,'storage_api_permissions',1529720435);
INSERT INTO patches VALUES(21,'container_config_regen',1531453517);
INSERT INTO patches VALUES(22,'lvm_node_specific_config_keys',1532726207);
INSERT INTO patches VALUES(23,'candid_rename_config_key',1534281880);
INSERT INTO patches VALUES(24,'move_backups',1536801812);
INSERT INTO patches VALUES(25,'storage_api_rename_container_snapshots_dir',1539293314);
INSERT INTO patches VALUES(26,'shrink_logs_db_file',1541499067);
INSERT INTO patches VALUES(27,'storage_api_rename_container_snapshots_links',1541747171);
CREATE TABLE raft_nodes (
    id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    address TEXT NOT NULL,
    UNIQUE (address)
);
INSERT INTO raft_nodes VALUES(1,'192.168.1.1:8443');
DELETE FROM sqlite_sequence;
INSERT INTO sqlite_sequence VALUES('schema',2);
INSERT INTO sqlite_sequence VALUES('patches',27);
INSERT INTO sqlite_sequence VALUES('config',4);
INSERT INTO sqlite_sequence VALUES('raft_nodes',4);
COMMIT;
[13:12:03] nephele@nephele:~$ 
stgraber commented 5 years ago

Still not sure why you've got more than one entry in there, but hopefully we can worry about that later :)

stgraber commented 5 years ago

(And yeah, there's a single entry in local.db but apparently two in global.db. Not sure what's up with that yet; hopefully that second patch file will get it to start up, and then we can query it directly.)
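
A patch file like this can be dry-run against a scratch table before LXD picks it up. The sketch below is Python with the standard-library sqlite3 module. The two-row table is a hypothetical stand-in for the real nodes table, the `[::]:8443` stale address is an assumption about what the old row contains, and `dry_run` is an illustrative helper, not part of LXD. A temporary directory stands in for the real database directory.

```python
import sqlite3
import tempfile
from pathlib import Path

# A WHERE-scoped update, so only the row with the stale address is touched.
PATCH_SQL = "UPDATE nodes SET address='192.168.1.1:8443' WHERE address='[::]:8443';\n"


def dry_run(patch_sql):
    """Apply patch_sql to a scratch nodes table and return (name, address) rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        "CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT NOT NULL,"
        " address TEXT NOT NULL, UNIQUE (address));"
        "INSERT INTO nodes VALUES (1, 'Wolf', '[::]:8443');"
        "INSERT INTO nodes VALUES (4, 'Talos', '192.168.1.5:8443');"
    )
    # Raises sqlite3.IntegrityError if the patch would create duplicate addresses.
    conn.executescript(patch_sql)
    return conn.execute("SELECT name, address FROM nodes ORDER BY id").fetchall()


rows = dry_run(PATCH_SQL)

# Write the file only after the dry run succeeds. The temp dir is a stand-in
# for /var/snap/lxd/common/lxd/database/ in this sketch.
with tempfile.TemporaryDirectory() as scratch_dir:
    patch_path = Path(scratch_dir) / "patch.global.sql"
    patch_path.write_text(PATCH_SQL)
    written = patch_path.read_text()

print(rows)
```

The dry run only proves the statement is consistent with the assumed rows; the real table can still differ, so the daemon log remains the final check.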

19wolf commented 5 years ago

It looks like it's working, successful heartbeats and whatnot, but lxc list is still complaining

edit: it complained because I stopped lxd before sending lxc list, oops.

[13:12:03] nephele@nephele:~$ echo "UPDATE nodes SET address='192.168.1.1:8443' WHERE address='[::]:8443';" | sudo tee /var/snap/lxd/common/lxd/database/patch.global.sql
UPDATE nodes SET address='192.168.1.1:8443' WHERE address='[::]:8443';
[13:13:27] nephele@nephele:~$ sudo lxd --debug --group lxd
INFO[01-25|13:13:32] LXD 3.9 is starting in normal mode       path=/var/snap/lxd/common/lxd
INFO[01-25|13:13:32] Kernel uid/gid map: 
INFO[01-25|13:13:32]  - u 0 0 4294967295 
INFO[01-25|13:13:32]  - g 0 0 4294967295 
INFO[01-25|13:13:32] Configured LXD uid/gid map: 
INFO[01-25|13:13:32]  - u 0 1000000 1000000000 
INFO[01-25|13:13:32]  - g 0 1000000 1000000000 
WARN[01-25|13:13:32] CGroup memory swap accounting is disabled, swap limits will be ignored. 
INFO[01-25|13:13:32] Kernel features: 
INFO[01-25|13:13:32]  - netnsid-based network retrieval: no 
INFO[01-25|13:13:32]  - uevent injection: yes 
INFO[01-25|13:13:32]  - unprivileged file capabilities: yes 
INFO[01-25|13:13:32] Initializing local database 
DBUG[01-25|13:13:32] Initializing database gateway 
DBUG[01-25|13:13:32] Start database node                      id=1 address=192.168.1.1:8443
DBUG[01-25|13:13:32] Raft: Restored from snapshot 24-1860501-1547537347352 
DBUG[01-25|13:13:32] Raft: Initial configuration (index=1): [{Suffrage:Voter ID:1 Address:0}] 
DBUG[01-25|13:13:32] Raft: Node at 192.168.1.1:8443 [Follower] entering Follower state (Leader: "") 
DBUG[01-25|13:13:32] Dqlite: starting event loop 
DBUG[01-25|13:13:32] Dqlite: accepting connections 
INFO[01-25|13:13:32] Starting /dev/lxd handler: 
INFO[01-25|13:13:32]  - binding devlxd socket                 socket=/var/snap/lxd/common/lxd/devlxd/sock
INFO[01-25|13:13:32] REST API daemon: 
INFO[01-25|13:13:32]  - binding Unix socket                   socket=/var/snap/lxd/common/lxd/unix.socket
INFO[01-25|13:13:32]  - binding TCP socket                    socket=192.168.1.1:8443
INFO[01-25|13:13:32] Initializing global database 
DBUG[01-25|13:13:32] Found cert                               k=0
DBUG[01-25|13:13:32] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=192.168.1.1:8443 attempt=0 
DBUG[01-25|13:13:32] Dqlite: connection failed err=no available dqlite leader server found attempt=0 
DBUG[01-25|13:13:33] Found cert                               k=0
DBUG[01-25|13:13:33] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=192.168.1.1:8443 attempt=1 
DBUG[01-25|13:13:33] Dqlite: connection failed err=no available dqlite leader server found attempt=1 
DBUG[01-25|13:13:33] Found cert                               k=0
DBUG[01-25|13:13:33] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=192.168.1.1:8443 attempt=2 
DBUG[01-25|13:13:33] Dqlite: connection failed err=no available dqlite leader server found attempt=2 
DBUG[01-25|13:13:33] Found cert                               k=0
DBUG[01-25|13:13:33] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=192.168.1.1:8443 attempt=3 
DBUG[01-25|13:13:33] Dqlite: connection failed err=no available dqlite leader server found attempt=3 
DBUG[01-25|13:13:34] Found cert                               k=0
DBUG[01-25|13:13:34] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=192.168.1.1:8443 attempt=4 
DBUG[01-25|13:13:34] Dqlite: connection failed err=no available dqlite leader server found attempt=4 
DBUG[01-25|13:13:35] Found cert                               k=0
DBUG[01-25|13:13:35] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=192.168.1.1:8443 attempt=5 
DBUG[01-25|13:13:35] Dqlite: connection failed err=no available dqlite leader server found attempt=5 
DBUG[01-25|13:13:36] Found cert                               k=0
DBUG[01-25|13:13:36] Dqlite: server connection failed err=failed to establish network connection: 503 Service Unavailable address=192.168.1.1:8443 attempt=6 
DBUG[01-25|13:13:36] Dqlite: connection failed err=no available dqlite leader server found attempt=6 
WARN[01-25|13:13:37] Raft: Heartbeat timeout from "" reached, starting election 
DBUG[01-25|13:13:37] Raft: Node at 192.168.1.1:8443 [Candidate] entering Candidate state in term 57 
DBUG[01-25|13:13:37] Raft: Votes needed: 1 
DBUG[01-25|13:13:37] Raft: Vote granted from 1 in term 57. Tally: 1 
DBUG[01-25|13:13:37] Raft: Election won. Tally: 1 
DBUG[01-25|13:13:37] Raft: Node at 192.168.1.1:8443 [Leader] entering Leader state 
DBUG[01-25|13:13:37] Found cert                               k=0
DBUG[01-25|13:13:37] Found cert                               k=0
DBUG[01-25|13:13:37] Dqlite: handling new connection (fd=38) 
DBUG[01-25|13:13:37] Dqlite: connected address=192.168.1.1:8443 attempt=7 
DBUG[01-25|13:13:37] Running pre-update queries from file for global DB schema 
INFO[01-25|13:13:37] Initializing storage pools 
DBUG[01-25|13:13:37] Initializing and checking storage pool "default" 
DBUG[01-25|13:13:37] Checking BTRFS storage pool "default" 
INFO[01-25|13:13:37] Initializing networks 
DBUG[01-25|13:13:38] New task operation: 97683799-e4dc-4f25-8ec8-ca661e1f8e9d 
INFO[01-25|13:13:38] Pruning leftover image files 
DBUG[01-25|13:13:38] Started task operation: 97683799-e4dc-4f25-8ec8-ca661e1f8e9d 
INFO[01-25|13:13:38] Done pruning leftover image files 
INFO[01-25|13:13:38] Loading daemon configuration 
DBUG[01-25|13:13:38] Success for task operation: 97683799-e4dc-4f25-8ec8-ca661e1f8e9d 
DBUG[01-25|13:13:38] Initialized inotify with file descriptor 39 
DBUG[01-25|13:13:38] Starting heartbeat round 
DBUG[01-25|13:13:38] Heartbeat updating local raft nodes to [{ID:1 Address:192.168.1.1:8443}] 
DBUG[01-25|13:13:38] Cluster node is up-to-date 
DBUG[01-25|13:13:38] New task operation: 8d7ec910-7aba-4b94-9faf-3006e5fbbaed 
INFO[01-25|13:13:38] Pruning expired images 
DBUG[01-25|13:13:38] Started task operation: 8d7ec910-7aba-4b94-9faf-3006e5fbbaed 
INFO[01-25|13:13:38] Done pruning expired images 
DBUG[01-25|13:13:38] New task operation: b011a539-9a4b-4f78-85da-ac983d0b776d 
INFO[01-25|13:13:38] Pruning expired container backups 
DBUG[01-25|13:13:38] Started task operation: b011a539-9a4b-4f78-85da-ac983d0b776d 
INFO[01-25|13:13:38] Done pruning expired container backups 
DBUG[01-25|13:13:38] Successful heartbeat for 192.168.1.1:8443 
DBUG[01-25|13:13:38] New task operation: 4a8034cb-ef22-4cdf-9ecf-0c7d61647401 
INFO[01-25|13:13:38] Expiring log files 
DBUG[01-25|13:13:38] Started task operation: 4a8034cb-ef22-4cdf-9ecf-0c7d61647401 
INFO[01-25|13:13:38] Done expiring log files 
DBUG[01-25|13:13:38] Success for task operation: 4a8034cb-ef22-4cdf-9ecf-0c7d61647401 
DBUG[01-25|13:13:38] Success for task operation: b011a539-9a4b-4f78-85da-ac983d0b776d 
DBUG[01-25|13:13:38] Completed heartbeat round 
DBUG[01-25|13:13:38] New task operation: cfbd3fed-b1e6-455e-ac62-747f7821ea0a 
INFO[01-25|13:13:38] Updating images 
DBUG[01-25|13:13:38] Started task operation: cfbd3fed-b1e6-455e-ac62-747f7821ea0a 
INFO[01-25|13:13:38] Done updating images 
DBUG[01-25|13:13:38] New task operation: 979a859e-5b63-4085-81ec-8a3ac35f1aa3 
INFO[01-25|13:13:38] Updating instance types 
DBUG[01-25|13:13:38] Started task operation: 979a859e-5b63-4085-81ec-8a3ac35f1aa3 
INFO[01-25|13:13:38] Done updating instance types 
DBUG[01-25|13:13:38] Success for task operation: 8d7ec910-7aba-4b94-9faf-3006e5fbbaed 
DBUG[01-25|13:13:38] Database error: &errors.errorString{s:"sql: no rows in result set"} 
DBUG[01-25|13:13:38] Database error: &errors.errorString{s:"sql: no rows in result set"} 
DBUG[01-25|13:13:38] Database error: &errors.errorString{s:"sql: no rows in result set"} 
DBUG[01-25|13:13:38] Database error: &errors.errorString{s:"sql: no rows in result set"} 
DBUG[01-25|13:13:38] Database error: &errors.errorString{s:"sql: no rows in result set"} 
DBUG[01-25|13:13:38] Database error: &errors.errorString{s:"sql: no rows in result set"} 
DBUG[01-25|13:13:38] Success for task operation: cfbd3fed-b1e6-455e-ac62-747f7821ea0a 
DBUG[01-25|13:13:38] Mounting BTRFS storage volume for container "PiHole" on storage pool "default" 
DBUG[01-25|13:13:38] Mounting BTRFS storage pool "default" 
DBUG[01-25|13:13:38] Mounted BTRFS storage volume for container "PiHole" on storage pool "default" 
DBUG[01-25|13:13:38] Mounting BTRFS storage volume for container "PiHole" on storage pool "default" 
DBUG[01-25|13:13:38] Mounting BTRFS storage pool "default" 
DBUG[01-25|13:13:38] Mounted BTRFS storage volume for container "PiHole" on storage pool "default" 
INFO[01-25|13:13:38] Starting container                       used=2019-01-24T20:07:51-0500 stateful=false project=default name=PiHole action=start created=2018-08-06T09:16:43-0400 ephemeral=false
DBUG[01-25|13:13:38] handling                                 method=GET url=/1.0 ip=@
DBUG[01-25|13:13:38] handling                                 method=GET url=/internal/containers/41/onstart ip=@
DBUG[01-25|13:13:38] Mounting BTRFS storage volume for container "PiHole" on storage pool "default" 
DBUG[01-25|13:13:38] Mounting BTRFS storage pool "default" 
DBUG[01-25|13:13:38] Mounted BTRFS storage volume for container "PiHole" on storage pool "default" 
DBUG[01-25|13:13:38] Scheduler: container PiHole started: re-balancing 
INFO[01-25|13:13:38] Started container                        created=2018-08-06T09:16:43-0400 ephemeral=false used=2019-01-24T20:07:51-0500 stateful=false project=default name=PiHole action=start
DBUG[01-25|13:13:38] Scheduler: network: veth35KVMU has been added: updating network priorities 
DBUG[01-25|13:13:38] Scheduler: network: vethILHRPJ has been added: updating network priorities 
DBUG[01-25|13:13:39] Mounting BTRFS storage volume for container "Proxy" on storage pool "default" 
DBUG[01-25|13:13:39] Mounting BTRFS storage pool "default" 
DBUG[01-25|13:13:39] Mounted BTRFS storage volume for container "Proxy" on storage pool "default" 
DBUG[01-25|13:13:39] Mounting BTRFS storage volume for container "Proxy" on storage pool "default" 
DBUG[01-25|13:13:39] Mounting BTRFS storage pool "default" 
DBUG[01-25|13:13:39] Mounted BTRFS storage volume for container "Proxy" on storage pool "default" 
INFO[01-25|13:13:39] Starting container                       project=default name=Proxy action=start created=2018-08-05T20:26:30-0400 ephemeral=false used=2019-01-24T20:07:51-0500 stateful=false
DBUG[01-25|13:13:39] handling                                 method=GET url=/1.0 ip=@
DBUG[01-25|13:13:39] handling                                 method=GET url=/internal/containers/37/onstart ip=@
DBUG[01-25|13:13:39] Mounting BTRFS storage volume for container "Proxy" on storage pool "default" 
DBUG[01-25|13:13:39] Mounting BTRFS storage pool "default" 
DBUG[01-25|13:13:39] Mounted BTRFS storage volume for container "Proxy" on storage pool "default" 
DBUG[01-25|13:13:39] Scheduler: container Proxy started: re-balancing 
INFO[01-25|13:13:39] Started container                        used=2019-01-24T20:07:51-0500 stateful=false project=default name=Proxy action=start created=2018-08-05T20:26:30-0400 ephemeral=false
DBUG[01-25|13:13:39] Scheduler: network: vethLLO8WH has been added: updating network priorities 
DBUG[01-25|13:13:39] Scheduler: network: vethQT8S69 has been added: updating network priorities 
DBUG[01-25|13:13:41] Success for task operation: 979a859e-5b63-4085-81ec-8a3ac35f1aa3 
DBUG[01-25|13:13:42] Starting heartbeat round 
DBUG[01-25|13:13:42] Heartbeat updating local raft nodes to [{ID:1 Address:192.168.1.1:8443}] 
DBUG[01-25|13:13:42] Successful heartbeat for 192.168.1.1:8443 
DBUG[01-25|13:13:42] Completed heartbeat round 
DBUG[01-25|13:13:46] Starting heartbeat round 
DBUG[01-25|13:13:46] Heartbeat updating local raft nodes to [{ID:1 Address:192.168.1.1:8443}] 
DBUG[01-25|13:13:46] Successful heartbeat for 192.168.1.1:8443 
DBUG[01-25|13:13:46] Completed heartbeat round 
DBUG[01-25|13:13:50] Starting heartbeat round 
DBUG[01-25|13:13:50] Heartbeat updating local raft nodes to [{ID:1 Address:192.168.1.1:8443}] 
DBUG[01-25|13:13:50] Successful heartbeat for 192.168.1.1:8443 
DBUG[01-25|13:13:50] Completed heartbeat round 
DBUG[01-25|13:13:54] Starting heartbeat round 
DBUG[01-25|13:13:54] Heartbeat updating local raft nodes to [{ID:1 Address:192.168.1.1:8443}] 
DBUG[01-25|13:13:54] Successful heartbeat for 192.168.1.1:8443 
DBUG[01-25|13:13:55] Completed heartbeat round 
DBUG[01-25|13:13:59] Starting heartbeat round 
DBUG[01-25|13:13:59] Heartbeat updating local raft nodes to [{ID:1 Address:192.168.1.1:8443}] 
DBUG[01-25|13:13:59] Successful heartbeat for 192.168.1.1:8443 
DBUG[01-25|13:13:59] Completed heartbeat round 
DBUG[01-25|13:14:03] Starting heartbeat round 
DBUG[01-25|13:14:03] Heartbeat updating local raft nodes to [{ID:1 Address:192.168.1.1:8443}] 
DBUG[01-25|13:14:03] Successful heartbeat for 192.168.1.1:8443 
DBUG[01-25|13:14:03] Completed heartbeat round 
DBUG[01-25|13:14:07] Starting heartbeat round 
DBUG[01-25|13:14:07] Heartbeat updating local raft nodes to [{ID:1 Address:192.168.1.1:8443}] 
DBUG[01-25|13:14:07] Successful heartbeat for 192.168.1.1:8443 
DBUG[01-25|13:14:07] Completed heartbeat round 
DBUG[01-25|13:14:11] Starting heartbeat round 
DBUG[01-25|13:14:11] Heartbeat updating local raft nodes to [{ID:1 Address:192.168.1.1:8443}] 
DBUG[01-25|13:14:11] Successful heartbeat for 192.168.1.1:8443 
DBUG[01-25|13:14:11] Completed heartbeat round 
DBUG[01-25|13:14:15] Starting heartbeat round 
DBUG[01-25|13:14:15] Heartbeat updating local raft nodes to [{ID:1 Address:192.168.1.1:8443}] 
DBUG[01-25|13:14:15] Successful heartbeat for 192.168.1.1:8443 
DBUG[01-25|13:14:15] Completed heartbeat round 
DBUG[01-25|13:14:19] Starting heartbeat round 
DBUG[01-25|13:14:19] Heartbeat updating local raft nodes to [{ID:1 Address:192.168.1.1:8443}] 
DBUG[01-25|13:14:19] Successful heartbeat for 192.168.1.1:8443 
DBUG[01-25|13:14:19] Completed heartbeat round 
DBUG[01-25|13:14:23] Starting heartbeat round 
DBUG[01-25|13:14:23] Heartbeat updating local raft nodes to [{ID:1 Address:192.168.1.1:8443}] 
DBUG[01-25|13:14:23] Successful heartbeat for 192.168.1.1:8443 
DBUG[01-25|13:14:23] Completed heartbeat round 
DBUG[01-25|13:14:27] Starting heartbeat round 
DBUG[01-25|13:14:27] Heartbeat updating local raft nodes to [{ID:1 Address:192.168.1.1:8443}] 
DBUG[01-25|13:14:27] Successful heartbeat for 192.168.1.1:8443 
DBUG[01-25|13:14:27] Completed heartbeat round 
DBUG[01-25|13:14:31] Starting heartbeat round 
DBUG[01-25|13:14:31] Heartbeat updating local raft nodes to [{ID:1 Address:192.168.1.1:8443}] 
DBUG[01-25|13:14:31] Successful heartbeat for 192.168.1.1:8443 
DBUG[01-25|13:14:31] Completed heartbeat round 
DBUG[01-25|13:14:35] Starting heartbeat round 
DBUG[01-25|13:14:35] Heartbeat updating local raft nodes to [{ID:1 Address:192.168.1.1:8443}] 
DBUG[01-25|13:14:35] Successful heartbeat for 192.168.1.1:8443 
DBUG[01-25|13:14:35] Completed heartbeat round 
DBUG[01-25|13:14:39] Starting heartbeat round 
DBUG[01-25|13:14:39] Heartbeat updating local raft nodes to [{ID:1 Address:192.168.1.1:8443}] 
DBUG[01-25|13:14:39] Successful heartbeat for 192.168.1.1:8443 
DBUG[01-25|13:14:39] Completed heartbeat round 

^CINFO[01-25|13:14:41] Received 'interrupt signal', exiting 
INFO[01-25|13:14:41] Starting shutdown sequence 
INFO[01-25|13:14:41] Stopping REST API handler: 
INFO[01-25|13:14:41]  - closing socket                        socket=192.168.1.1:8443
INFO[01-25|13:14:41]  - closing socket                        socket=/var/snap/lxd/common/lxd/unix.socket
INFO[01-25|13:14:41] Stopping /dev/lxd handler: 
INFO[01-25|13:14:41]  - closing socket                        socket=/var/snap/lxd/common/lxd/devlxd/sock
INFO[01-25|13:14:41] Closing the database 
DBUG[01-25|13:14:41] Dqlite: closing client 
DBUG[01-25|13:14:41] Stop database gateway 
DBUG[01-25|13:14:41] Stop raft instance 
DBUG[01-25|13:14:41] Dqlite: stopping event loop 
EROR[01-25|13:14:41] Dqlite: aborting (fd=38 state=header msg=(null)) 
DBUG[01-25|13:14:41] Dqlite: event loop stopped 
DBUG[01-25|13:14:41] Not unmounting temporary filesystems (containers are still running) 
INFO[01-25|13:14:41] Saving simplestreams cache 
INFO[01-25|13:14:41] Saved simplestreams cache 
[13:14:41] nephele@nephele:~$ lxc list
Error: Get http://unix.socket/1.0: dial unix /var/snap/lxd/common/lxd/unix.socket: connect: no such file or directory
19wolf commented 5 years ago

Everything seems to be working as expected now. Are there any other logs/debug I can provide?

stgraber commented 5 years ago

Can you show lxc cluster list and lxd sql global "SELECT * FROM nodes;" just to see what's up with that table?

stgraber commented 5 years ago

@freeekanayaka any idea how @19wolf could have ended up with such a backwards cluster.https_address and core.https_address?

19wolf commented 5 years ago

Talos is the new computer I just built; I was attempting to get it to join the Nephele cluster (or Wolf, I guess, since that's what I named it). I don't 100% remember, but I'm pretty sure I never touched Nephele after starting up Talos for the first time and attempting to join the cluster.

[13:25:06] nephele@nephele:~$ lxc cluster list
+------+--------------------------+----------+--------+-------------------+
| NAME |           URL            | DATABASE | STATE  |      MESSAGE      |
+------+--------------------------+----------+--------+-------------------+
| Wolf | https://192.168.1.1:8443 | YES      | ONLINE | fully operational |
+------+--------------------------+----------+--------+-------------------+
[13:25:09] nephele@nephele:~$ lxd sql global "SELECT * FROM nodes;"
+----+-------+-------------+------------------+--------+----------------+-------------------------------------+---------+
| id | name  | description |     address      | schema | api_extensions |              heartbeat              | pending |
+----+-------+-------------+------------------+--------+----------------+-------------------------------------+---------+
| 1  | Wolf  |             | 192.168.1.1:8443 | 13     | 115            | 2019-01-25T13:25:17.499609341-05:00 | 0       |
| 4  | Talos |             | 192.168.1.5:8443 | 13     | 115            | 2019-01-11T21:12:27-05:00           | 1       |
+----+-------+-------------+------------------+--------+----------------+-------------------------------------+---------+
stgraber commented 5 years ago

Ok, I think you should drop that one from the database before it causes you more problems. Then you can reset it and have it join the cluster again, hopefully successfully this time.

lxd sql global "DELETE FROM nodes WHERE name='Talos';" should do the trick.
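
For anyone who wants to see what that statement does before running it, here is a minimal sketch in Python. It does not touch LXD at all: it rebuilds the two rows from the `SELECT * FROM nodes;` output above in an in-memory SQLite table, lists the members still flagged as pending, and then runs the same DELETE. The column types and the helper names (`make_nodes_db`, `pending_nodes`, `drop_node`) are assumptions made for illustration; the real table lives in LXD's global database and is reached through `lxd sql global`.

```python
import sqlite3


def make_nodes_db():
    """Build an in-memory stand-in for the nodes table shown above.

    Column names come from the query output in this thread; the column
    types are assumed for the sake of the example.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE nodes (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            description TEXT,
            address TEXT NOT NULL,
            schema INTEGER NOT NULL,
            api_extensions INTEGER NOT NULL,
            heartbeat TEXT NOT NULL,
            pending INTEGER NOT NULL
        )
        """
    )
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        [
            (1, "Wolf", "", "192.168.1.1:8443", 13, 115,
             "2019-01-25T13:25:17.499609341-05:00", 0),
            (4, "Talos", "", "192.168.1.5:8443", 13, 115,
             "2019-01-11T21:12:27-05:00", 1),
        ],
    )
    conn.commit()
    return conn


def pending_nodes(conn):
    """Return the names of members whose join never completed (pending = 1)."""
    rows = conn.execute(
        "SELECT name FROM nodes WHERE pending = 1 ORDER BY id"
    ).fetchall()
    return [name for (name,) in rows]


def drop_node(conn, name):
    """Delete one member row by name and return how many rows were removed."""
    cursor = conn.execute("DELETE FROM nodes WHERE name = ?", (name,))
    conn.commit()
    return cursor.rowcount


if __name__ == "__main__":
    db = make_nodes_db()
    print("pending before:", pending_nodes(db))
    print("rows deleted:", drop_node(db, "Talos"))
    print("pending after:", pending_nodes(db))
```

Checking the pending column first is just a way to confirm that the row being removed is the half-joined member and not the working one.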

19wolf commented 5 years ago

Alright, thank you so much! If you're ever in Rochester, NY, I'll buy you dinner :)

Routhinator commented 4 years ago

Ok, I just hit this problem @stgraber and thankfully the first steps you gave led me to run

sudo sqlite3 /var/snap/lxd/common/lxd/database/local.db "UPDATE config SET value='192.168.52.4:8443' WHERE key='core.https_address';"

And my issue was fixed.

The question is how are we getting here?

I got here by:

sudo snap install lxd
lxd init <-configured for cluster, primary node. 
sudo snap install juju
juju deploy charmed-kubernetes
juju deploy nfs <- doing this requires AppArmor changes to allow lxc containers to mount nfs. I did those, and restarted AppArmor
sudo reboot <-could not get apparmor permissions to apply
routhinator@andromeda:~$ lxc list
Error: Get http://unix.socket/1.0: dial unix /var/snap/lxd/common/lxd/unix.socket: connect: connection refused
routhinator@andromeda:~$ sudo systemctl status snap.lxd.daemon
● snap.lxd.daemon.service - Service for snap application lxd.daemon
   Loaded: loaded (/etc/systemd/system/snap.lxd.daemon.service; static; vendor preset: enabled)
   Active: failed (Result: exit-code) since Wed 2019-10-09 00:40:17 UTC; 48s ago
  Process: 3229 ExecStart=/usr/bin/snap run lxd.daemon (code=exited, status=1/FAILURE)
 Main PID: 3229 (code=exited, status=1/FAILURE)

Oct 09 00:40:17 andromeda systemd[1]: snap.lxd.daemon.service: Service hold-off time over, scheduling restart.
Oct 09 00:40:17 andromeda systemd[1]: snap.lxd.daemon.service: Scheduled restart job, restart counter is at 10.
Oct 09 00:40:17 andromeda systemd[1]: Stopped Service for snap application lxd.daemon.
Oct 09 00:40:17 andromeda systemd[1]: snap.lxd.daemon.service: Start request repeated too quickly.
Oct 09 00:40:17 andromeda systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Oct 09 00:40:17 andromeda systemd[1]: Failed to start Service for snap application lxd.daemon.
Oct 09 00:40:17 andromeda systemd[1]: snap.lxd.daemon.service: Start request repeated too quickly.
Oct 09 00:40:17 andromeda systemd[1]: snap.lxd.daemon.service: Failed with result 'exit-code'.
Oct 09 00:40:17 andromeda systemd[1]: Failed to start Service for snap application lxd.daemon.

routhinator@andromeda:~$ sudo lsof -i:8443
routhinator@andromeda:~$ netstat -a
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 localhost:domain        0.0.0.0:*               LISTEN     
tcp        0      0 0.0.0.0:ssh             0.0.0.0:*               LISTEN     
tcp        0      0 0.0.0.0:sunrpc          0.0.0.0:*               LISTEN     
tcp        0      0 andromeda.home.rout:ssh chris-desktop.hom:42314 ESTABLISHED
tcp        0      0 andromeda.home.rout:ssh 222.186.175.161:gds-db  ESTABLISHED
tcp        1      1 andromeda.home.rout:ssh 49.88.112.80:49367      LAST_ACK   
tcp        0      0 andromeda.home.rout:994 ursamajor.home.rout:nfs ESTABLISHED
tcp        0      0 andromeda.home.rout:ssh chris-desktop.hom:42358 ESTABLISHED
tcp        0     69 andromeda.home.rout:ssh 27.115.115.218:38570    FIN_WAIT1  
tcp6       0      0 [::]:ssh                [::]:*                  LISTEN     
tcp6       0      0 [::]:sunrpc             [::]:*                  LISTEN     
udp        0      0 localhost:domain        0.0.0.0:*                          
udp        0      0 andromeda.home.r:bootpc 0.0.0.0:*                          
udp        0      0 0.0.0.0:sunrpc          0.0.0.0:*                          
udp        0      0 0.0.0.0:934             0.0.0.0:*                          
udp6       0      0 [::]:sunrpc             [::]:*                             
udp6       0      0 [::]:934                [::]:*                             
raw6       0      0 [::]:ipv6-icmp          [::]:*                  7          
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags       Type       State         I-Node   Path
unix  2      [ ]         DGRAM                    29194    /run/user/1002/systemd/notify
unix  2      [ ACC ]     SEQPACKET  LISTENING     15984    /run/udev/control
unix  2      [ ACC ]     STREAM     LISTENING     29197    /run/user/1002/systemd/private
unix  2      [ ACC ]     STREAM     LISTENING     29202    /run/user/1002/gnupg/S.gpg-agent
unix  2      [ ACC ]     STREAM     LISTENING     29203    /run/user/1002/gnupg/S.gpg-agent.browser
unix  2      [ ACC ]     STREAM     LISTENING     29204    /run/user/1002/gnupg/S.gpg-agent.extra
unix  2      [ ACC ]     STREAM     LISTENING     29205    /run/user/1002/gnupg/S.gpg-agent.ssh
unix  2      [ ACC ]     STREAM     LISTENING     29206    /run/user/1002/gnupg/S.dirmngr
unix  2      [ ACC ]     STREAM     LISTENING     31147    /var/run/fail2ban/fail2ban.sock
unix  2      [ ACC ]     STREAM     LISTENING     26956    @irqbalance1134.sock
unix  3      [ ]         DGRAM                    15953    /run/systemd/notify
unix  2      [ ACC ]     STREAM     LISTENING     15956    /run/systemd/private
unix  2      [ ]         DGRAM                    15967    /run/systemd/journal/syslog
unix  2      [ ACC ]     STREAM     LISTENING     15976    /run/lvm/lvmpolld.socket
unix  2      [ ACC ]     STREAM     LISTENING     15978    /run/rpcbind.sock
unix  8      [ ]         DGRAM                    15980    /run/systemd/journal/dev-log
unix  2      [ ACC ]     STREAM     LISTENING     15982    /run/lvm/lvmetad.socket
unix  2      [ ACC ]     STREAM     LISTENING     15986    /run/systemd/journal/stdout
unix  9      [ ]         DGRAM                    15988    /run/systemd/journal/socket
unix  2      [ ACC ]     STREAM     LISTENING     26680    @ISCSIADM_ABSTRACT_NAMESPACE
unix  2      [ ACC ]     STREAM     LISTENING     25908    /run/snapd.socket
unix  2      [ ACC ]     STREAM     LISTENING     25910    /run/snapd-snap.socket
unix  2      [ ACC ]     STREAM     LISTENING     26681    /run/acpid.socket
unix  2      [ ACC ]     STREAM     LISTENING     25912    /var/run/dbus/system_bus_socket
unix  2      [ ACC ]     STREAM     LISTENING     26683    /run/uuidd/request
unix  2      [ ACC ]     STREAM     LISTENING     26940    /var/run/docker-userns.sock
unix  3      [ ]         DGRAM                    15954    
unix  3      [ ]         STREAM     CONNECTED     28571    
unix  3      [ ]         STREAM     CONNECTED     27070    /run/systemd/journal/stdout
unix  3      [ ]         STREAM     CONNECTED     26606    /run/systemd/journal/stdout
unix  3      [ ]         STREAM     CONNECTED     33239    
unix  3      [ ]         STREAM     CONNECTED     26917    
unix  2      [ ]         DGRAM                    28574    
unix  3      [ ]         STREAM     CONNECTED     27083    /run/systemd/journal/stdout
unix  3      [ ]         DGRAM                    22366    
unix  3      [ ]         DGRAM                    15955    
unix  3      [ ]         STREAM     CONNECTED     29381    /run/systemd/journal/stdout
unix  3      [ ]         STREAM     CONNECTED     27084    /run/systemd/journal/stdout
unix  3      [ ]         STREAM     CONNECTED     28176    
unix  3      [ ]         STREAM     CONNECTED     28867    /var/run/dbus/system_bus_socket
unix  3      [ ]         STREAM     CONNECTED     28582    
unix  3      [ ]         STREAM     CONNECTED     27591    
unix  3      [ ]         STREAM     CONNECTED     31374    /var/run/dbus/system_bus_socket
unix  3      [ ]         DGRAM                    17315    
unix  3      [ ]         STREAM     CONNECTED     27082    /run/systemd/journal/stdout
unix  3      [ ]         STREAM     CONNECTED     28264    
unix  3      [ ]         STREAM     CONNECTED     28097    
unix  2      [ ]         DGRAM                    17302    
unix  3      [ ]         STREAM     CONNECTED     27086    /run/systemd/journal/stdout
unix  3      [ ]         STREAM     CONNECTED     19247    /run/systemd/journal/stdout
unix  2      [ ]         DGRAM                    1991     
unix  2      [ ]         DGRAM                    31284    
unix  3      [ ]         STREAM     CONNECTED     27085    /run/systemd/journal/stdout
unix  3      [ ]         DGRAM                    17316    
unix  3      [ ]         STREAM     CONNECTED     25178    
unix  3      [ ]         STREAM     CONNECTED     26761    /var/run/dbus/system_bus_socket
unix  2      [ ]         DGRAM                    25812    
unix  3      [ ]         STREAM     CONNECTED     17300    
unix  3      [ ]         STREAM     CONNECTED     28700    
unix  3      [ ]         STREAM     CONNECTED     28415    
unix  3      [ ]         STREAM     CONNECTED     31544    
unix  3      [ ]         STREAM     CONNECTED     26921    
unix  3      [ ]         DGRAM                    26939    
unix  3      [ ]         STREAM     CONNECTED     31093    
unix  2      [ ]         DGRAM                    26966    
unix  2      [ ]         DGRAM                    31287    
unix  3      [ ]         STREAM     CONNECTED     22444    
unix  3      [ ]         STREAM     CONNECTED     22138    /run/systemd/journal/stdout
unix  2      [ ]         DGRAM                    38718    
unix  3      [ ]         STREAM     CONNECTED     30871    
unix  3      [ ]         STREAM     CONNECTED     26922    /var/run/dbus/system_bus_socket
unix  3      [ ]         STREAM     CONNECTED     25420    
unix  3      [ ]         STREAM     CONNECTED     27489    /run/systemd/journal/stdout
unix  3      [ ]         STREAM     CONNECTED     22122    
unix  3      [ ]         STREAM     CONNECTED     31545    
unix  3      [ ]         DGRAM                    26938    
unix  3      [ ]         DGRAM                    22367    
unix  3      [ ]         STREAM     CONNECTED     31094    /var/run/dbus/system_bus_socket
unix  3      [ ]         STREAM     CONNECTED     30397    /run/systemd/journal/stdout
unix  3      [ ]         STREAM     CONNECTED     45256    
unix  3      [ ]         DGRAM                    22368    
unix  3      [ ]         STREAM     CONNECTED     45257    
unix  3      [ ]         STREAM     CONNECTED     25939    
unix  3      [ ]         STREAM     CONNECTED     25803    /run/systemd/journal/stdout
unix  3      [ ]         STREAM     CONNECTED     30619    /var/run/dbus/system_bus_socket
unix  3      [ ]         STREAM     CONNECTED     26763    /var/run/dbus/system_bus_socket
unix  3      [ ]         STREAM     CONNECTED     26759    
unix  3      [ ]         STREAM     CONNECTED     30524    
unix  2      [ ]         DGRAM                    22359    
unix  3      [ ]         DGRAM                    19348    
unix  3      [ ]         STREAM     CONNECTED     25544    
unix  2      [ ]         DGRAM                    27366    
unix  3      [ ]         DGRAM                    29195    
unix  3      [ ]         STREAM     CONNECTED     27483    /run/systemd/journal/stdout
unix  3      [ ]         STREAM     CONNECTED     24337    
unix  3      [ ]         STREAM     CONNECTED     26762    /var/run/dbus/system_bus_socket
unix  3      [ ]         DGRAM                    19441    
unix  3      [ ]         STREAM     CONNECTED     26699    /run/systemd/journal/stdout
unix  3      [ ]         STREAM     CONNECTED     27551    
unix  3      [ ]         STREAM     CONNECTED     27071    /run/systemd/journal/stdout
unix  3      [ ]         STREAM     CONNECTED     25764    /run/systemd/journal/stdout
unix  3      [ ]         STREAM     CONNECTED     22137    /run/systemd/journal/stdout
unix  3      [ ]         STREAM     CONNECTED     40765    
unix  2      [ ]         DGRAM                    19437    
unix  3      [ ]         STREAM     CONNECTED     16093    
unix  3      [ ]         STREAM     CONNECTED     27484    /run/systemd/journal/stdout
unix  3      [ ]         STREAM     CONNECTED     31092    
unix  3      [ ]         STREAM     CONNECTED     26691    /run/systemd/journal/stdout
unix  3      [ ]         DGRAM                    19442    
unix  3      [ ]         STREAM     CONNECTED     24156    
unix  2      [ ]         DGRAM                    27479    
unix  3      [ ]         STREAM     CONNECTED     26613    /run/systemd/journal/stdout
unix  3      [ ]         STREAM     CONNECTED     26760    
unix  3      [ ]         STREAM     CONNECTED     24426    
unix  3      [ ]         DGRAM                    19347    
unix  3      [ ]         STREAM     CONNECTED     24695    
unix  3      [ ]         DGRAM                    29196    
unix  3      [ ]         DGRAM                    22365    
unix  3      [ ]         STREAM     CONNECTED     19243    /run/systemd/journal/stdout
unix  3      [ ]         STREAM     CONNECTED     27104    /var/run/dbus/system_bus_socket
unix  3      [ ]         STREAM     CONNECTED     27217    
unix  3      [ ]         DGRAM                    19439    
unix  2      [ ]         DGRAM                    41994    
unix  2      [ ]         DGRAM                    26758    
unix  3      [ ]         STREAM     CONNECTED     31267    
unix  3      [ ]         STREAM     CONNECTED     40766    
unix  3      [ ]         DGRAM                    19440    

routhinator@andromeda:~$ sudo sqlite3 /var/snap/lxd/common/lxd/database/local.db .dump
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE schema (
    id         INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    version    INTEGER NOT NULL,
    updated_at DATETIME NOT NULL,
    UNIQUE (version)
);
INSERT INTO schema VALUES(1,38,1570577524);
CREATE TABLE config (
    id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    key VARCHAR(255) NOT NULL,
    value TEXT,
    UNIQUE (key)
);
INSERT INTO config VALUES(2,'cluster.https_address','192.168.52.4:8443');
INSERT INTO config VALUES(3,'core.https_address','[::]');
CREATE TABLE patches (
    id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    name VARCHAR(255) NOT NULL,
    applied_at DATETIME NOT NULL,
    UNIQUE (name)
);
INSERT INTO patches VALUES(1,'shrink_logs_db_file',1570577524);
INSERT INTO patches VALUES(2,'invalid_profile_names',1570577524);
INSERT INTO patches VALUES(3,'leftover_profile_config',1570577524);
INSERT INTO patches VALUES(4,'network_permissions',1570577524);
INSERT INTO patches VALUES(5,'storage_api',1570577524);
INSERT INTO patches VALUES(6,'storage_api_v1',1570577524);
INSERT INTO patches VALUES(7,'storage_api_dir_cleanup',1570577524);
INSERT INTO patches VALUES(8,'storage_api_lvm_keys',1570577524);
INSERT INTO patches VALUES(9,'storage_api_keys',1570577524);
INSERT INTO patches VALUES(10,'storage_api_update_storage_configs',1570577524);
INSERT INTO patches VALUES(11,'storage_api_lxd_on_btrfs',1570577524);
INSERT INTO patches VALUES(12,'storage_api_lvm_detect_lv_size',1570577524);
INSERT INTO patches VALUES(13,'storage_api_insert_zfs_driver',1570577524);
INSERT INTO patches VALUES(14,'storage_zfs_noauto',1570577524);
INSERT INTO patches VALUES(15,'storage_zfs_volume_size',1570577524);
INSERT INTO patches VALUES(16,'network_dnsmasq_hosts',1570577524);
INSERT INTO patches VALUES(17,'storage_api_dir_bind_mount',1570577524);
INSERT INTO patches VALUES(18,'fix_uploaded_at',1570577525);
INSERT INTO patches VALUES(19,'storage_api_ceph_size_remove',1570577525);
INSERT INTO patches VALUES(20,'devices_new_naming_scheme',1570577525);
INSERT INTO patches VALUES(21,'storage_api_permissions',1570577525);
INSERT INTO patches VALUES(22,'container_config_regen',1570577525);
INSERT INTO patches VALUES(23,'lvm_node_specific_config_keys',1570577525);
INSERT INTO patches VALUES(24,'candid_rename_config_key',1570577525);
INSERT INTO patches VALUES(25,'move_backups',1570577525);
INSERT INTO patches VALUES(26,'storage_api_rename_container_snapshots_dir',1570577525);
INSERT INTO patches VALUES(27,'storage_api_rename_container_snapshots_links',1570577525);
INSERT INTO patches VALUES(28,'fix_lvm_pool_volume_names',1570577525);
INSERT INTO patches VALUES(29,'storage_api_rename_container_snapshots_dir_again',1570577525);
INSERT INTO patches VALUES(30,'storage_api_rename_container_snapshots_links_again',1570577525);
INSERT INTO patches VALUES(31,'storage_api_rename_container_snapshots_dir_again_again',1570577525);
INSERT INTO patches VALUES(32,'clustering_add_roles',1570577525);
CREATE TABLE raft_nodes (
    id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    address TEXT NOT NULL,
    UNIQUE (address)
);
INSERT INTO raft_nodes VALUES(1,'192.168.52.4:8443');
DELETE FROM sqlite_sequence;
INSERT INTO sqlite_sequence VALUES('schema',1);
INSERT INTO sqlite_sequence VALUES('patches',32);
INSERT INTO sqlite_sequence VALUES('config',3);
INSERT INTO sqlite_sequence VALUES('raft_nodes',1);
COMMIT;
routhinator@andromeda:~$ sudo sqlite3 /var/snap/lxd/common/lxd/database/local.db "UPDATE config SET value='192.168.52.4:8443' WHERE key='core.https_address';"

That's my whole story in commands. Seems like we have a bug here.
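
The mismatch in the dump above can be illustrated with a small Python sketch. It recreates the config table from the dump in an in-memory SQLite database (it never opens the real local.db), compares core.https_address with cluster.https_address, and applies the same kind of UPDATE used in the command above. The helper names (`make_local_db`, `get_config`, `addresses_match`, `align_core_address`) are made up for this example, and treating "the two values are equal" as the healthy state is an assumption based only on what fixed things in this thread.

```python
import sqlite3


def make_local_db():
    """Recreate the config table and its two rows from the dump above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE config (
            id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
            key VARCHAR(255) NOT NULL,
            value TEXT,
            UNIQUE (key)
        )
        """
    )
    conn.executemany(
        "INSERT INTO config VALUES (?, ?, ?)",
        [
            (2, "cluster.https_address", "192.168.52.4:8443"),
            (3, "core.https_address", "[::]"),
        ],
    )
    conn.commit()
    return conn


def get_config(conn, key):
    """Return the value stored for a config key, or None if it is absent."""
    row = conn.execute(
        "SELECT value FROM config WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row else None


def addresses_match(conn):
    """True when core.https_address equals cluster.https_address."""
    return get_config(conn, "core.https_address") == get_config(
        conn, "cluster.https_address"
    )


def align_core_address(conn):
    """Copy cluster.https_address into core.https_address, as in the fix above."""
    cluster = get_config(conn, "cluster.https_address")
    if cluster is None:
        raise ValueError("cluster.https_address is not set")
    conn.execute(
        "UPDATE config SET value = ? WHERE key = 'core.https_address'",
        (cluster,),
    )
    conn.commit()
    return cluster


if __name__ == "__main__":
    db = make_local_db()
    print("match before:", addresses_match(db))
    print("core.https_address set to:", align_core_address(db))
    print("match after:", addresses_match(db))
```

Reading both keys before writing makes it easy to confirm that the node is in the same state as the dump shows before anything is changed.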

For what it's worth, here's my preseed yaml:

routhinator@andromeda:~$ cat lxd-preseed.yaml 
config:
  core.https_address: 192.168.52.4:8443
  core.trust_password: <censored>
networks: []
storage_pools:
- config: {}
  description: ""
  name: local
  driver: dir
profiles:
- config: {}
  description: ""
  devices:
    eth0:
      name: eth0
      nictype: bridged
      parent: br0
      type: nic
    root:
      path: /
      pool: local
      type: disk
  name: default
cluster:
  server_name: andromeda
  enabled: true
  member_config: []
  cluster_address: ""
  cluster_certificate: ""
  server_address: ""
  cluster_password: ""
routhinator@andromeda:~$ lxc --version
3.18

All that, and I still can't get RPC Pipefs to mount in a container, but that's a different issue.

adjenks commented 1 year ago

I got this error message. Checking the status with systemctl, I saw that the lxd service was not running. After running sudo journalctl -u snap.lxd.daemon.service, I noticed a log entry that said something about not being able to create a database '42' because the existing version, '43', was newer. I had recently downgraded to lxd 4.0 because 5.0 was causing issues with TLS, so there were probably remnants of version 5.0. I uninstalled everything lxd- and lxc-related from snap and dnf, reinstalled it all, and now it works.