Closed: omani closed this issue 1 year ago
$ lxc ls
+--------+---------+------------------------------+------+-----------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+--------+---------+------------------------------+------+-----------+-----------+
| gala01 | RUNNING | 172.18.0.1 (br-14fa7068c6d1) | | CONTAINER | 0 |
| | | 172.17.0.1 (docker0) | | | |
| | | 10.226.7.89 (eth0) | | | |
+--------+---------+------------------------------+------+-----------+-----------+
| gala02 | RUNNING | 172.18.0.1 (br-14fa7068c6d1) | | CONTAINER | 0 |
| | | 172.17.0.1 (docker0) | | | |
| | | 10.226.7.75 (eth0) | | | |
+--------+---------+------------------------------+------+-----------+-----------+
| gala03 | RUNNING | 172.18.0.1 (br-14fa7068c6d1) | | CONTAINER | 0 |
| | | 172.17.0.1 (docker0) | | | |
| | | 10.226.7.138 (eth0) | | | |
+--------+---------+------------------------------+------+-----------+-----------+
| gala04 | RUNNING | 172.18.0.1 (br-14fa7068c6d1) | | CONTAINER | 0 |
| | | 172.17.0.1 (docker0) | | | |
| | | 10.226.7.102 (eth0) | | | |
+--------+---------+------------------------------+------+-----------+-----------+
| gala05 | RUNNING | 172.18.0.1 (br-14fa7068c6d1) | | CONTAINER | 0 |
| | | 172.17.0.1 (docker0) | | | |
| | | 10.226.7.221 (eth0) | | | |
+--------+---------+------------------------------+------+-----------+-----------+
| gala06 | RUNNING | 172.18.0.1 (br-14fa7068c6d1) | | CONTAINER | 0 |
| | | 172.17.0.1 (docker0) | | | |
| | | 10.226.7.87 (eth0) | | | |
+--------+---------+------------------------------+------+-----------+-----------+
| gala07 | RUNNING | 172.18.0.1 (br-14fa7068c6d1) | | CONTAINER | 0 |
| | | 172.17.0.1 (docker0) | | | |
| | | 10.226.7.151 (eth0) | | | |
+--------+---------+------------------------------+------+-----------+-----------+
| gala08 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------------------------------+------+-----------+-----------+
$
$ lxc image ls
+-----------+--------------+--------+-------------------------------------+--------------+-----------+-----------+------------------------------+
| ALIAS | FINGERPRINT | PUBLIC | DESCRIPTION | ARCHITECTURE | TYPE | SIZE | UPLOAD DATE |
+-----------+--------------+--------+-------------------------------------+--------------+-----------+-----------+------------------------------+
| gala-node | 778e1e41211d | yes | Ubuntu jammy amd64 (20221127_07:42) | x86_64 | CONTAINER | 1243.23MB | Nov 28, 2022 at 6:13pm (UTC) |
+-----------+--------------+--------+-------------------------------------+--------------+-----------+-----------+------------------------------+
| | 9b2c8b8a1870 | no | Alpine 3.16 amd64 (20221127_13:00) | x86_64 | CONTAINER | 2.50MB | Nov 28, 2022 at 4:06pm (UTC) |
+-----------+--------------+--------+-------------------------------------+--------------+-----------+-----------+------------------------------+
| | 309874cb3bac | no | Ubuntu jammy amd64 (20221127_07:42) | x86_64 | CONTAINER | 114.77MB | Nov 28, 2022 at 3:18pm (UTC) |
+-----------+--------------+--------+-------------------------------------+--------------+-----------+-----------+------------------------------+
$
$ lxc launch gala-node test
Creating test
Retrieving image: Unpack: 33% (6.46MB/s) <- I pasted this too, so you know it is unpacking.
Starting test
$
$ lxc ls test
+------+---------+------+------+-----------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+------+---------+------+------+-----------+-----------+
| test | STOPPED | | | CONTAINER | 0 |
+------+---------+------+------+-----------+-----------+
$
$ lxc info --show-log test
Name: test
Status: STOPPED
Type: container
Architecture: x86_64
Created: 2022/11/29 00:00 +03
Last Used: 2022/11/29 00:03 +03
Log:
lxc gala_test 20221128210301.852 WARN cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_setup_limits_legacy:3147 - Invalid argument - Ignoring legacy cgroup limits on pure cgroup2 system
lxc gala_test 20221128210301.954 WARN cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_setup_limits_legacy:3147 - Invalid argument - Ignoring legacy cgroup limits on pure cgroup2 system
lxc 20221128210302.471 ERROR af_unix - ../src/lxc/af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20221128210302.471 ERROR commands - ../src/lxc/commands.c:lxc_cmd_rsp_recv_fds:128 - Failed to receive file descriptors for command "get_state"
^ These warnings are normal; they also appear on the other, running nodes.
I would expect some logs, or some error message explaining why the container could not start.
Again, let me start this container:
$ lxc start test
$ lxc ls test
+------+---------+------+------+-----------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+------+---------+------+------+-----------+-----------+
| test | STOPPED | | | CONTAINER | 0 |
+------+---------+------+------+-----------+-----------+
Nothing. Since lxc start test
does not output anything, I expect it to have worked (Unix philosophy: don't output anything on success).
But this is confusing.
$ lxc version
Client version: 5.5
Server version: 5.5
Version 5.5 isn't supported. Please can you confirm this still occurs with LXD 5.8, and we can reopen. Thanks
Also please look for any errors in dmesg and syslog that may indicate the problem. You may be hitting a sysctl limit.
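The checks suggested above could look roughly like this. This is a sketch: the log file paths vary by distro, and the sysctl values shown are common suspects rather than a confirmed diagnosis.

```shell
# Recent kernel messages (dmesg may be restricted to root on some systems):
dmesg 2>/dev/null | tail -n 20
# System log locations differ (Debian/Ubuntu vs Alpine/busybox syslog):
grep -i lxc /var/log/syslog /var/log/messages 2>/dev/null | tail -n 20
# Limits that containers commonly exhaust without an obvious error:
cat /proc/sys/fs/inotify/max_user_instances
cat /proc/sys/fs/file-nr   # allocated / unused / maximum file handles
```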
Version 5.5 isn't supported. Please can you confirm this still occurs with LXD 5.8, and we can reopen. Thanks
5.8 is not available on Alpine Linux yet. I'm on Alpine 3.17, and only the 5.0.1 (lxd) and 5.5 (lxd-feature) packages are available.
Not even edge or testing has it. I'll have to wait, or maybe I will build it from source.
OK. We generally ask that support issues (especially those to do with specific environments) be posted at https://discuss.linuxcontainers.org/ first, and we can always promote to GitHub if it is a bug.
Please can you post there. Also, Alpine carries LXD 5.0.1 LTS, so you could try that too, as it would give another data point. That version is also supported.
I've built LXD release 5.8 from source on Alpine Linux 3.17.
I encounter the same problem:
$ lxc ls
+--------+---------+------------------------------+------+-----------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+--------+---------+------------------------------+------+-----------+-----------+
| gala01 | RUNNING | 172.18.0.1 (br-fee7b61d8331) | | CONTAINER | 0 |
| | | 172.17.0.1 (docker0) | | | |
| | | 10.245.2.243 (eth0) | | | |
+--------+---------+------------------------------+------+-----------+-----------+
| gala02 | RUNNING | 10.245.2.196 (eth0) | | CONTAINER | 0 |
+--------+---------+------------------------------+------+-----------+-----------+
| gala03 | RUNNING | 10.245.2.158 (eth0) | | CONTAINER | 0 |
+--------+---------+------------------------------+------+-----------+-----------+
| gala04 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------------------------------+------+-----------+-----------+
| gala05 | RUNNING | 10.245.2.37 (eth0) | | CONTAINER | 0 |
+--------+---------+------------------------------+------+-----------+-----------+
| gala06 | RUNNING | 10.245.2.240 (eth0) | | CONTAINER | 0 |
+--------+---------+------------------------------+------+-----------+-----------+
| gala07 | RUNNING | 10.245.2.45 (eth0) | | CONTAINER | 0 |
+--------+---------+------------------------------+------+-----------+-----------+
| gala08 | RUNNING | 10.245.2.244 (eth0) | | CONTAINER | 0 |
+--------+---------+------------------------------+------+-----------+-----------+
| gala09 | RUNNING | 10.245.2.67 (eth0) | | CONTAINER | 0 |
+--------+---------+------------------------------+------+-----------+-----------+
| gala10 | RUNNING | 10.245.2.27 (eth0) | | CONTAINER | 0 |
+--------+---------+------------------------------+------+-----------+-----------+
| gala11 | RUNNING | 10.245.2.10 (eth0) | | CONTAINER | 0 |
+--------+---------+------------------------------+------+-----------+-----------+
| gala12 | RUNNING | 10.245.2.249 (eth0) | | CONTAINER | 0 |
+--------+---------+------------------------------+------+-----------+-----------+
| gala13 | RUNNING | 10.245.2.247 (eth0) | | CONTAINER | 0 |
+--------+---------+------------------------------+------+-----------+-----------+
| gala14 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------------------------------+------+-----------+-----------+
| gala15 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------------------------------+------+-----------+-----------+
| gala16 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------------------------------+------+-----------+-----------+
| gala17 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------------------------------+------+-----------+-----------+
| gala18 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------------------------------+------+-----------+-----------+
| gala19 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------------------------------+------+-----------+-----------+
| gala20 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------------------------------+------+-----------+-----------+
gala01 has been deployed with Docker etc., hence the extra IPs, but all the other nodes are pure Ubuntu nodes. Nothing has been done on them; they were just launched and nothing else.
I've started the nodes with this simple for loop:
for i in `seq -w 2 20`; do lxc launch images:ubuntu/22.04 gala$i; done
Many nodes are just in the STOPPED state. When I start e.g. gala20, nothing happens:
$ lxc start gala20
$ lxc ls gala20
+--------+---------+------+------+-----------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+--------+---------+------+------+-----------+-----------+
| gala20 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------+------+-----------+-----------+
lxc start gala20
gives the impression that starting the node was successful, but an immediate lxc ls
of that node shows that the state is still STOPPED.
The log for that node shows this:
# tail -f /var/log/lxd/gala_gala20/lxc.log
lxc gala_gala20 20221129005309.578 WARN cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_setup_limits_legacy:3147 - Invalid argument - Ignoring legacy cgroup limits on pure cgroup2 system
lxc gala_gala20 20221129005309.186 WARN cgfsng - ../src/lxc/cgroups/cgfsng.c:cgfsng_setup_limits_legacy:3147 - Invalid argument - Ignoring legacy cgroup limits on pure cgroup2 system
lxc 20221129005309.488 ERROR af_unix - ../src/lxc/af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20221129005309.488 ERROR commands - ../src/lxc/commands.c:lxc_cmd_rsp_recv_fds:128 - Failed to receive file descriptors for command "get_state"
# lxd version
5.8
# lxc version
Client version: 5.8
Server version: 5.8
There are no log entries in /var/log/messages
regarding this at all.
I doubt that I hit any sysctl limits; the only thing I did was launch 20 Ubuntu nodes. Though this error message, Failed to receive file descriptors for command "get_state",
could be a hint that I ran out of file descriptors. After 12 running nodes?
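One hedged guess worth checking: besides the per-process file-descriptor limit, each container also consumes inotify instances, and the kernel default of fs.inotify.max_user_instances = 128 is known to be exhausted after roughly a dozen containers; LXD's production-setup documentation recommends raising it. A quick check:

```shell
# Per-process open-file limit in this shell:
ulimit -n
# Per-user inotify instance limit; the kernel default of 128 can be
# exhausted after roughly a dozen containers:
cat /proc/sys/fs/inotify/max_user_instances
# To raise it persistently, as root put lines like these into
# /etc/sysctl.d/99-lxd.conf and run `sysctl --system`
# (values as suggested in LXD's production-setup docs):
echo 'fs.inotify.max_user_instances = 1024'
echo 'fs.inotify.max_user_watches = 1048576'
```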
I thought I'd try to reproduce this on another machine of mine (my local machine).
Now when I want to launch an Ubuntu image, I get:
$ lxc launch images:ubuntu/22.04 u1
Creating u1
Error: Failed instance creation: Unable to fetch https://images.linuxcontainers.org/images/ubuntu/jammy/amd64/default/20221127_07:42/lxd.tar.xz: 404 Not Found
^ What's up with this messed-up URL anyway?
This is a completely different machine: Alpine 3.16, LXD 5.0 (apk add lxd).
It's getting really annoying, and I get more and more disappointed with this piece of software. What's going on here? The only thing I have done with LXD is troubleshoot it (for 10 hours now). Error after error.
I was zero percent productive with LXD today. A bad experience. I'm a big fan of LXD, but this is a no-go (I'm talking to myself).
Follow-up: I was AFK for an hour, came back, thought why not try again, and now suddenly it works:
lxc launch images:ubuntu/22.04 u1
Creating u1
Starting u1
It retrieved and downloaded the rootfs from https://images.linuxcontainers.org,
just like that. I didn't do anything.
OK, here is the result on my local machine:
$ for i in `seq -w 1 20`; do sudo lxc launch images:ubuntu/22.04 gala$i; done
Creating gala01
Starting gala01
Creating gala02
Starting gala02
Creating gala03
Starting gala03
Creating gala04
Starting gala04
Creating gala05
Starting gala05
Creating gala06
Starting gala06
Creating gala07
Starting gala07
Creating gala08
Starting gala08
Creating gala09
Starting gala09
Creating gala10
Starting gala10
Creating gala11
Starting gala11
Creating gala12
Starting gala12
Creating gala13
Starting gala13
Creating gala14
Starting gala14
Creating gala15
Starting gala15
Creating gala16
Starting gala16
Creating gala17
Starting gala17
Creating gala18
Starting gala18
Creating gala19
Starting gala19
Creating gala20
Starting gala20
+--------+---------+----------------------+------+-----------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+--------+---------+----------------------+------+-----------+-----------+
| gala01 | RUNNING | 10.177.95.199 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala02 | RUNNING | 10.177.95.119 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala03 | RUNNING | 10.177.95.197 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala04 | RUNNING | 10.177.95.140 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala05 | RUNNING | 10.177.95.135 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala06 | RUNNING | 10.177.95.181 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala07 | RUNNING | 10.177.95.122 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala08 | RUNNING | 10.177.95.159 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala09 | RUNNING | 10.177.95.168 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala10 | RUNNING | 10.177.95.188 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala11 | RUNNING | 10.177.95.193 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala12 | RUNNING | 10.177.95.149 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala13 | RUNNING | 10.177.95.182 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala14 | STOPPED | | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala15 | STOPPED | | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala16 | STOPPED | | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala17 | STOPPED | | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala18 | STOPPED | | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala19 | STOPPED | | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala20 | STOPPED | | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
Same effect: it stopped working after 13 containers.
$ lxc version
Client version: 4.0.9
Server version: 4.0.9
It doesn't matter which version it is.
Please, somebody, try to reproduce this: loop over 20 machines, launch Ubuntu, and see what happens.
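A reproduction sketch matching the loop used earlier in this thread (it assumes an initialized LXD and the images: remote; the container names are from this thread). It is guarded so it does nothing on hosts without the lxc client:

```shell
# Launch 20 Ubuntu containers, then list any that ended up STOPPED.
if command -v lxc >/dev/null 2>&1; then
    for i in $(seq -w 1 20); do
        lxc launch images:ubuntu/22.04 "gala$i"
    done
    sleep 5   # give the last containers a moment to come up
    # name + state columns in CSV form for easy grepping:
    lxc ls -c ns -f csv | grep ',STOPPED' || echo "all containers RUNNING"
else
    echo "lxc not installed; skipping reproduction"
fi
```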
We do indeed do this sort of thing, for hundreds of containers, concurrently, every day, as part of our automated performance tests. So this will most likely be something environmental in your setup, which is why it is more appropriate to post over at https://discuss.linuxcontainers.org/ to get support.
I'll try and reproduce.
Our build infrastructure is currently moving which may explain the 404 error on the image.
As you're not using the snap package but instead a third-party package, please can you provide the commands/steps you used to set up LXD on Alpine, along with the image you are using and the config for the instances (lxc config show <instance> --expanded).
$ lxc start --debug gala13
DBUG[11-29|16:50:49] Connecting to a local LXD over a Unix socket
DBUG[11-29|16:50:49] Sending request to LXD method=GET url=http://unix.socket/1.0 etag=
DBUG[11-29|16:50:49] Got response struct from LXD
DBUG[11-29|16:50:49]
{
"config": {
"images.auto_update_interval": "0"
},
"api_extensions": [
"storage_zfs_remove_snapshots",
"container_host_shutdown_timeout",
"container_stop_priority",
"container_syscall_filtering",
"auth_pki",
"container_last_used_at",
"etag",
"patch",
"usb_devices",
"https_allowed_credentials",
"image_compression_algorithm",
"directory_manipulation",
"container_cpu_time",
"storage_zfs_use_refquota",
"storage_lvm_mount_options",
"network",
"profile_usedby",
"container_push",
"container_exec_recording",
"certificate_update",
"container_exec_signal_handling",
"gpu_devices",
"container_image_properties",
"migration_progress",
"id_map",
"network_firewall_filtering",
"network_routes",
"storage",
"file_delete",
"file_append",
"network_dhcp_expiry",
"storage_lvm_vg_rename",
"storage_lvm_thinpool_rename",
"network_vlan",
"image_create_aliases",
"container_stateless_copy",
"container_only_migration",
"storage_zfs_clone_copy",
"unix_device_rename",
"storage_lvm_use_thinpool",
"storage_rsync_bwlimit",
"network_vxlan_interface",
"storage_btrfs_mount_options",
"entity_description",
"image_force_refresh",
"storage_lvm_lv_resizing",
"id_map_base",
"file_symlinks",
"container_push_target",
"network_vlan_physical",
"storage_images_delete",
"container_edit_metadata",
"container_snapshot_stateful_migration",
"storage_driver_ceph",
"storage_ceph_user_name",
"resource_limits",
"storage_volatile_initial_source",
"storage_ceph_force_osd_reuse",
"storage_block_filesystem_btrfs",
"resources",
"kernel_limits",
"storage_api_volume_rename",
"macaroon_authentication",
"network_sriov",
"console",
"restrict_devlxd",
"migration_pre_copy",
"infiniband",
"maas_network",
"devlxd_events",
"proxy",
"network_dhcp_gateway",
"file_get_symlink",
"network_leases",
"unix_device_hotplug",
"storage_api_local_volume_handling",
"operation_description",
"clustering",
"event_lifecycle",
"storage_api_remote_volume_handling",
"nvidia_runtime",
"container_mount_propagation",
"container_backup",
"devlxd_images",
"container_local_cross_pool_handling",
"proxy_unix",
"proxy_udp",
"clustering_join",
"proxy_tcp_udp_multi_port_handling",
"network_state",
"proxy_unix_dac_properties",
"container_protection_delete",
"unix_priv_drop",
"pprof_http",
"proxy_haproxy_protocol",
"network_hwaddr",
"proxy_nat",
"network_nat_order",
"container_full",
"candid_authentication",
"backup_compression",
"candid_config",
"nvidia_runtime_config",
"storage_api_volume_snapshots",
"storage_unmapped",
"projects",
"candid_config_key",
"network_vxlan_ttl",
"container_incremental_copy",
"usb_optional_vendorid",
"snapshot_scheduling",
"snapshot_schedule_aliases",
"container_copy_project",
"clustering_server_address",
"clustering_image_replication",
"container_protection_shift",
"snapshot_expiry",
"container_backup_override_pool",
"snapshot_expiry_creation",
"network_leases_location",
"resources_cpu_socket",
"resources_gpu",
"resources_numa",
"kernel_features",
"id_map_current",
"event_location",
"storage_api_remote_volume_snapshots",
"network_nat_address",
"container_nic_routes",
"rbac",
"cluster_internal_copy",
"seccomp_notify",
"lxc_features",
"container_nic_ipvlan",
"network_vlan_sriov",
"storage_cephfs",
"container_nic_ipfilter",
"resources_v2",
"container_exec_user_group_cwd",
"container_syscall_intercept",
"container_disk_shift",
"storage_shifted",
"resources_infiniband",
"daemon_storage",
"instances",
"image_types",
"resources_disk_sata",
"clustering_roles",
"images_expiry",
"resources_network_firmware",
"backup_compression_algorithm",
"ceph_data_pool_name",
"container_syscall_intercept_mount",
"compression_squashfs",
"container_raw_mount",
"container_nic_routed",
"container_syscall_intercept_mount_fuse",
"container_disk_ceph",
"virtual-machines",
"image_profiles",
"clustering_architecture",
"resources_disk_id",
"storage_lvm_stripes",
"vm_boot_priority",
"unix_hotplug_devices",
"api_filtering",
"instance_nic_network",
"clustering_sizing",
"firewall_driver",
"projects_limits",
"container_syscall_intercept_hugetlbfs",
"limits_hugepages",
"container_nic_routed_gateway",
"projects_restrictions",
"custom_volume_snapshot_expiry",
"volume_snapshot_scheduling",
"trust_ca_certificates",
"snapshot_disk_usage",
"clustering_edit_roles",
"container_nic_routed_host_address",
"container_nic_ipvlan_gateway",
"resources_usb_pci",
"resources_cpu_threads_numa",
"resources_cpu_core_die",
"api_os",
"resources_system",
"usedby_consistency",
"resources_gpu_mdev",
"console_vga_type",
"projects_limits_disk",
"storage_rsync_compression",
"gpu_mdev",
"resources_pci_iommu",
"resources_network_usb",
"resources_disk_address",
"network_state_vlan",
"gpu_sriov",
"migration_stateful",
"disk_state_quota",
"storage_ceph_features",
"gpu_mig",
"clustering_join_token",
"clustering_description",
"server_trusted_proxy",
"clustering_update_cert",
"storage_api_project",
"server_instance_driver_operational",
"server_supported_storage_drivers",
"event_lifecycle_requestor_address",
"resources_gpu_usb",
"network_counters_errors_dropped",
"image_source_project",
"database_leader",
"instance_all_projects",
"ceph_rbd_du",
"qemu_metrics",
"gpu_mig_uuid",
"event_project",
"instance_allow_inconsistent_copy",
"image_restrictions"
],
"api_status": "stable",
"api_version": "1.0",
"auth": "trusted",
"public": false,
"auth_methods": [
"tls"
],
"environment": {
"addresses": [],
"architectures": [
"x86_64",
"i686"
],
"certificate": "-----BEGIN CERTIFICATE-----\nbla\n-----END CERTIFICATE-----\n",
"certificate_fingerprint": "aa60457f61be62bea58b40bff7e075a7afb6c049b77a343bfacfe414acb3fb7a",
"driver": "lxc | qemu",
"driver_version": "4.0.12 | 7.0.0",
"firewall": "nftables",
"kernel": "Linux",
"kernel_architecture": "x86_64",
"kernel_features": {
"netnsid_getifaddrs": "true",
"seccomp_listener": "true",
"seccomp_listener_continue": "true",
"shiftfs": "false",
"uevent_injection": "true",
"unpriv_fscaps": "true"
},
"kernel_version": "5.15.78-0-lts",
"lxc_features": {
"cgroup2": "true",
"core_scheduling": "true",
"devpts_fd": "true",
"idmapped_mounts_v2": "true",
"mount_injection_file": "true",
"network_gateway_device_route": "true",
"network_ipvlan": "true",
"network_l2proxy": "true",
"network_phys_macvlan_mtu": "true",
"network_veth_router": "true",
"pidfd": "true",
"seccomp_allow_deny_syntax": "true",
"seccomp_notify": "true",
"seccomp_proxy_send_notify_fd": "true"
},
"os_name": "Alpine Linux",
"os_version": "3.16.3",
"project": "default",
"server": "lxd",
"server_clustered": false,
"server_name": "home",
"server_pid": 7111,
"server_version": "4.0.9",
"storage": "dir",
"storage_version": "1",
"storage_supported_drivers": [
{
"Name": "dir",
"Version": "1",
"Remote": false
}
]
}
}
DBUG[11-29|16:50:49] Sending request to LXD method=GET url=http://unix.socket/1.0/instances/gala13 etag=
DBUG[11-29|16:50:49] Got response struct from LXD
DBUG[11-29|16:50:49]
{
"architecture": "x86_64",
"config": {
"image.architecture": "amd64",
"image.description": "Ubuntu jammy amd64 (20221127_07:42)",
"image.os": "Ubuntu",
"image.release": "jammy",
"image.serial": "20221127_07:42",
"image.type": "squashfs",
"image.variant": "default",
"volatile.base_image": "309874cb3bac23616ebca180db7b6d1f151175869e716d079cb28e1a103a143c",
"volatile.eth0.hwaddr": "00:16:3e:6e:32:f9",
"volatile.idmap.base": "0",
"volatile.idmap.current": "[{\"Isuid\":true,\"Isgid\":false,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000},{\"Isuid\":false,\"Isgid\":true,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000}]",
"volatile.idmap.next": "[{\"Isuid\":true,\"Isgid\":false,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000},{\"Isuid\":false,\"Isgid\":true,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000}]",
"volatile.last_state.idmap": "[]",
"volatile.last_state.power": "STOPPED",
"volatile.uuid": "908a2585-00e5-44ed-83dd-08c6aca08a9d"
},
"devices": {},
"ephemeral": false,
"profiles": [
"default"
],
"stateful": false,
"description": "",
"created_at": "2022-11-29T03:33:29.353883422Z",
"expanded_config": {
"image.architecture": "amd64",
"image.description": "Ubuntu jammy amd64 (20221127_07:42)",
"image.os": "Ubuntu",
"image.release": "jammy",
"image.serial": "20221127_07:42",
"image.type": "squashfs",
"image.variant": "default",
"volatile.base_image": "309874cb3bac23616ebca180db7b6d1f151175869e716d079cb28e1a103a143c",
"volatile.eth0.hwaddr": "00:16:3e:6e:32:f9",
"volatile.idmap.base": "0",
"volatile.idmap.current": "[{\"Isuid\":true,\"Isgid\":false,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000},{\"Isuid\":false,\"Isgid\":true,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000}]",
"volatile.idmap.next": "[{\"Isuid\":true,\"Isgid\":false,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000},{\"Isuid\":false,\"Isgid\":true,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000}]",
"volatile.last_state.idmap": "[]",
"volatile.last_state.power": "STOPPED",
"volatile.uuid": "908a2585-00e5-44ed-83dd-08c6aca08a9d"
},
"expanded_devices": {
"eth0": {
"name": "eth0",
"network": "lxdbr0",
"type": "nic"
},
"root": {
"path": "/",
"pool": "default",
"type": "disk"
}
},
"name": "gala13",
"status": "Stopped",
"status_code": 102,
"last_used_at": "2022-11-29T13:47:22.125783839Z",
"location": "none",
"type": "container",
"project": "default"
}
DBUG[11-29|16:50:49] Connected to the websocket: ws://unix.socket/1.0/events
DBUG[11-29|16:50:49] Sending request to LXD method=PUT url=http://unix.socket/1.0/instances/gala13/state etag=
DBUG[11-29|16:50:49]
{
"action": "start",
"timeout": 0,
"force": false,
"stateful": false
}
DBUG[11-29|16:50:49] Got operation from LXD
DBUG[11-29|16:50:49]
{
"id": "514d49c6-6d56-437d-a7d2-cecfa2c22b47",
"class": "task",
"description": "Starting instance",
"created_at": "2022-11-29T16:50:49.386043636+03:00",
"updated_at": "2022-11-29T16:50:49.386043636+03:00",
"status": "Running",
"status_code": 103,
"resources": {
"instances": [
"/1.0/instances/gala13"
]
},
"metadata": null,
"may_cancel": false,
"err": "",
"location": "none"
}
DBUG[11-29|16:50:49] Sending request to LXD method=GET url=http://unix.socket/1.0/operations/514d49c6-6d56-437d-a7d2-cecfa2c22b47 etag=
DBUG[11-29|16:50:49] Got response struct from LXD
DBUG[11-29|16:50:49]
{
"id": "514d49c6-6d56-437d-a7d2-cecfa2c22b47",
"class": "task",
"description": "Starting instance",
"created_at": "2022-11-29T16:50:49.386043636+03:00",
"updated_at": "2022-11-29T16:50:49.386043636+03:00",
"status": "Running",
"status_code": 103,
"resources": {
"instances": [
"/1.0/instances/gala13"
]
},
"metadata": null,
"may_cancel": false,
"err": "",
"location": "none"
}
$ lxc ls gala13 --debug
...
{
"architecture": "x86_64",
"config": {
"image.architecture": "amd64",
"image.description": "Ubuntu jammy amd64 (20221127_07:42)",
"image.os": "Ubuntu",
"image.release": "jammy",
"image.serial": "20221127_07:42",
"image.type": "squashfs",
"image.variant": "default",
"volatile.base_image": "309874cb3bac23616ebca180db7b6d1f151175869e716d079cb28e1a103a143c",
"volatile.eth0.hwaddr": "00:16:3e:6e:32:f9",
"volatile.idmap.base": "0",
"volatile.idmap.current": "[{\"Isuid\":true,\"Isgid\":false,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000},{\"Isuid\":false,\"Isgid\":true,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000}]",
"volatile.idmap.next": "[{\"Isuid\":true,\"Isgid\":false,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000},{\"Isuid\":false,\"Isgid\":true,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000}]",
"volatile.last_state.idmap": "[]",
"volatile.last_state.power": "STOPPED",
"volatile.uuid": "908a2585-00e5-44ed-83dd-08c6aca08a9d"
},
"devices": {},
"ephemeral": false,
"profiles": [
"default"
],
"stateful": false,
"description": "",
"created_at": "2022-11-29T03:33:29.353883422Z",
"expanded_config": {
"image.architecture": "amd64",
"image.description": "Ubuntu jammy amd64 (20221127_07:42)",
"image.os": "Ubuntu",
"image.release": "jammy",
"image.serial": "20221127_07:42",
"image.type": "squashfs",
"image.variant": "default",
"volatile.base_image": "309874cb3bac23616ebca180db7b6d1f151175869e716d079cb28e1a103a143c",
"volatile.eth0.hwaddr": "00:16:3e:6e:32:f9",
"volatile.idmap.base": "0",
"volatile.idmap.current": "[{\"Isuid\":true,\"Isgid\":false,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000},{\"Isuid\":false,\"Isgid\":true,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000}]",
"volatile.idmap.next": "[{\"Isuid\":true,\"Isgid\":false,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000},{\"Isuid\":false,\"Isgid\":true,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000}]",
"volatile.last_state.idmap": "[]",
"volatile.last_state.power": "STOPPED",
"volatile.uuid": "908a2585-00e5-44ed-83dd-08c6aca08a9d"
},
"expanded_devices": {
"eth0": {
"name": "eth0",
"network": "lxdbr0",
"type": "nic"
},
"root": {
"path": "/",
"pool": "default",
"type": "disk"
}
},
"name": "gala13",
"status": "Stopped",
"status_code": 102,
"last_used_at": "2022-11-29T13:50:49.47375582Z",
"location": "none",
"type": "container",
"project": "default"
},
...
Again, notice how lxc start
gets the response RUNNING
back from the API, but a following lxc ls $name
shows state STOPPED
in the response.
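Worth noting: the "status": "Running" in the debug output above is the status of the asynchronous "Starting instance" operation (status_code 103), not of the container itself. A sketch that starts an instance and then reads its actual state, guarded so it no-ops without the lxc client (instance name taken from this thread):

```shell
if command -v lxc >/dev/null 2>&1; then
    lxc start gala13
    # Query the instance state directly rather than trusting the
    # operation's initial status:
    state=$(lxc ls gala13 -c s -f csv)
    echo "gala13 state after start: $state"
else
    echo "lxc not installed; skipping"
fi
```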
Please can you advise on the steps you used to install and configure LXD on Alpine, so we can attempt to reproduce?
@tomponline we can do an SSH session together in tmux or screen (whatever you prefer) on the live system so you can have a look around for yourself.
Please can you advise on the steps you used to install and configure LXD on Alpine, so we can attempt to reproduce?
Nothing fancy: I installed Alpine 3.17 and invoked the command apk add lxd,
which gets you version 5.0.1.
I also tried 5.8 by building from source, which I did according to the official "Installation" docs.
Regardless, I see the same issue with both versions. That is on the server; plus, on my local machine, with a completely different version, I see the same issue as well.
Hence my offer to do a live interactive SSH session together on the system that has this issue. We can treat it as our lab environment; I'm going to provision the whole server again from scratch anyway, once we've finished.
I'll try and reproduce later when I am able to work on this.
We just added the Alpine 3.17 image to our builders so hopefully that will be built soon and I can just use that to spin up a test VM.
We just added the Alpine 3.17 image to our builders so hopefully that will be built soon and I can just use that to spin up a test VM.
Can you give us details on the build? For example, when I build LXD on Alpine I have to fix some things by hand, e.g.:
CC src/bind.lo
cc1: error: /usr/local/include: No such file or directory [-Werror=missing-include-dirs]
cc1: all warnings being treated as errors
make[1]: *** [Makefile:1311: src/bind.lo] Error 1
So I have to create a sudo ln -s /usr/lib/include /usr/local
symlink just so that make deps
passes the dqlite part.
Furthermore, there is an issue with the exports after make deps, etc., without going into details. I guess you get the same errors when building LXD from source on Alpine, because the Makefile in the LXD repository is not suited for Alpine. E.g., you have to add -lintl
and -luv
to the CGO_LDFLAGS.
It would be nice if you could provide /etc/apk/world
from your minimal build system for Alpine Linux 3.17, and your altered Makefile if possible.
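The workarounds described above, collected in one place. These are environment-specific commands from this thread, not official build instructions; the symlink step requires root, so it is shown only as a comment here:

```shell
# As root, create the symlink so -Werror=missing-include-dirs passes
# during `make deps` (command as used in this thread):
#   ln -s /usr/lib/include /usr/local
#
# Extra linker flags needed on musl/Alpine before building:
export CGO_LDFLAGS="${CGO_LDFLAGS:-} -lintl -luv"
echo "CGO_LDFLAGS=$CGO_LDFLAGS"
```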
oh, would you look at that:
+--------+---------+----------------------+------+-----------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+--------+---------+----------------------+------+-----------+-----------+
| gala01 | RUNNING | 10.177.95.120 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala02 | RUNNING | 10.177.95.176 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala03 | RUNNING | 10.177.95.129 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala04 | RUNNING | 10.177.95.183 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala05 | RUNNING | 10.177.95.180 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala06 | RUNNING | 10.177.95.152 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala07 | RUNNING | 10.177.95.162 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala08 | RUNNING | 10.177.95.170 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala09 | RUNNING | 10.177.95.163 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala10 | RUNNING | 10.177.95.128 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala11 | RUNNING | 10.177.95.171 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala12 | RUNNING | 10.177.95.127 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala13 | RUNNING | 10.177.95.113 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala14 | RUNNING | 10.177.95.187 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala15 | RUNNING | 10.177.95.143 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala16 | RUNNING | 10.177.95.160 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala17 | RUNNING | 10.177.95.111 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala18 | RUNNING | 10.177.95.110 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala19 | RUNNING | 10.177.95.141 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
| gala20 | RUNNING | 10.177.95.144 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
this is done with:
for i in `seq -w 1 20`; do lxc launch images:alpine/3.16 gala$i; done
That means Alpine images work, but Ubuntu images don't.
now look at this:
$ for i in `seq -w 1 20`; do lxc launch images:ubuntu/22.04 gala$i; done
+--------+---------+------+------+-----------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+--------+---------+------+------+-----------+-----------+
| gala01 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------+------+-----------+-----------+
| gala02 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------+------+-----------+-----------+
| gala03 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------+------+-----------+-----------+
| gala04 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------+------+-----------+-----------+
| gala05 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------+------+-----------+-----------+
| gala06 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------+------+-----------+-----------+
| gala07 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------+------+-----------+-----------+
| gala08 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------+------+-----------+-----------+
| gala09 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------+------+-----------+-----------+
| gala10 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------+------+-----------+-----------+
| gala11 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------+------+-----------+-----------+
| gala12 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------+------+-----------+-----------+
| gala13 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------+------+-----------+-----------+
| gala14 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------+------+-----------+-----------+
| gala15 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------+------+-----------+-----------+
| gala16 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------+------+-----------+-----------+
| gala17 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------+------+-----------+-----------+
| gala18 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------+------+-----------+-----------+
| gala19 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------+------+-----------+-----------+
| gala20 | STOPPED | | | CONTAINER | 0 |
+--------+---------+------+------+-----------+-----------+
All in the STOPPED state!
Same for ubuntu/20.04.
So I could boil this down to the images. The question is: what is my LXD host missing that makes it unable to start an Ubuntu image? What does the Ubuntu image need from the host? Maybe some packages are missing on the LXD host?
This issue is reproducible for me.
/var/log/lxd/gala01/console.log
seems to have some valuable information:
Failed to look up module alias 'autofs4': Function not implemented
Failed to mount cgroup at /sys/fs/cgroup/systemd: Operation not permitted
[!!!!!!] Failed to mount API filesystems.
Exiting PID 1...
Something told me it has to do with systemd (who doesn't love this awesome invention named systemd?). So, what can I do to fix this?
It needs the cgroups for systemd. There's a setting in the Alpine lxc init script for that.
Let's see how I can fix this. I see an issue regarding this: https://github.com/lxc/lxc/issues/4072. Since I don't have GRUB on Alpine 3.17, I will go with the solution mentioned in the last comment of that issue:
mkdir -p /sys/fs/cgroup/systemd && mount -t cgroup cgroup -o none,name=systemd /sys/fs/cgroup/systemd
$ lxc start gala01
$ lxc ls gala01
+--------+---------+----------------------+------+-----------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+--------+---------+----------------------+------+-----------+-----------+
| gala01 | RUNNING | 10.177.95.200 (eth0) | | CONTAINER | 0 |
+--------+---------+----------------------+------+-----------+-----------+
there you go!
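For what it's worth, the manual mount above does not survive a reboot. One way to persist it without Alpine's packaged init scripts (my own sketch, not something suggested in this thread; the filename is hypothetical) is an OpenRC local.d boot script:

```shell
#!/bin/sh
# /etc/local.d/cgroup-systemd.start -- hypothetical filename. Runs at boot,
# assuming the OpenRC "local" service is enabled (rc-update add local default).
# Re-applies the named systemd cgroup mount that Ubuntu containers need.
mkdir -p /sys/fs/cgroup/systemd
mount -t cgroup cgroup -o none,name=systemd /sys/fs/cgroup/systemd
```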
Although it doesn't explain why you were able to start 7 of them before; otherwise I would have suggested this sooner, as it's a common issue for those running Ubuntu on Alpine.
unfortunate.
But since I've built from source, I don't have /etc/init.d/lx*.
Is there another option for me, other than the manual step above of creating the mount myself?
Yep, see https://github.com/lxc/lxd/issues/11167#issuecomment-1331198814: if you start the lxc service (without any lxc containers) with that setting, it will mount the cgroups for you.
https://git.alpinelinux.org/aports/tree/main/lxc/lxc.initd?h=3.17-stable#n13
Yes, because it's the lxc service from Alpine, not the LXD service you built manually (it's confusing, but not something we have control over).
Yes, but it says: # Configuration for /etc/init.d/lxc[.*]
I don't have any init.d scripts. I don't have lxc/lxd services. I've built from source.
but I found these files:
/etc/lxc# ll
total 8.0K
-rw-r--r-- 1 root root 23 Mar 28 2022 default.conf.apk-new
-rw-r--r-- 1 root root 91 Aug 5 15:56 default.conf
/etc/lxc# cat default.conf
lxc.net.0.type = empty
lxc.idmap = u 0 100000 1000000000
lxc.idmap = g 0 100000 1000000000
Is it possible to set this in this file?
I suggest you stick with the Alpine packages now that you know what the issue is.
It's covered in the Alpine wiki: https://wiki.alpinelinux.org/wiki/LXD
Yep, see #11167 (comment): if you start the lxc service (without any lxc containers) with that setting, it will mount the cgroups for you.
https://git.alpinelinux.org/aports/tree/main/lxc/lxc.initd?h=3.17-stable#n13
I missed this comment. Thanks for the link. I can create my own OpenRC script.
Thanks for your help.
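For anyone in the same from-source situation, a minimal OpenRC service doing the same cgroup mount could look roughly like this (a sketch modelled on the mount step in Alpine's lxc.initd; the service name, file path, and the "lxd" dependency are my own assumptions):

```shell
#!/sbin/openrc-run
# /etc/init.d/cgroup-systemd -- hypothetical service that mounts the named
# systemd cgroup hierarchy so Ubuntu containers can boot their systemd.

description="Mount the named systemd cgroup for containers"

depend() {
	# Run before any container manager starts (service names are assumptions).
	before lxd lxc
}

start() {
	ebegin "Mounting systemd cgroup hierarchy"
	mkdir -p /sys/fs/cgroup/systemd
	# mountinfo is OpenRC's helper for querying existing mounts; skip the
	# mount if the hierarchy is already in place.
	mountinfo -q /sys/fs/cgroup/systemd || \
		mount -t cgroup cgroup -o none,name=systemd /sys/fs/cgroup/systemd
	eend $?
}
```

Enable it with rc-update add cgroup-systemd default so it runs at boot.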
when I do:
lxc start instance08
I get no output, so I assume it worked. But:
lxc ls instance08
shows state STOPPED. No logs, nothing. Any hints?