canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0

Using ZFS, cannot create new containers on LXD 3.18 #6556

Closed (pstch closed this issue 4 years ago)

pstch commented 4 years ago

Required information

Issue description

When using ZFS with LXD 3.18, I cannot create new containers ("Unable to unpack image, run out of disk space"), even though I have plenty of disk space available.

Steps to reproduce

  1. Install LXD 3.18
  2. Create a ZFS dataset (zfs create rpool/test)
  3. Run lxd init, using the created dataset as the new storage pool
  4. Run lxc launch debian/9

The same steps work with LXD 3.13.

Information to attach

stgraber commented 4 years ago

Can you show:

stgraber commented 4 years ago

Ah and lxc storage show default too

pstch commented 4 years ago

[root@tinix:~]# lxc storage show default
config:
  source: tinix-main/LXD
  volatile.initial_source: tinix-main/LXD
  zfs.pool_name: tinix-main/LXD
description: ""
name: default
driver: zfs
used_by:
- /1.0/profiles/default
status: Created
locations:
- none
[root@tinix:~]# df -h
Filesystem              Size  Used Avail Use% Mounted on
devtmpfs                198M     0  198M   0% /dev
tmpfs                   2.0G     0  2.0G   0% /dev/shm
tmpfs                   988M  4.1M  984M   1% /run
tmpfs                   2.0G  384K  2.0G   1% /run/wrappers
tinix-main/SYS/nixos-1   19G  7.7G   12G  41% /
tmpfs                   2.0G     0  2.0G   0% /sys/fs/cgroup
/dev/sda2               488M   48M  405M  11% /boot
tmpfs                   395M     0  395M   0% /run/user/0
tmpfs                   395M     0  395M   0% /run/user/1000
tmpfs                   100K     0  100K   0% /var/lib/lxd/shmounts
tmpfs                   100K     0  100K   0% /var/lib/lxd/devlxd

[root@tinix:~]# df -i
Filesystem               Inodes  IUsed    IFree IUse% Mounted on
devtmpfs                 503435    402   503033    1% /dev
tmpfs                    505604      1   505603    1% /dev/shm
tmpfs                    505604    969   504635    1% /run
tmpfs                    505604     35   505569    1% /run/wrappers
tinix-main/SYS/nixos-1 23681654 444630 23237024    2% /
tmpfs                    505604     18   505586    1% /sys/fs/cgroup
/dev/sda2                 32768    340    32428    2% /boot
tmpfs                    505604      4   505600    1% /run/user/0
tmpfs                    505604      4   505600    1% /run/user/1000
tmpfs                    505604      1   505603    1% /var/lib/lxd/shmounts
tmpfs                    505604      2   505602    1% /var/lib/lxd/devlxd

[root@tinix:~]# zfs list -t all
NAME                              USED  AVAIL     REFER  MOUNTPOINT
tinix-cold                        876K  61.5G       96K  none
tinix-main                       7.69G  11.1G      192K  none
tinix-main/LXD                   1.31M  11.1G      192K  none
tinix-main/LXD/containers         192K  11.1G      192K  none
tinix-main/LXD/custom             192K  11.1G      192K  none
tinix-main/LXD/custom-snapshots   192K  11.1G      192K  none
tinix-main/LXD/deleted            192K  11.1G      192K  none
tinix-main/LXD/images             192K  11.1G      192K  none
tinix-main/LXD/snapshots          192K  11.1G      192K  none
tinix-main/SYS                   7.66G  11.1G      192K  none
tinix-main/SYS/nixos-1           7.66G  11.1G     7.66G  /

stgraber commented 4 years ago

Not sure how easy it is for you to rebuild LXD with a custom patch, but if it's easy, then the following should help:

diff --git a/shared/archive_linux.go b/shared/archive_linux.go
index 7bd0ff438..cf6d0a0f4 100644
--- a/shared/archive_linux.go
+++ b/shared/archive_linux.go
@@ -87,9 +87,9 @@ func Unpack(file string, path string, blockBackend bool, runningInUserns bool, t
                // Check if we're running out of space
                if int64(fs.Bfree) < int64(2*fs.Bsize) {
                        if blockBackend {
-                               return fmt.Errorf("Unable to unpack image, run out of disk space (consider increasing your pool's volume.size)")
+                               logger.Errorf("Unable to unpack image, run out of disk space (consider increasing your pool's volume.size)")
                        } else {
-                               return fmt.Errorf("Unable to unpack image, run out of disk space")
+                               logger.Errorf("Unable to unpack image, run out of disk space")
                        }
                }

stgraber commented 4 years ago

That will expose the error as it's coming out of the unpacker.
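
For context, the check in question boils down to something like the following standalone sketch (illustrative only, not the full Unpack() function; the path in main() is just an example):

package main

import (
        "fmt"
        "syscall"
)

// enoughSpace mirrors the heuristic from shared/archive_linux.go shown in
// the diff above: unpacking is refused once fewer than 2*Bsize free blocks
// remain. Note the units: Bfree counts blocks while Bsize is bytes per
// block, so the threshold is "2*Bsize blocks", not "2*Bsize bytes".
func enoughSpace(path string) (bool, error) {
        var fs syscall.Statfs_t
        if err := syscall.Statfs(path, &fs); err != nil {
                return false, fmt.Errorf("statfs %s: %w", path, err)
        }
        return int64(fs.Bfree) >= int64(2*fs.Bsize), nil
}

func main() {
        ok, err := enoughSpace("/var/lib/lxd")
        if err != nil {
                panic(err)
        }
        fmt.Println("enough space:", ok)
}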

pstch commented 4 years ago

With this patch, the error message I get is:

Error: Failed instance creation: Create container from image: Unpack failed, Failed to run: unsquashfs -f -d /var/lib/lxd/storage-pools/default/images/872160240/rootfs -n /var/lib/lxd/images/4998508e7172a87489f2eb1c1c871cab280b7b96db850f0c486da599631491de.rootfs: FATAL ERROR:write_file: failed to create file /var/lib/lxd/storage-pools/default/images/872160240/rootfs/usr/lib/x86_64-linux-gnu/gconv/EBCDIC-AT-DE.so, because Too many open files.

Here are the current limits used:

[root@tinix:~]# cat /proc/$(pidof lxd)/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             15732                15732                processes
Max open files            1024                 524288               files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       15732                15732                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us

It does seem that 3.18 opens more files (~969) than 3.13 did (~650), which becomes a problem when the steps from the Production setup documentation have not yet been applied; a quick demo of the failure mode is below.
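
To illustrate (a standalone demo, not LXD or squashfs-tools code): any process that opens files without closing them hits the same "too many open files" (EMFILE) error as soon as it crosses the soft nofile limit:

package main

import (
        "fmt"
        "os"
)

func main() {
        // Keep opening /dev/null without closing until the soft nofile
        // limit (1024 in the dump above) is exhausted and os.Open fails
        // with EMFILE.
        var files []*os.File
        for {
                f, err := os.Open("/dev/null")
                if err != nil {
                        fmt.Printf("failed after %d open files: %v\n", len(files), err)
                        break
                }
                files = append(files, f)
        }
        for _, f := range files {
                f.Close()
        }
}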

I suppose the only thing that really needs fixing is the confusing error message. Users should perhaps also be warned that the nofile limit must now be raised even to create a single container (at least in some configurations). Otherwise, this issue can be closed.
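
For anyone hitting this before applying the Production setup steps: a process can also raise its own soft nofile limit up to the hard limit, which is what the recommended limits.conf / systemd LimitNOFILE changes achieve for the daemon. A minimal sketch (illustrative, not LXD code):

package main

import (
        "fmt"
        "syscall"
)

func main() {
        var rl syscall.Rlimit
        if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
                panic(err)
        }
        // With the limits shown above, this lifts the soft limit from 1024
        // to the 524288 hard limit; going beyond the hard limit would
        // require privileges.
        rl.Cur = rl.Max
        if err := syscall.Setrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
                panic(err)
        }
        fmt.Printf("nofile soft limit raised to %d\n", rl.Cur)
}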

stgraber commented 4 years ago

This is most likely a squashfs bug, though, rather than an LXD one. It sounds like unsquashfs keeps a lot of files open while it's uncompressing, which seems wrong.

I'll still take a look at the out-of-space logic, as it sure shouldn't have triggered when there's plenty of space left...

Cynerd commented 4 years ago

I can most probably confirm that this is a problem with unsquashfs. With squashfs-tools 4.4, I can't create a Debian container (I have not tested any others). After downgrading squashfs-tools to 4.3, it works without a hitch.