canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0

Error while reading data: read unix @->@/tmp/.X11-unix/X0 #6584

Closed: rajil closed this issue 4 years ago

rajil commented 4 years ago

Required information


# lxc info
config:
  core.https_address: '[::]:8443'
  images.auto_update_interval: "0"
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- macaroon_authentication
- network_sriov
- console
- restrict_devlxd
- migration_pre_copy
- infiniband
- maas_network
- devlxd_events
- proxy
- network_dhcp_gateway
- file_get_symlink
- network_leases
- unix_device_hotplug
- storage_api_local_volume_handling
- operation_description
- clustering
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- container_mount_propagation
- container_backup
- devlxd_images
- container_local_cross_pool_handling
- proxy_unix
- proxy_udp
- clustering_join
- proxy_tcp_udp_multi_port_handling
- network_state
- proxy_unix_dac_properties
- container_protection_delete
- unix_priv_drop
- pprof_http
- proxy_haproxy_protocol
- network_hwaddr
- proxy_nat
- network_nat_order
- container_full
- candid_authentication
- backup_compression
- candid_config
- nvidia_runtime_config
- storage_api_volume_snapshots
- storage_unmapped
- projects
- candid_config_key
- network_vxlan_ttl
- container_incremental_copy
- usb_optional_vendorid
- snapshot_scheduling
- container_copy_project
- clustering_server_address
- clustering_image_replication
- container_protection_shift
- snapshot_expiry
- container_backup_override_pool
- snapshot_expiry_creation
- network_leases_location
- resources_cpu_socket
- resources_gpu
- resources_numa
- kernel_features
- id_map_current
- event_location
- storage_api_remote_volume_snapshots
- network_nat_address
- container_nic_routes
- rbac
- cluster_internal_copy
- seccomp_notify
- lxc_features
- container_nic_ipvlan
- network_vlan_sriov
- storage_cephfs
- container_nic_ipfilter
- resources_v2
- container_exec_user_group_cwd
- container_syscall_intercept
- container_disk_shift
- storage_shifted
- resources_infiniband
- daemon_storage
- instances
- image_types
- resources_disk_sata
- clustering_roles
- images_expiry
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
environment:
  addresses:
  - 172.16.1.5:8443
  - '[fd42:1ad1:16d:6df5::1]:8443'
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
    -----END CERTIFICATE-----
  driver: lxc
  driver_version: 3.2.1
  kernel: Linux
  kernel_architecture: x86_64
  kernel_features:
    netnsid_getifaddrs: "true"
    seccomp_listener: "true"
    shiftfs: "false"
    uevent_injection: "true"
    unpriv_fscaps: "true"
  kernel_version: 5.3.11-arch1-1
  lxc_features:
    mount_injection_file: "true"
    network_gateway_device_route: "true"
    network_ipvlan: "true"
    network_l2proxy: "true"
    network_phys_macvlan_mtu: "true"
    seccomp_notify: "true"
  project: default
  server: lxd
  server_clustered: false
  server_name: server
  server_pid: 11883
  server_version: "3.18"
  storage: zfs
  storage_version: 0.8.2-1

Issue description

I am trying to run a GUI app in the container. Unfortunately, the abstract unix socket proxy is giving an error. The host is running Arch Linux and the container is Ubuntu 18.04.

My container definition is as follows:

$ lxc config show mycontainer --expanded    
architecture: x86_64
config:
  environment.DISPLAY: :0
  environment.TZ: Asia/Kolkata
  image.architecture: amd64
  image.description: ubuntu 18.04 LTS amd64 (release) (20191205)
  image.label: release
  image.os: ubuntu
  image.release: bionic
  image.serial: "20191205"
  image.type: squashfs
  image.version: "18.04"
  nvidia.driver.capabilities: all
  nvidia.runtime: "true"
  raw.idmap: |
    uid 1001 1000
    gid 100 1000
  security.privileged: "false"
  user.user-data: |
    #cloud-config
    runcmd:
      - 'sed -i "s/; enable-shm = yes/enable-shm = no/g" /etc/pulse/client.conf'
    packages:
      - x11-apps
      - mesa-utils
  volatile.base_image: f75468c572cc50eca7f76391182e6fdaf58431f84c3d35a2c92e83814e701698
  volatile.eth0.host_name: vethcc27718a
  volatile.eth0.hwaddr: 00:16:3e:41:43:b7
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000},{"Isuid":true,"Isgid":false,"Hostid":1001,"Nsid":1000,"Maprange":1},{"Isuid":true,"Isgid":false,"Hostid":1001001,"Nsid":1001,"Maprange":999998999},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000},{"Isuid":false,"Isgid":true,"Hostid":100,"Nsid":1000,"Maprange":1},{"Isuid":false,"Isgid":true,"Hostid":1001001,"Nsid":1001,"Maprange":999998999}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000},{"Isuid":true,"Isgid":false,"Hostid":1001,"Nsid":1000,"Maprange":1},{"Isuid":true,"Isgid":false,"Hostid":1001001,"Nsid":1001,"Maprange":999998999},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000},{"Isuid":false,"Isgid":true,"Hostid":100,"Nsid":1000,"Maprange":1},{"Isuid":false,"Isgid":true,"Hostid":1001001,"Nsid":1001,"Maprange":999998999}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000},{"Isuid":true,"Isgid":false,"Hostid":1001,"Nsid":1000,"Maprange":1},{"Isuid":true,"Isgid":false,"Hostid":1001001,"Nsid":1001,"Maprange":999998999},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000},{"Isuid":false,"Isgid":true,"Hostid":100,"Nsid":1000,"Maprange":1},{"Isuid":false,"Isgid":true,"Hostid":1001001,"Nsid":1001,"Maprange":999998999}]'
  volatile.last_state.power: RUNNING
devices:
  X0:
    bind: container
    connect: unix:@/tmp/.X11-unix/X0
    listen: unix:@/tmp/.X11-unix/X0
    type: proxy
  eth0:
    name: eth0
    nictype: bridged
    parent: lxdbr0
    type: nic
  mygpu:
    productid: 1d01
    type: gpu
    vendorid: 10de
  port80:
    connect: tcp:127.0.0.1:80
    listen: tcp:0.0.0.0:80
    type: proxy
  root:
    path: /
    pool: lxd
    type: disk
ephemeral: false
profiles:
- default
- x11
stateful: false
description: ""

The container gives an error when trying to use X,

$ lxc exec mycontainer -- glxinfo -B
No protocol specified
Error: unable to open display :0

Also, the container proxy log shows an error,

# tail -f  /var/snap/lxd/common/lxd/logs/mycontainer/proxy.X0.log 
Status: Started
Warning: Error while reading data: read unix @->@/tmp/.X11-unix/X0: EOF

The X0 file does exist on the host,

# ls -la /tmp/.X11-unix/X0 
srwxrwxrwx 1 root root 0 Dec  9 20:11 /tmp/.X11-unix/X0
stgraber commented 4 years ago

Could be that your host isn't using the abstract unix socket at all, causing the connection error. Try removing the @ from the connect side of the proxy device.
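
For example, assuming the device keeps the name X0 from the config above, the connect side can be switched to the filesystem socket with (a sketch, using the 3.x key/value CLI syntax):

$ lxc config device set mycontainer X0 connect unix:/tmp/.X11-unix/X0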

rajil commented 4 years ago

I modified the devices to look like so,

devices:
  X0:
    bind: container
    connect: unix:/tmp/.X11-unix/X0
    listen: unix:@/tmp/.X11-unix/X0
    type: proxy

I still get a similar error in /var/snap/lxd/common/lxd/logs/mycontainer/proxy.X0.log when issuing glxinfo -B in the container,

Warning: Error while reading data: read unix @->/tmp/.X11-unix/X0: EOF

stgraber commented 4 years ago

That's odd, the error message should have changed.

stgraber commented 4 years ago

The error could also be coming from the fact that you're having the proxy connect to your X server as root rather than as your user; on most systems, this gets immediately rejected.

To get around that, you need to set security.uid and security.gid on your proxy device to match the uid and gid of the user that's running the graphical session on your system.
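
As a sketch, again assuming the X0 device name and using placeholder values of 1000/1000 (substitute the uid/gid of the user running your graphical session):

$ lxc config device set mycontainer X0 security.uid 1000
$ lxc config device set mycontainer X0 security.gid 1000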

rajil commented 4 years ago

The non-root user on the host with the X session has a UID of 1001 and a GID of 100. Thus I modified the devices as follows,

devices:
  X0:
    bind: container
    connect: unix:/tmp/.X11-unix/X0
    listen: unix:@/tmp/.X11-unix/X0
    security.gid: "100"
    security.uid: "1001"
    type: proxy

I still get the same error when running glxinfo like so,

lxc exec mycontainer -- sudo -u ubuntu glxinfo -B

Also, I tried it with the abstract socket, connect: unix:@/tmp/.X11-unix/X0. It made no difference to the error, except that '@' got added before /tmp: Warning: Error while reading data: read unix @->@/tmp/.X11-unix/X0: EOF

stgraber commented 4 years ago

Can you run strace -fF -p PID where PID is the PID of the forkproxy process? Then try running something that uses it. That may give a clearer error as to what's going on.
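
One way to find that PID, assuming the process is named forkproxy as in the log paths above:

# pgrep -af forkproxy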

rajil commented 4 years ago

Following is the strace output,

# strace -fF -p 4127522
strace: deprecated option -F ignored
strace: Process 4127522 attached with 8 threads
[pid 4128511] futex(0xc00028d9c8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 4127533] futex(0xc0000ee4c8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 4127532] futex(0x2495878, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 4127531] epoll_wait(7,  <unfinished ...>
[pid 4127530] epoll_pwait(5,  <unfinished ...>
[pid 4127529] futex(0x24959e0, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 4127528] restart_syscall(<... resuming interrupted read ...> <unfinished ...>
[pid 4127522] futex(0x1f4da08, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 4127531] <... epoll_wait resumed>[{EPOLLIN, {u32=4, u64=4}}], 10, -1) = 1
[pid 4127530] <... epoll_pwait resumed>[{EPOLLIN, {u32=2565914376, u64=139966960156424}}], 128, -1, NULL, 3) = 1
[pid 4127531] futex(0x1f4cd70, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 4127530] epoll_pwait(5,  <unfinished ...>
[pid 4127531] <... futex resumed>)      = 1
[pid 4127528] <... restart_syscall resumed>) = 0
[pid 4127531] accept4(6,  <unfinished ...>
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4127531] <... accept4 resumed>{sa_family=AF_UNIX}, [112->2], SOCK_CLOEXEC|SOCK_NONBLOCK) = 8
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4127531] epoll_ctl(5, EPOLL_CTL_ADD, 8, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2565914168, u64=139966960156216}} <unfinished ...>
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4127531] <... epoll_ctl resumed>)  = 0
[pid 4127530] <... epoll_pwait resumed>[{EPOLLIN|EPOLLOUT, {u32=2565914168, u64=139966960156216}}], 128, -1, NULL, 3) = 1
[pid 4127531] getsockname(8,  <unfinished ...>
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4127531] <... getsockname resumed>{sa_family=AF_UNIX, sun_path=@"/tmp/.X11-unix/X0"}, [112->20]) = 0
[pid 4127530] epoll_pwait(5,  <unfinished ...>
[pid 4127531] socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0 <unfinished ...>
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4127531] <... socket resumed>)     = 9
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4127531] setsockopt(9, SOL_SOCKET, SO_BROADCAST, [1], 4 <unfinished ...>
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4127531] <... setsockopt resumed>) = 0
[pid 4127531] connect(9, {sa_family=AF_UNIX, sun_path=@"/tmp/.X11-unix/X0"}, 20 <unfinished ...>
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4127531] <... connect resumed>)    = 0
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4127531] epoll_ctl(5, EPOLL_CTL_ADD, 9, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2565913960, u64=139966960156008}} <unfinished ...>
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4127531] <... epoll_ctl resumed>)  = 0
[pid 4127530] <... epoll_pwait resumed>[{EPOLLOUT, {u32=2565913960, u64=139966960156008}}], 128, -1, NULL, 3) = 1
[pid 4127531] getsockname(9,  <unfinished ...>
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4127531] <... getsockname resumed>{sa_family=AF_UNIX}, [112->2]) = 0
[pid 4127530] epoll_pwait(5,  <unfinished ...>
[pid 4127531] getpeername(9,  <unfinished ...>
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4127531] <... getpeername resumed>{sa_family=AF_UNIX, sun_path=@"/tmp/.X11-unix/X0"}, [112->20]) = 0
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4127531] futex(0xc0000ee4c8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4127533] <... futex resumed>)      = 0
[pid 4127531] <... futex resumed>)      = 1
[pid 4127533] nanosleep({tv_sec=0, tv_nsec=3000},  <unfinished ...>
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4127533] <... nanosleep resumed>NULL) = 0
[pid 4127531] epoll_wait(7,  <unfinished ...>
[pid 4127533] futex(0xc00028d9c8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4128511] <... futex resumed>)      = 0
[pid 4127533] <... futex resumed>)      = 1
[pid 4128511] futex(0xc00028d9c8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4128511] <... futex resumed>)      = -1 EAGAIN (Resource temporarily unavailable)
[pid 4127533] futex(0xc00028d9c8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 4128511] nanosleep({tv_sec=0, tv_nsec=3000},  <unfinished ...>
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4128511] <... nanosleep resumed>NULL) = 0
[pid 4127533] <... futex resumed>)      = 0
[pid 4128511] mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0 <unfinished ...>
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4128511] <... mmap resumed>)       = 0x7f4c7c000000
[pid 4127533] recvmsg(9,  <unfinished ...>
[pid 4128511] munmap(0x7f4c80000000, 67108864 <unfinished ...>
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4128511] <... munmap resumed>)     = 0
[pid 4127533] <... recvmsg resumed>{msg_namelen=112}, 0) = -1 EAGAIN (Resource temporarily unavailable)
[pid 4128511] mprotect(0x7f4c7c000000, 135168, PROT_READ|PROT_WRITE <unfinished ...>
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4128511] <... mprotect resumed>)   = 0
[pid 4127533] futex(0xc0000ee4c8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 4128511] rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1],  <unfinished ...>
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4128511] <... rt_sigprocmask resumed>[], 8) = 0
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4128511] mmap(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0 <unfinished ...>
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4128511] <... mmap resumed>)       = 0x7f4c90110000
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4128511] mprotect(0x7f4c90110000, 4096, PROT_NONE <unfinished ...>
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4128511] <... mprotect resumed>)   = 0
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4128511] clone(child_stack=0x7f4c9090fcb0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID <unfinished ...>
[pid 4127528] <... nanosleep resumed>NULL) = 0
strace: Process 474225 attached
[pid 4128511] <... clone resumed>, parent_tid=[474225], tls=0x7f4c90910700, child_tidptr=0x7f4c909109d0) = 474225
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 474225] set_robust_list(0x7f4c909109e0, 24 <unfinished ...>
[pid 4128511] rt_sigprocmask(SIG_SETMASK, [],  <unfinished ...>
[pid 474225] <... set_robust_list resumed>) = 0
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 474225] sigaltstack(NULL,  <unfinished ...>
[pid 4128511] <... rt_sigprocmask resumed>NULL, 8) = 0
[pid 474225] <... sigaltstack resumed>{ss_sp=NULL, ss_flags=SS_DISABLE, ss_size=0}) = 0
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 474225] sigaltstack({ss_sp=0xc000338000, ss_flags=0, ss_size=32768},  <unfinished ...>
[pid 4128511] recvmsg(8,  <unfinished ...>
[pid 474225] <... sigaltstack resumed>NULL) = 0
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 474225] rt_sigprocmask(SIG_SETMASK, [],  <unfinished ...>
[pid 4128511] <... recvmsg resumed>{msg_name=0xc00031eac8, msg_namelen=112->0, msg_iov=[{iov_base="l\0\v\0\0\0\0\0\0\0\0\0", iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 12
[pid 474225] <... rt_sigprocmask resumed>NULL, 8) = 0
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 474225] gettid( <unfinished ...>
[pid 4128511] sendmsg(9, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\0\v\0\0\0\0\0\0\0\0\0", iov_len=12}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0 <unfinished ...>
[pid 474225] <... gettid resumed>)      = 474225
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 474225] futex(0xc000334148, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 4128511] <... sendmsg resumed>)    = 12
[pid 4127530] <... epoll_pwait resumed>[{EPOLLOUT, {u32=2565913960, u64=139966960156008}}], 128, -1, NULL, 3) = 1
[pid 4128511] recvmsg(8,  <unfinished ...>
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4128511] <... recvmsg resumed>{msg_namelen=112}, 0) = -1 EAGAIN (Resource temporarily unavailable)
[pid 4127530] epoll_pwait(5,  <unfinished ...>
[pid 4128511] futex(0xc00028d9c8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4127530] <... epoll_pwait resumed>[{EPOLLIN|EPOLLOUT|EPOLLHUP|EPOLLRDHUP, {u32=2565913960, u64=139966960156008}}], 128, -1, NULL, 3) = 1
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4127530] recvmsg(9, {msg_name={sa_family=AF_UNIX, sun_path=@"/tmp/.X11-unix/X0"}, msg_namelen=112->20, msg_iov=[{iov_base="\0\26\v\0\0\0\6\0No protocol specified\n\0\0", iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 32
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4127530] sendmsg(8, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0\26\v\0\0\0\6\0No protocol specified\n\0\0", iov_len=32}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0 <unfinished ...>
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4127530] <... sendmsg resumed>)    = 32
[pid 4127530] recvmsg(9,  <unfinished ...>
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4127530] <... recvmsg resumed>{msg_name=0xc000322ac8, msg_namelen=112->0, msg_iov=[{iov_base="", iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 0
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4127530] futex(0xc00028d9c8, FUTEX_WAKE_PRIVATE, 1) = 1
[pid 4128511] <... futex resumed>)      = 0
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4127530] write(1, "Warning: Error while reading dat"..., 72 <unfinished ...>
[pid 4128511] epoll_pwait(5,  <unfinished ...>
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4128511] <... epoll_pwait resumed>[{EPOLLIN|EPOLLOUT|EPOLLHUP|EPOLLRDHUP, {u32=2565914168, u64=139966960156216}}], 128, 0, NULL, 0) = 1
[pid 4127530] <... write resumed>)      = 72
[pid 4128511] futex(0xc000334148, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 474225] <... futex resumed>)       = 0
[pid 4128511] <... futex resumed>)      = 1
[pid 4127530] epoll_ctl(5, EPOLL_CTL_DEL, 9, 0xc000316d9c <unfinished ...>
[pid 474225] epoll_pwait(5,  <unfinished ...>
[pid 4128511] recvmsg(8,  <unfinished ...>
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 474225] <... epoll_pwait resumed>[], 128, 0, NULL, 0) = 0
[pid 4128511] <... recvmsg resumed>{msg_name=0xc00031eac8, msg_namelen=112->0, msg_iov=[{iov_base="", iov_len=4096}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 0
[pid 4127530] <... epoll_ctl resumed>)  = 0
[pid 474225] epoll_pwait(5,  <unfinished ...>
[pid 4128511] futex(0xc00028d9c8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4127530] close(9 <unfinished ...>
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4127530] <... close resumed>)      = 0
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4127530] epoll_ctl(5, EPOLL_CTL_DEL, 8, 0xc000316d9c <unfinished ...>
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4127530] <... epoll_ctl resumed>)  = 0
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4127530] close(8 <unfinished ...>
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4127530] <... close resumed>)      = 0
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4127530] futex(0xc00028d9c8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
[pid 4128511] <... futex resumed>)      = 0
[pid 4127530] <... futex resumed>)      = 1
[pid 4128511] nanosleep({tv_sec=0, tv_nsec=3000},  <unfinished ...>
[pid 4127528] <... nanosleep resumed>NULL) = 0
[pid 4128511] <... nanosleep resumed>NULL) = 0
[pid 4127530] futex(0xc0000bcbc8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 4128511] futex(0xc00028d9c8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 4127528] nanosleep({tv_sec=0, tv_nsec=20000}, NULL) = 0
[pid 4127528] futex(0x1f4cd70, FUTEX_WAIT_PRIVATE, 0, {tv_sec=60, tv_nsec=0}
stgraber commented 4 years ago

So the above shows the "No protocol specified" error being transferred over the unix socket, suggesting that it is in fact connected to something.

Do you get the same result running xvinfo?

rajil commented 4 years ago

Yes, I get the same result with xvinfo.

stgraber commented 4 years ago

Can you look at your X server log to see if maybe something is logged there?

stgraber commented 4 years ago

You can also try running "xhost +" on your host; that should eliminate any potential authentication problems.
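
Note that xhost + disables X access control for all clients. If authentication does turn out to be the problem, a narrower variant is to allow only a specific local user (standard xhost syntax; youruser is a placeholder):

$ xhost +si:localuser:youruser    # allow one local user
$ xhost -                         # re-enable access control afterwards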

rajil commented 4 years ago

OK, we are getting somewhere with xhost +. After issuing that, glxinfo works,

$ lxc exec mycontainer -- sudo -u ubuntu glxinfo -B|grep -i render
direct rendering: Yes
OpenGL renderer string: GeForce GT 1030/PCIe/SSE2

However, I still get a warning in /var/snap/lxd/common/lxd/logs/mycontainer/proxy.X0.log when the above command is issued: Warning: Error while sending data: read unix @/tmp/.X11-unix/X0->@: EOF

I did an strace on the forkproxy PID and the results are here.

stgraber commented 4 years ago

Yeah, that's probably fine. It's really just a warning that the connection got closed partway through a read; it's not a problem by itself and I've certainly seen it happen before. X is a bit of an odd protocol in that it's not really just the single socket: a number of other files are also passed through using SCM_RIGHTS packets, especially for GL workloads. So if things work, and it looks like they do, I wouldn't worry about the connection getting disconnected somewhat abruptly by one of the two sides.

Closing, as it sounds like the issue was around X authentication and not something to do with LXD. Most distros allow X connections from the right user/group, but it looks like your system may be entirely dependent on Xauthority instead, which would explain why you had to use the xhost + trick. An alternative would probably be to transfer the Xauthority token/file into the container and set the XAUTHORITY env variable accordingly, though just continuing to use xhost is fine too, so long as your X server isn't exposed to the network (I don't know of any distro which still does that).
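
A rough sketch of that Xauthority alternative, assuming the container's ubuntu user has uid/gid 1000 and the host session's cookie lives in ~/.Xauthority (the path varies by display manager):

$ lxc file push ~/.Xauthority mycontainer/home/ubuntu/.Xauthority --uid 1000 --gid 1000
$ lxc config set mycontainer environment.XAUTHORITY /home/ubuntu/.Xauthority

The environment.XAUTHORITY key applies to lxc exec sessions, matching how environment.DISPLAY is set in the config shown earlier in this thread.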