canonical / ubuntu-desktop-installer

Ubuntu Desktop Installer
GNU General Public License v3.0

Noble installer is an empty white window #2391

Closed vanvugt closed 9 months ago

vanvugt commented 9 months ago

What happened?

Noble installer (20231119) is an empty white window:

Screenshot from 2023-11-20 15-14-59

Confirmed on two laptops and in a VM.

What was expected?

That I can see the installer window.

Steps to reproduce

Boot https://cdimage.ubuntu.com/daily-live/20231119/noble-desktop-amd64.iso

DimitryAndric commented 9 months ago

I see this too, on a VMware guest with https://cdimage.ubuntu.com/daily-live/current/noble-desktop-amd64.iso as of 2023-11-20.

It looks like there is an issue with the core22 snap's libc.so.6 not providing a versioned symbol that is required by /lib/x86_64-linux-gnu/libelf.so.1:

ubuntu@ubuntu:~$ ubuntu-desktop-installer

(ubuntu_desktop_installer:4211): Gtk-WARNING **: 16:16:45.513: /usr/lib/x86_64-linux-gnu/gtk-3.0/3.0.0/immodules/im-ibus.so: undefined symbol: ibus_input_context_set_post_process_key_event

(ubuntu_desktop_installer:4211): Gtk-WARNING **: 16:16:45.513: Loading IM context type 'ibus' failed
libGL error: MESA-LOADER: failed to open vmwgfx: /snap/core22/current/lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found (required by /lib/x86_64-linux-gnu/libelf.so.1) (search paths /snap/ubuntu-desktop-installer/1272/usr/lib/x86_64-linux-gnu/dri, suffix _dri)
libGL error: failed to load driver: vmwgfx
libGL error: MESA-LOADER: failed to open swrast: /snap/core22/current/lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found (required by /lib/x86_64-linux-gnu/libelf.so.1) (search paths /snap/ubuntu-desktop-installer/1272/usr/lib/x86_64-linux-gnu/dri, suffix _dri)
libGL error: failed to load driver: swrast

** (ubuntu_desktop_installer:4211): WARNING **: 16:16:47.194: Failed to start Flutter renderer: Unable to create a GL context

I also see a similar error on an arm64 qemu instance:

ubuntu@ubuntu:~$ ubuntu-desktop-installer

(ubuntu_desktop_installer:108493): Gtk-WARNING **: 16:21:33.469: /usr/lib/aarch64-linux-gnu/gtk-3.0/3.0.0/immodules/im-ibus.so: undefined symbol: ibus_input_context_set_post_process_key_event

(ubuntu_desktop_installer:108493): Gtk-WARNING **: 16:21:33.470: Loading IM context type 'ibus' failed
libGL error: MESA-LOADER: failed to open virtio_gpu: /snap/core22/current/lib/aarch64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found (required by /lib/aarch64-linux-gnu/libelf.so.1) (search paths /snap/ubuntu-desktop-installer/1273/usr/lib/aarch64-linux-gnu/dri, suffix _dri)
libGL error: failed to load driver: virtio_gpu
libGL error: MESA-LOADER: failed to open virtio_gpu: /snap/core22/current/lib/aarch64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found (required by /lib/aarch64-linux-gnu/libelf.so.1) (search paths /snap/ubuntu-desktop-installer/1273/usr/lib/aarch64-linux-gnu/dri, suffix _dri)
libGL error: failed to load driver: virtio_gpu
libGL error: MESA-LOADER: failed to open swrast: /snap/core22/current/lib/aarch64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found (required by /lib/aarch64-linux-gnu/libelf.so.1) (search paths /snap/ubuntu-desktop-installer/1273/usr/lib/aarch64-linux-gnu/dri, suffix _dri)
libGL error: failed to load driver: swrast

** (ubuntu_desktop_installer:108493): WARNING **: 16:21:33.485: Failed to start Flutter renderer: Unable to create a GL context

** (ubuntu_desktop_installer:108493): WARNING **: 16:21:58.532: atk-bridge: get_device_events_reply: unknown signature
DimitryAndric commented 9 months ago

It's interesting that libelf.so.1 is loaded from the root filesystem, while libc.so.6 is loaded from the core22 snap. Obviously the libraries in the root filesystem are linked against a newer version of glibc, so that would explain the missing versioned symbol.
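To make the mismatch easier to spot in a long log, the loader error can be pulled apart mechanically. The following is only a sketch: the regular expression is written against the message format shown in the log above, and the helper names `parse_version_error` and `crosses_snap_boundary` are made up for illustration.

```python
import re

# Pattern written against the "version `X' not found (required by Y)" format
# seen in the log above; other loader messages may look different.
VERSION_ERROR = re.compile(
    r"(?P<provider>\S+): version `(?P<version>[^']+)' not found "
    r"\(required by (?P<consumer>[^)]+)\)"
)

def parse_version_error(line):
    """Return (provider, version, consumer), or None if the line does not match."""
    m = VERSION_ERROR.search(line)
    if m is None:
        return None
    return m.group("provider"), m.group("version"), m.group("consumer")

def crosses_snap_boundary(provider, consumer):
    """True when exactly one of the two libraries lives under /snap/."""
    return provider.startswith("/snap/") != consumer.startswith("/snap/")

line = (
    "libGL error: MESA-LOADER: failed to open swrast: "
    "/snap/core22/current/lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.38' "
    "not found (required by /lib/x86_64-linux-gnu/libelf.so.1) "
    "(search paths /snap/ubuntu-desktop-installer/1272/usr/lib/x86_64-linux-gnu/dri, suffix _dri)"
)
provider, version, consumer = parse_version_error(line)
print(provider, version, consumer, crosses_snap_boundary(provider, consumer))
```

Here the library that lacks the symbol version comes from the core22 snap while the library that needs it comes from the root filesystem, which is the mix described above.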

As to how to fix this, I would guess libelf.so.1 might be shipped in the core22 snap? But that would probably have to be reported in whatever bug tracker the core snaps use...

mwhudson commented 9 months ago

The snapped installer must indeed not load any shared objects from the root filesystem. Maybe some more packages need to be staged into the ubuntu-desktop-installer snap? I thought snapcraft would complain about this, but are the objects perhaps being loaded by dlopen?

vanvugt commented 9 months ago

Yes, it must be dlopen, because otherwise the process wouldn't be getting as far as displaying a blank window.
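The reasoning here is that a library loaded with dlopen is resolved while the program is already running, so a failure surfaces as a runtime error instead of stopping the process at startup. A small sketch of that behaviour, using Python's `ctypes.CDLL` (which loads shared objects at runtime, via dlopen on Linux); the library name is deliberately hypothetical and is assumed not to exist:

```python
import ctypes

def try_dlopen(name):
    """Attempt to load a shared object at runtime; return the handle or None."""
    try:
        return ctypes.CDLL(name)
    except OSError as err:
        # The load failed, but the calling process keeps running.
        print(f"could not load {name}: {err}")
        return None

# Hypothetical name, assumed to be absent on the system.
driver = try_dlopen("libhypothetical_missing_dri.so")
if driver is None:
    print("process is still running, but without a usable driver")
```

This mirrors what the log shows: the installer process starts and opens a window, and only the driver load fails afterwards.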

seb128 commented 9 months ago

> The snapped installer must indeed not load any shared objects from the root filesystem. Maybe some more packages need to be staged into the ubuntu-desktop-installer snap? I thought snapcraft would complain about this, but are the objects perhaps being loaded by dlopen?

The snap is classic, which makes it a bit more complicated. It seems the issue there is that the dri drivers are included using 'no-patchelf' (details of why that's needed in https://forum.snapcraft.io/t/caveats-for-no-patchelf-in-a-classic-snap).

A consequence is that the .so files in the dri directory don't get an rpath set to point them to the core version of the libraries:

$ ldd -r /snap/ubuntu-desktop-installer/current/usr/lib/x86_64-linux-gnu/dri/swrast_dri.so
...
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f7c93600000)
...
    libelf.so.1 => /lib/x86_64-linux-gnu/libelf.so.1 (0x00007fb646024000)

It seems it does end up loading the core22 libc6 though (probably because it's already loaded earlier in the process and the installer binary has the correct rpath) but then the libelf1 from the system...
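A check like the `ldd -r` run above can be automated to list every dependency that resolves outside the snap. This is a sketch only: the function name `resolved_outside` is hypothetical, and the sample input is just the two lines quoted above (the elided `...` lines are not reconstructed).

```python
import re

# One "name => path (0xaddress)" line of ldd output.
LDD_LINE = re.compile(r"^\s*(?P<name>\S+) => (?P<path>\S+) \(0x[0-9a-f]+\)$")

def resolved_outside(ldd_output, prefix="/snap/"):
    """Return {soname: path} for dependencies resolved outside `prefix`."""
    outside = {}
    for line in ldd_output.splitlines():
        m = LDD_LINE.match(line)
        if m and not m.group("path").startswith(prefix):
            outside[m.group("name")] = m.group("path")
    return outside

sample = """\
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f7c93600000)
    libelf.so.1 => /lib/x86_64-linux-gnu/libelf.so.1 (0x00007fb646024000)
"""
print(resolved_outside(sample))
```

Any entry in the result is a library that would come from the host system when the driver is loaded on its own.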

kenvandine commented 9 months ago

> libelf1

If so then simply staging libelf1 will fix it.

seb128 commented 9 months ago

> > libelf1
>
> If so then simply staging libelf1 will fix it.

Right, it does, and we should probably do that for now, but it would be better to use the libelf already provided by core22 rather than duplicating it in the installer. Adding /snap/core22/current/usr/lib/x86_64-linux-gnu to LD_LIBRARY_PATH is making the process crash though, for some reason...

seb128 commented 9 months ago

I've now submitted a PR to stage libelf: https://github.com/canonical/ubuntu-desktop-installer/pull/2397

DimitryAndric commented 9 months ago

> > If so then simply staging libelf1 will fix it.
>
> Right, it does, and we should probably do that for now, but it would be better to use the libelf already provided by core22 rather than duplicating it in the installer.

Yes, that is something I also noticed: core22 does contain a libelf.so.1, so why isn't the installer snap using that? Is the rpath in the dri drivers the actual problem? (That said, I guess it is expected that Xorg will try to dlopen the dri drivers.)

seb128 commented 9 months ago

> > > If so then simply staging libelf1 will fix it.
> >
> > Right, it does, and we should probably do that for now, but it would be better to use the libelf already provided by core22 rather than duplicating it in the installer.
>
> Yes, that is something I also noticed: core22 does contain a libelf.so.1, so why isn't the installer snap using that? Is the rpath in the dri drivers the actual problem? (That said, I guess it is expected that Xorg will try to dlopen the dri drivers.)

The issue is that without an rpath set (which snapcraft does automatically, except when the no-patchelf hack needed for the dri drivers is used), the default path resolution applies, which means it tries to load the system version...
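The effect of a missing rpath can be illustrated with a toy model of the library search. This is a deliberately simplified sketch, not how ld.so is implemented: the real loader also distinguishes DT_RPATH from DT_RUNPATH and consults its cache. The `resolve` function, the file set, and the assumption that core22 ships libelf.so.1 under the directory shown are all illustrative.

```python
def resolve(soname, rpath, ld_library_path, default_dirs, existing_files):
    """Return the first path where `soname` is found, or None.

    Simplified search order: rpath entries, then LD_LIBRARY_PATH, then defaults.
    """
    for directory in [*rpath, *ld_library_path, *default_dirs]:
        candidate = f"{directory}/{soname}"
        if candidate in existing_files:
            return candidate
    return None

# Assumed layout: one copy of libelf in core22, one on the host system.
files = {
    "/snap/core22/current/usr/lib/x86_64-linux-gnu/libelf.so.1",
    "/lib/x86_64-linux-gnu/libelf.so.1",
}
core22 = ["/snap/core22/current/usr/lib/x86_64-linux-gnu"]
defaults = ["/lib/x86_64-linux-gnu"]

with_rpath = resolve("libelf.so.1", core22, [], defaults, files)
without_rpath = resolve("libelf.so.1", [], [], defaults, files)
print(with_rpath)
print(without_rpath)
```

With an rpath pointing at core22 the model picks the snap's copy; without one it falls through to the default directories and picks the system copy, which is the situation described for the dri drivers.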

seb128 commented 9 months ago

The fix has landed in stable now.

vanvugt commented 9 months ago

Confirmed fixed in https://cdimage.ubuntu.com/daily-live/20231122/

The installer now gets halfway through before crashing, but compared to this bug it's a big step forward.

DimitryAndric commented 9 months ago

Yep, here too. The installer starts up fine now, at least. It indeed crashes right after filling in a username. :)

seb128 commented 9 months ago

> Confirmed fixed in https://cdimage.ubuntu.com/daily-live/20231122/
>
> The installer now gets halfway through before crashing, but compared to this bug it's a big step forward.

The issue is different from this report, let's use https://bugs.launchpad.net/subiquity/+bug/2044252

Nov 22 10:18:13 ubuntu subiquity_log.1732[4676]: E: Clearsigned file '/var/lib/apt/lists/partial/_cdrom_dists_noble_Release' contains unsigned lines.
avc94 commented 9 months ago

> The issue is different from this report, let's use https://bugs.launchpad.net/subiquity/+bug/2044252
>
> Nov 22 10:18:13 ubuntu subiquity_log.1732[4676]: E: Clearsigned file '/var/lib/apt/lists/partial/_cdrom_dists_noble_Release' contains unsigned lines.

The Launchpad link isn't working for me (page doesn't exist error). P.S. It seems the installer crashes within a few seconds even if I'm not typing a username.

seb128 commented 9 months ago

> > The issue is different from this report, let's use https://bugs.launchpad.net/subiquity/+bug/2044252
>
> The Launchpad link isn't working for me (page doesn't exist error). P.S. It seems the installer crashes within a few seconds even if I'm not typing a username.

The bug is private, but we will use https://bugs.launchpad.net/subiquity/+bug/2044239 instead, which is public. And yes, the issue has to do with the process that is doing the installation in the background and not with the username step.