canonical / ubuntu-frame

Display is not resumed after power cycling monitor #160

Open jrmcpeek opened 9 months ago

jrmcpeek commented 9 months ago

We are running Ubuntu Core 22 on a device with a touchscreen. Currently, we've got the following snap environment:

core22            20230801                864    latest/stable  canonical✓  base
mesa-core22       23.0.4                  231    latest/stable  canonical✓  -
network-manager   1.36.6-8                873    22/stable      canonical✓  -
pc-kernel         5.15.0-89.99.2          1517   22/stable      canonical✓  kernel
snapd             2.60.4                  20290  latest/stable  canonical✓  snapd
ubuntu-frame      118-mir2.15.0           7025   22/stable      canonical✓  -
ubuntu-frame-osk  49-squeekboard-v1.17.1  372    22/stable      canonical✓  -
ubuntu-frame-vnc  45-wayvncv0.6.2         226    22/stable      canonical✓  -

We've received a report from the people who have the device on site. They shut the device off at night, and when they turn it back on roughly 16 hours later, everything boots up as you'd expect, but the touchscreen doesn't work.

While trying to assist them today, this is what I encountered:

I attempted to initiate a VNC session, which connected but only presented me with a blank screen. A few times it transitioned to a greyscale background, but mostly it stayed blank.

Because that did not work, I logged in to the device via SSH. There were no faults logged from our onboard software.

I had the following logged messages from ubuntu-frame-vnc:

Dec 01 05:47:09 machine-g6fn30100eem ubuntu-frame-vnc.daemon[507507]: error marshalling arguments for capture_output (signature nio): null value passed for arg 2
Dec 01 05:47:09 machine-g6fn30100eem ubuntu-frame-vnc.daemon[507507]: Error marshalling request: Invalid argument

I performed a snap restart ubuntu-frame-vnc.

After performing this, I attempted to make a VNC session again, which connected and displayed the screen.

Over VNC, I was able to interact with the buttons in our software, to confirm that our software was not just frozen / ignoring inputs.

Once I confirmed this, I performed a snap restart ubuntu-frame.

The display server restarted and our onboard software came back up. The user logged in again, at which point, the touchscreen functionality worked as expected.

Per their claims, this happens "every morning when they turn it on". In this specific case, though, it clearly was not a cold boot event, so perhaps it was triggered by a snap refresh of ubuntu-frame itself?

In all cases, they are able to work around this issue by pulling the plug on the power cable and plugging it back in.

I have attached the logs related to ubuntu-frame-* here. The touchscreen device is ILITEK-2911.

ubuntu-frame-log.txt

Any thoughts?

AlanGriffiths commented 9 months ago

Hi, the touchscreen issue sounds a lot like https://github.com/MirServer/mir/issues/3149. Could you check whether that problem persists after switching Frame to the 22/candidate channel?
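
For reference, a channel switch is a single snap refresh. The following is only a sketch: switch_channel is a hypothetical wrapper (not part of snapd), and it assumes it is run on the device with sufficient privileges.

```shell
# Sketch only: move a snap to another channel, then show the installed revision.
# switch_channel is a hypothetical helper name; the snap commands are standard.
switch_channel() {
    snap_name="$1"
    channel="$2"
    snap refresh "$snap_name" --channel="$channel" && snap list "$snap_name"
}

# Example (not executed here):
#   switch_channel ubuntu-frame 22/candidate
# To return to the original channel afterwards:
#   switch_channel ubuntu-frame 22/stable
```

Listing the snap after the refresh makes it easy to confirm which revision and channel are actually in use before retesting.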

The issue with VNC sounds unrelated, and isn't something I've encountered. What client software were you using?

jrmcpeek commented 9 months ago

I will check on 22/candidate and let you know.

I'm trying to reproduce locally with 22/beta and if I turn off the monitor, wait 30 seconds, and turn it back on, the display doesn't come back at all.

> What client software were you using?

https://guacamole.apache.org/

jrmcpeek commented 9 months ago

All right. I'm stuck because I cannot get the display to present itself after power cycling the monitor.

It is the same monitor between both environments, both connected via a USB-C to DisplayPort cable.

It's not clear why the remote one cycles properly but the local one does not recover.

Based on the previous go around with a different bug, I've set debug logging via:

snap set ubuntu-frame config="env-hacks=WAYLAND_DEBUG=server"

With the following snaps:

core22            20230801                864    latest/stable  canonical✓  base
mesa-core22       23.0.4                  231    latest/stable  canonical✓  -
network-manager   1.36.6-8                873    22/stable      canonical✓  -
pc-kernel         5.15.0-89.99.2          1517   22/stable      canonical✓  kernel
snapd             2.60.4                  20290  latest/stable  canonical✓  snapd
ubuntu-frame      118-mir2.15.0           7025   22/stable      canonical✓  -
ubuntu-frame-osk  49-squeekboard-v1.17.1  372    22/stable      canonical✓  -
ubuntu-frame-vnc  45-wayvncv0.6.2         226    22/stable      canonical✓  -

ubuntu-frame-stable-log.txt

Key events:

Result:

With the following snaps:

core22            20230801                864    latest/stable  canonical✓  base
mesa-core22       23.0.4                  231    latest/stable  canonical✓  -
network-manager   1.36.6-8                873    22/stable      canonical✓  -
pc-kernel         5.15.0-89.99.2          1517   22/stable      canonical✓  kernel
snapd             2.60.4                  20290  latest/stable  canonical✓  snapd
ubuntu-frame      123-mir2.16.0           7931   22/candidate   canonical✓  -
ubuntu-frame-osk  49-squeekboard-v1.17.1  372    22/stable      canonical✓  -
ubuntu-frame-vnc  45-wayvncv0.6.2         226    22/stable      canonical✓  -

ubuntu-frame-candidate-log.txt

Key events:

Result:

Saviq commented 9 months ago

Hi @jrmcpeek, regarding the monitor problems, can you check the output of drm_info (or graphics-test-tools.drm-info) in each of three states: while the monitor is working, while it is off, and after it is back on but not working?
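
A minimal sketch of collecting those snapshots, assuming the graphics-test-tools snap is installed and that the command may need elevated privileges on the device. The helper names (capture_drm_info, connector_status) and the DRM_INFO_CMD variable are hypothetical; only the drm-info command itself comes from this thread.

```shell
# Command used to dump DRM state; override it if plain drm_info is installed instead.
DRM_INFO_CMD="${DRM_INFO_CMD:-graphics-test-tools.drm-info}"

# Save one snapshot per monitor state, e.g. working, off, not-working.
capture_drm_info() {
    label="$1"
    $DRM_INFO_CMD > "drm-info-$label.txt"
}

# Summarise each connector in a saved dump as "connector <id> <type> <status>".
connector_status() {
    awk '
        /Connector [0-9]+/ { sub(/.*Connector /, ""); id = $1 }
        /Type: /           { sub(/.*Type: /, "");     type = $1 }
        /Status: /         { sub(/.*Status: /, "");   print "connector", id, type, $1 }
    ' "$1"
}

# Example session (not executed here):
#   capture_drm_info working      # monitor on and displaying
#   capture_drm_info off          # monitor switched off
#   capture_drm_info not-working  # monitor back on, no picture
#   diff -u drm-info-working.txt drm-info-not-working.txt
#   connector_status drm-info-not-working.txt
```

The unified diff shows everything that changed between two states, while the connector summary gives a quick view of whether the kernel still reports the monitor as connected.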

jrmcpeek commented 9 months ago

@Saviq - As requested.

Using the following snaps:

core22               20230801                864    latest/stable  canonical✓  base
graphics-test-tools  22.04                   268    22/stable      canonical✓  -
mesa-core22          23.0.4                  231    latest/stable  canonical✓  -
network-manager      1.36.6-8                873    22/stable      canonical✓  -
pc-kernel            5.15.0-89.99.2          1517   22/stable      canonical✓  kernel
snapd                2.60.4                  20290  latest/stable  canonical✓  snapd
ubuntu-frame         118-mir2.15.0           7025   22/stable      canonical✓  -
ubuntu-frame-osk     49-squeekboard-v1.17.1  372    22/stable      canonical✓  -
ubuntu-frame-vnc     45-wayvncv0.6.2         226    22/stable      canonical✓  -

I've attached the following logs:

Surprisingly (to me) the files drm-info-working.txt and drm-info-off.txt are identical.

Here are the differences between drm-info-working.txt and drm-info-not-working.txt:

--- drm-info-working.txt
+++ drm-info-not-working.txt
@@ -111,58 +111,16 @@
 │   ├───Connector 4
 │   │   ├───Object ID: 269
 │   │   ├───Type: DisplayPort
-│   │   ├───Status: connected
-│   │   ├───Physical size: 530x300 mm
-│   │   ├───Subpixel: unknown
+│   │   ├───Status: disconnected
 │   │   ├───Encoders: {12}
-│   │   ├───Modes
-│   │   │   ├───1920x1080@60.00 preferred driver phsync pvsync 
-│   │   │   ├───1920x1080@74.97 driver phsync nvsync 
-│   │   │   ├───1920x1080@60.00 driver phsync pvsync 16:9 
-│   │   │   ├───1920x1080@59.94 driver phsync pvsync 16:9 
-│   │   │   ├───1920x1080@50.00 driver phsync pvsync 16:9 
-│   │   │   ├───1600x1200@70.00 driver phsync pvsync 
-│   │   │   ├───1600x1200@60.00 driver phsync pvsync 
-│   │   │   ├───1680x1050@59.95 driver nhsync pvsync 
-│   │   │   ├───1280x1024@75.03 driver phsync pvsync 
-│   │   │   ├───1280x1024@60.02 driver phsync pvsync 
-│   │   │   ├───1440x900@74.98 driver nhsync pvsync 
-│   │   │   ├───1440x900@59.89 driver nhsync pvsync 
-│   │   │   ├───1366x768@59.79 driver phsync pvsync 
-│   │   │   ├───1280x800@59.81 driver nhsync pvsync 
-│   │   │   ├───1152x864@75.00 driver phsync pvsync 
-│   │   │   ├───1280x768@59.87 driver phsync pvsync 
-│   │   │   ├───1280x720@60.00 driver phsync pvsync 16:9 
-│   │   │   ├───1280x720@59.94 driver phsync pvsync 16:9 
-│   │   │   ├───1280x720@50.00 driver phsync pvsync 16:9 
-│   │   │   ├───1024x768@75.03 driver phsync pvsync 
-│   │   │   ├───1024x768@70.07 driver nhsync nvsync 
-│   │   │   ├───1024x768@60.00 driver nhsync nvsync 
-│   │   │   ├───800x600@75.00 driver phsync pvsync 
-│   │   │   ├───800x600@72.19 driver phsync pvsync 
-│   │   │   ├───800x600@60.32 driver phsync pvsync 
-│   │   │   ├───720x576@50.00 driver nhsync nvsync 16:9 
-│   │   │   ├───720x576@50.00 driver nhsync nvsync 4:3 
-│   │   │   ├───720x480@60.00 driver nhsync nvsync 16:9 
-│   │   │   ├───720x480@60.00 driver nhsync nvsync 4:3 
-│   │   │   ├───720x480@59.94 driver nhsync nvsync 16:9 
-│   │   │   ├───720x480@59.94 driver nhsync nvsync 4:3 
-│   │   │   ├───640x480@75.00 driver nhsync nvsync 
-│   │   │   ├───640x480@72.81 driver nhsync nvsync 
-│   │   │   ├───640x480@66.67 driver nhsync nvsync 
-│   │   │   ├───640x480@60.00 driver nhsync nvsync 4:3 
-│   │   │   ├───640x480@59.94 driver nhsync nvsync 
-│   │   │   ├───640x480@59.94 driver nhsync nvsync 4:3 
-│   │   │   ├───720x400@70.08 driver nhsync pvsync 
-│   │   │   └───640x350@70.10 driver phsync nvsync 
 │   │   └───Properties
-│   │       ├───"EDID" (immutable): blob = 287
+│   │       ├───"EDID" (immutable): blob = 0
 │   │       ├───"DPMS": enum {On, Standby, Suspend, Off} = On
 │   │       ├───"link-status": enum {Good, Bad} = Good
 │   │       ├───"non-desktop" (immutable): range [0, 1] = 0
 │   │       ├───"TILE" (immutable): blob = 0
 │   │       ├───"CRTC_ID" (atomic): object CRTC = 80
-│   │       ├───"subconnector" (immutable): enum {Unknown, VGA, DVI-D, HDMI, DP, Wireless, Native} = Native
+│   │       ├───"subconnector" (immutable): enum {Unknown, VGA, DVI-D, HDMI, DP, Wireless, Native} = Unknown
 │   │       ├───"audio": enum {force-dvi, off, auto, on} = auto
 │   │       ├───"Broadcast RGB": enum {Automatic, Full, Limited 16:235} = Automatic
 │   │       ├───"max bpc": range [6, 12] = 12
@@ -394,14 +352,9 @@
     │   │   └───XVYU16161616 (0x38345658)
     │   └───Properties
     │       ├───"type" (immutable): enum {Overlay, Primary, Cursor} = Primary
-    │       ├───"FB_ID" (atomic): object framebuffer = 290
-    │       │   ├───Object ID: 290
-    │       │   ├───Size: 1920x1080
-    │       │   ├───Pitch: 7680
-    │       │   ├───Bits per pixel: 32
-    │       │   └───Depth: 24
+    │       ├───"FB_ID" (atomic): object framebuffer = 0
     │       ├───"IN_FENCE_FD" (atomic): srange [-1, INT32_MAX] = -1
-    │       ├───"CRTC_ID" (atomic): object CRTC = 80
+    │       ├───"CRTC_ID" (atomic): object CRTC = 0
     │       ├───"CRTC_X" (atomic): srange [INT32_MIN, INT32_MAX] = 0
     │       ├───"CRTC_Y" (atomic): srange [INT32_MIN, INT32_MAX] = 0
     │       ├───"CRTC_W" (atomic): range [0, INT32_MAX] = 1920

And here are the differences between drm-info-working.txt and drm-info-restart-ubuntu-frame.txt:

--- drm-info-working.txt
+++ drm-info-restart-ubuntu-frame.txt
@@ -309,7 +309,7 @@
 │   │   ├───Mode: 1920x1080@59.94 driver phsync pvsync 
 │   │   └───Properties
 │   │       ├───"ACTIVE" (atomic): range [0, 1] = 1
-│   │       ├───"MODE_ID" (atomic): blob = 289
+│   │       ├───"MODE_ID" (atomic): blob = 290
 │   │       │   └───1920x1080@59.94 driver phsync pvsync 
 │   │       ├───"OUT_FENCE_PTR" (atomic): range [0, UINT64_MAX] = 0
 │   │       ├───"VRR_ENABLED": range [0, 1] = 0
@@ -394,8 +394,8 @@
     │   │   └───XVYU16161616 (0x38345658)
     │   └───Properties
     │       ├───"type" (immutable): enum {Overlay, Primary, Cursor} = Primary
-    │       ├───"FB_ID" (atomic): object framebuffer = 290
-    │       │   ├───Object ID: 290
+    │       ├───"FB_ID" (atomic): object framebuffer = 289
+    │       │   ├───Object ID: 289
     │       │   ├───Size: 1920x1080
     │       │   ├───Pitch: 7680
     │       │   ├───Bits per pixel: 32

jrmcpeek commented 9 months ago

Note: Since it sounds like the touchscreen issue I mentioned may already be addressed in 22/candidate, I've renamed this issue to focus on the screen failing to recover after a power cycle.

Once we get to the bottom of this, I should be able to confirm whether the touchscreen scenario is resolved.


For a different look, I loaded a Classic Live USB based on Ubuntu 22.04.3.

This was used on the same hardware and configuration as the captures running Ubuntu Core.

The display worked as expected when turning it off and back on again.

Here are the logs:

And the differences between the files:

--- classic-startup.txt
+++ classic-off.txt
@@ -395,8 +395,8 @@
     │   │   └───XVYU16161616 (0x38345658)
     │   └───Properties
     │       ├───"type" (immutable): enum {Overlay, Primary, Cursor} = Primary
-    │       ├───"FB_ID" (atomic): object framebuffer = 286
-    │       │   ├───Object ID: 286
+    │       ├───"FB_ID" (atomic): object framebuffer = 288
+    │       │   ├───Object ID: 288
     │       │   ├───Size: 1920x1080
     │       │   ├───Pitch: 7680
     │       │   ├───Bits per pixel: 32
--- classic-off.txt
+++ classic-on.txt
@@ -310,15 +310,15 @@
 │   │   ├───Mode: 1920x1080@60.00 phsync pvsync 
 │   │   └───Properties
 │   │       ├───"ACTIVE" (atomic): range [0, 1] = 1
-│   │       ├───"MODE_ID" (atomic): blob = 291
+│   │       ├───"MODE_ID" (atomic): blob = 290
 │   │       │   └───1920x1080@60.00 phsync pvsync 
 │   │       ├───"OUT_FENCE_PTR" (atomic): range [0, UINT64_MAX] = 0
 │   │       ├───"VRR_ENABLED": range [0, 1] = 0
 │   │       ├───"SCALING_FILTER": enum {Default, Nearest Neighbor} = Default
 │   │       ├───"DEGAMMA_LUT": blob = 0
 │   │       ├───"DEGAMMA_LUT_SIZE" (immutable): range [0, UINT32_MAX] = 128
-│   │       ├───"CTM": blob = 292
-│   │       ├───"GAMMA_LUT": blob = 290
+│   │       ├───"CTM": blob = 286
+│   │       ├───"GAMMA_LUT": blob = 291
 │   │       └───"GAMMA_LUT_SIZE" (immutable): range [0, UINT32_MAX] = 1024
 │   ├───CRTC 1
 │   │   ├───Object ID: 131
@@ -395,8 +395,8 @@
     │   │   └───XVYU16161616 (0x38345658)
     │   └───Properties
     │       ├───"type" (immutable): enum {Overlay, Primary, Cursor} = Primary
-    │       ├───"FB_ID" (atomic): object framebuffer = 288
-    │       │   ├───Object ID: 288
+    │       ├───"FB_ID" (atomic): object framebuffer = 292
+    │       │   ├───Object ID: 292
     │       │   ├───Size: 1920x1080
     │       │   ├───Pitch: 7680
     │       │   ├───Bits per pixel: 32
--- classic-startup.txt
+++ classic-on.txt
@@ -310,15 +310,15 @@
 │   │   ├───Mode: 1920x1080@60.00 phsync pvsync 
 │   │   └───Properties
 │   │       ├───"ACTIVE" (atomic): range [0, 1] = 1
-│   │       ├───"MODE_ID" (atomic): blob = 291
+│   │       ├───"MODE_ID" (atomic): blob = 290
 │   │       │   └───1920x1080@60.00 phsync pvsync 
 │   │       ├───"OUT_FENCE_PTR" (atomic): range [0, UINT64_MAX] = 0
 │   │       ├───"VRR_ENABLED": range [0, 1] = 0
 │   │       ├───"SCALING_FILTER": enum {Default, Nearest Neighbor} = Default
 │   │       ├───"DEGAMMA_LUT": blob = 0
 │   │       ├───"DEGAMMA_LUT_SIZE" (immutable): range [0, UINT32_MAX] = 128
-│   │       ├───"CTM": blob = 292
-│   │       ├───"GAMMA_LUT": blob = 290
+│   │       ├───"CTM": blob = 286
+│   │       ├───"GAMMA_LUT": blob = 291
 │   │       └───"GAMMA_LUT_SIZE" (immutable): range [0, UINT32_MAX] = 1024
 │   ├───CRTC 1
 │   │   ├───Object ID: 131
@@ -395,8 +395,8 @@
     │   │   └───XVYU16161616 (0x38345658)
     │   └───Properties
     │       ├───"type" (immutable): enum {Overlay, Primary, Cursor} = Primary
-    │       ├───"FB_ID" (atomic): object framebuffer = 286
-    │       │   ├───Object ID: 286
+    │       ├───"FB_ID" (atomic): object framebuffer = 292
+    │       │   ├───Object ID: 292
     │       │   ├───Size: 1920x1080
     │       │   ├───Pitch: 7680
     │       │   ├───Bits per pixel: 32

Saviq commented 9 months ago

Hi @jrmcpeek, the "not working" results show that something is wrong on a lower level than Frame. The display doesn't get configured properly between the hardware and the kernel.

Not sure what'd be different between the Core deploy and Classic. Is the kernel the same?…

@RAOF any ideas what else could be the culprit?

jrmcpeek commented 9 months ago

@Saviq

> Is the kernel the same?…

Ubuntu Core:

pc-kernel            5.15.0-89.99.2          1517   22/stable      canonical✓  kernel
Linux machine-1234567890 5.15.0-89-generic #99-Ubuntu SMP Mon Oct 30 20:42:41 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Live USB:

Linux ubuntu 6.2.0-26 generic #26-22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Jul 13 16:27:29 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

jrmcpeek commented 9 months ago

To see if the kernel makes a difference, I updated the kernel on the Ubuntu Core test device.

Now using:

pc-kernel            6.2.0.37.38~22.04.15    1516   22-hwe/stable  canonical✓  kernel
Linux machine-1234567890 6.2.0-37-generic #38~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov  2 18:01:13 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

This appears to match the base version that was in use for the Live USB environment.

With this kernel, the monitor powers off and powers on as one would expect, the same as the Classic experience.

This doesn't explain why the device in the field works even with the older kernel and the same hardware, but getting time to remotely collect those logs has proven a challenge.

With that said, with the kernel at 22-hwe/stable, I confirmed:

So, this issue is indeed resolved by the fixes mentioned by Alan.

Is there a target timeline for the promotion of 22/candidate to 22/stable for the mir2.16 updates?


For the kernel version, I think we're okay changing the default configuration for the devices from 22/stable to 22-hwe/stable.

Is there any value at this point trying to get logging information from the device that is working?

AlanGriffiths commented 9 months ago

> Is there a target timeline for the promotion of 22/candidate to 22/stable for the mir2.16 updates?

Here's the announcement: https://forum.snapcraft.io/t/call-for-testing-ubuntu-frame-mir-kiosk/37963

We'll announce the promotion there. (Given the holiday season, "we aim to promote to stable in two weeks" may well become early January.)