raspberrypi / firmware

This repository contains pre-compiled binaries of the current Raspberry Pi kernel and modules, userspace libraries, and bootloader/GPU firmware.

RPi4B/Kodi goes to black screen with vc4-kms-v3d after commit 19272ccd69049aaf42c78a235a0bf37dbabd5ea7 #1648

Open mkreisl opened 2 years ago

mkreisl commented 2 years ago

This commit leads to a black screen (TV says no signal) after Kodi is started. This occurs with all tested kernel versions (5.10.60 - 5.10.75 and 5.14.14) and with both Kodi 19 and Kodi 20.

The following messages appear in dmesg:

[  113.892587] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
[  113.892713] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:76:crtc-3] commit wait timed out
[  124.128612] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
[  124.128920] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:91:plane-6] commit wait timed out
[  134.368625] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
[  134.368995] vc4-drm gpu: [drm] *ERROR* Timed out waiting for commit
[  144.608695] [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:76:crtc-3] flip_done timed out
[  154.848659] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
[  154.849120] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:76:crtc-3] commit wait timed out
[  165.088657] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
[  165.089117] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:91:plane-6] commit wait timed out
[  175.328649] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
[  175.329113] vc4-drm gpu: [drm] *ERROR* Timed out waiting for commit
[  185.568601] [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:76:crtc-3] flip_done timed out
[  188.481021] init: passwd-change main process (5796) terminated with status 1
[  190.432635] cec-vc4: message ff 84 20 00 06 timed out
[  192.736595] cec-vc4: message ff 87 00 15 82 timed out
[  195.040635] cec-vc4: message f0 timed out
[  197.344605] cec-vc4: message f0 timed out
[  199.136617] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
[  199.136926] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:76:crtc-3] commit wait timed out
[  199.648658] cec-vc4: message f0 timed out
[  201.952655] cec-vc4: message f0 timed out
[  204.256584] cec-vc4: message 11 timed out
[  206.560662] cec-vc4: message 11 timed out
[  208.864603] cec-vc4: message 11 timed out
[  209.376644] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
[  209.376953] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:91:plane-6] commit wait timed out
[  211.168584] cec-vc4: message 11 timed out
[  213.472618] cec-vc4: message 22 timed out
[  215.776619] cec-vc4: message 22 timed out
[  218.080648] cec-vc4: message 99 timed out
[  219.616604] [drm:drm_crtc_commit_wait [drm]] *ERROR* flip_done timed out
[  219.616874] vc4-drm gpu: [drm] *ERROR* Timed out waiting for commit
[  220.384652] cec-vc4: message 99 timed out
[  222.688572] cec-vc4: message ff 84 20 00 01 timed out
[  224.992583] cec-vc4: message ff 87 00 15 82 timed out
[  227.296649] cec-vc4: message 10 timed out
[  229.600630] cec-vc4: message 10 timed out
[  229.856628] [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:76:crtc-3] flip_done timed out
[  231.904602] cec-vc4: message 10 timed out
[  234.208649] cec-vc4: message 10 timed out
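When triaging a log like the one above, it can help to tally which DRM objects are timing out. The following is a minimal Python sketch and not part of the original report. The function name `summarize_timeouts` and both regular expressions are assumptions based only on the message formats visible in this log.

```python
import re
from collections import Counter

# Matches the bracketed DRM object in lines such as
# "... *ERROR* [CRTC:76:crtc-3] commit wait timed out".
DRM_OBJECT = re.compile(r"\*ERROR\* \[(CRTC|PLANE):(\d+):([^\]]+)\]")
# Matches CEC transmit timeouts such as "cec-vc4: message f0 timed out".
CEC_TIMEOUT = re.compile(r"cec-vc4: message [0-9a-f ]+ timed out")


def summarize_timeouts(dmesg_text):
    """Count timeout errors per DRM object, plus CEC message timeouts."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        if "timed out" not in line.lower():
            continue  # not a timeout message at all
        match = DRM_OBJECT.search(line)
        if match:
            kind, obj_id, name = match.groups()
            counts[f"{kind}:{obj_id}:{name}"] += 1
        elif CEC_TIMEOUT.search(line):
            counts["cec"] += 1
        else:
            # Timeouts that name no object, e.g. "flip_done timed out".
            counts["other"] += 1
    return counts
```

Feeding it the output of `dmesg` shows at a glance whether the timeouts cluster on a single CRTC or plane, as they appear to in this report.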

If I use vc4-fkms-v3d (which I have always used until now), the problem no longer occurs.

To be clear, this concerns the firmware files in /boot, not the kernel itself.

The problem does not occur with the commit immediately before the one above, so it must be caused by the commit "arm_loader: Add message to release firmware framebuffer" (19272ccd69049aaf42c78a235a0bf37dbabd5ea7).

My /boot/config is

initramfs initramfs.gz  followkernel
gpu_mem=224
[pi4]
gpu_mem_1024=70
[all]
initial_turbo=1
hdmi_force_hotplug=0
hdmi_ignore_hotplug=0
hdmi_ignore_cec_init=1
hdmi_ignore_cec=0
disable_overscan=1
disable_splash=1
max_usb_current=1
dtparam=audio=on
dtoverlay=gpio-ir
[pi4]
dtoverlay=vc4-kms-v3d,cma-384
dtoverlay=rpivid-v4l2
max_framebuffers=2
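The configuration above uses conditional filters: lines after `[pi4]` apply only on a Pi 4, and `[all]` resets the filter. The following is a minimal Python sketch, not from the thread, of how to see which lines take effect on a given model. The function name `effective_settings` is hypothetical, and the sketch only understands `[all]` and simple model filters such as `[pi4]`; the real firmware supports more filter types and combines them differently.

```python
def effective_settings(config_text, model="pi4"):
    """Return the config lines that apply to `model`.

    Only the [all] filter and simple model filters such as [pi4] are
    handled; every other filter type is treated as non-matching.
    """
    active = True  # lines before the first filter apply to every model
    result = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1].lower()
            active = name in ("all", model)
            continue
        if active:
            result.append(line)
    return result
```

Run against this configuration with `model="pi4"`, the result includes `dtoverlay=vc4-kms-v3d,cma-384`, which is the KMS overlay the rest of the thread discusses.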

popcornmix commented 2 years ago

That commit did have a later fix:

1e494f150046bd04331ab0b19e4e5322eb7b7773 firmware: arm_loader: Allow hvs interrupt during SET_NOTIFY_DISPLAY_DONE

Does the problem still occur with that? If it does, can you post your edid, and I'll see if I can reproduce.

mkreisl commented 2 years ago

How much later is the fix? Of course, I also tried the latest version about 2-3 days ago and got the same error. Checking out a commit from the end of July didn't get me anywhere either.

Ah, I guess you mean this one: https://github.com/raspberrypi/firmware/commit/5d61ab70ad5dbdc2ac52b5b222ce1e5b6af49662 I will try this, but I don't have much hope

In the meantime I have already read out the EDID data:

root@kmxbilr2 ~ # tvservice -d /tmp/edid.dat
Written 256 bytes to /tmp/edid.dat
root@kmxbilr2 ~ # edidparser -4 /tmp/edid.dat
Enabling fuzzy format match...
Parsing /tmp/edid.dat...
HDMI:EDID version 1.3, 1 extensions, screen size 121x68 cm
HDMI:EDID features - videodef 0x80 !standby !suspend active off; colour encoding:RGB444|YCbCr422; sRGB is not default colourspace; preferred format is native; does not support GTF
HDMI:EDID found monitor range descriptor tag 0xfd
HDMI:EDID monitor range offsets: V min=0, V max=0, H min=0, H max=0
HDMI:EDID monitor range: vertical is 49-76 Hz, horizontal is 15-81 kHz, max pixel clock is 80 MHz
HDMI:EDID monitor range does not support GTF
HDMI:EDID found monitor name descriptor tag 0xfc
HDMI:EDID monitor name is GRUNDIG_WUXGA
HDMI:EDID found preferred CEA detail timing format: 1920x1080p @ 50 Hz (31)
HDMI:EDID found CEA detail timing format: 1920x1080p @ 60 Hz (16)
HDMI:EDID established timing I/II bytes are 27 CF 00
HDMI:EDID found DMT format: code 4, 640x480p @ 60 Hz in established timing I/II
HDMI:EDID found DMT format: code 6, 640x480p @ 75 Hz in established timing I/II
HDMI:EDID found DMT format: code 8, 800x600p @ 56 Hz in established timing I/II
HDMI:EDID found DMT format: code 9, 800x600p @ 60 Hz in established timing I/II
HDMI:EDID found DMT format: code 10, 800x600p @ 72 Hz in established timing I/II
HDMI:EDID found DMT format: code 11, 800x600p @ 75 Hz in established timing I/II
HDMI:EDID found DMT format: code 16, 1024x768p @ 60 Hz in established timing I/II
HDMI:EDID found DMT format: code 17, 1024x768p @ 70 Hz in established timing I/II
HDMI:EDID found DMT format: code 18, 1024x768p @ 75 Hz in established timing I/II
HDMI:EDID found DMT format: code 36, 1280x1024p @ 75 Hz in established timing I/II
HDMI:EDID standard timings block x 8: 0x0101 0101 8140 8180 818F 8B00 0101 A940 
HDMI:EDID found DMT format: code 32, 1280x960p @ 60 Hz (4:3) in standard timing 2
HDMI:EDID found DMT format: code 35, 1280x1024p @ 60 Hz (5:4) in standard timing 3
HDMI:EDID found DMT format: code 36, 1280x1024p @ 75 Hz (5:4) in standard timing 4
HDMI:EDID unknown standard timing 1360x850 @ 60 Hz aspect ratio (16:10)
HDMI:EDID found DMT format: code 51, 1600x1200p @ 60 Hz (4:3) in standard timing 7
HDMI:EDID parsing v3 CEA extension 0
HDMI:EDID monitor support - underscan IT formats:no, basic audio:yes, yuv444:yes, yuv422:yes, #native DTD:2
HDMI:EDID failed to find a matching detail format for 1360x768p hfp:208 hs:136 hbp:72 vfp:22 vs:5 vbp:3 pixel clock:84 MHz
HDMI:EDID calculated refresh rate is 60 Hz
HDMI:EDID guessing the format to be 1360x768p @60 Hz
HDMI:EDID found DMT detail timing format: 1360x768p @ 60 Hz (39)
HDMI:EDID found DMT detail timing format: 1280x768p @ 60 Hz (23)
HDMI:EDID found DMT detail timing format: 1440x900p @ 60 Hz (47)
HDMI:EDID found DMT detail timing format: 1400x1050p @ 60 Hz (42)
HDMI:EDID found CEA format: code 20, 1920x1080i @ 50Hz 
HDMI:EDID found CEA format: code 5, 1920x1080i @ 60Hz 
HDMI:EDID found CEA format: code 19, 1280x720p @ 50Hz 
HDMI:EDID found CEA format: code 4, 1280x720p @ 60Hz 
HDMI:EDID found CEA format: code 18, 720x576p @ 50Hz 
HDMI:EDID found CEA format: code 3, 720x480p @ 60Hz 
HDMI:EDID found CEA format: code 17, 720x576p @ 50Hz 
HDMI:EDID found CEA format: code 2, 720x480p @ 60Hz 
HDMI:EDID found CEA format: code 22, 1440x576i @ 50Hz 
HDMI:EDID found CEA format: code 7, 1440x480i @ 60Hz 
HDMI:EDID found CEA format: code 21, 1440x576i @ 50Hz 
HDMI:EDID found CEA format: code 6, 1440x480i @ 60Hz 
HDMI:EDID found CEA format: code 1, 640x480p @ 60Hz 
HDMI:EDID found CEA format: code 31, 1920x1080p @ 50Hz (native)
HDMI:EDID found CEA format: code 16, 1920x1080p @ 60Hz (native)
HDMI:EDID found CEA format: code 38, 2880x576p @ 50Hz 
HDMI:EDID found audio format 6 channels AC3, sample rate: 32|44|48 kHz, bitrate: 640 kbps
HDMI:EDID found audio format 2 channels PCM, sample rate: 32|44|48 kHz, sample size: 16|20|24 bits
HDMI:EDID found HDMI VSDB length 14
HDMI:EDID HDMI VSDB has physical address 2.0.0.0
HDMI:EDID HDMI VSDB supports AI:yes, dual link DVI:no
HDMI:EDID HDMI VSDB deep colour support - 48-bit:no 36-bit:no 30-bit:no DC_yuv444:no
HDMI:EDID HDMI VSDB max TMDS clock 225 MHz
HDMI:EDID HDMI VSDB does not support content type
HDMI:EDID HDMI VSDB supports 3D formats
HDMI:EDID HDMI VSDB 3D_Mask: 0x073f
HDMI:EDID HDMI VSDB 3D_Structure_All: 0x0141
HDMI:EDID adding mandatory support for CEA (32) 1920x1080p @ 24Hz
HDMI:EDID VSDB 3D legend:
FP=frame packing, F-Alt=Field Alternative, L-Alt=Line Alternative
SbS-Full=Side by Side Full, Ldep=L Depth, Ldep+Gfx=L Depth + Graphics Depth
TopBot=Top and Bottom, SbS-HH=Side by Side half horizontal
SbS-OLOR=Side by Side odd left odd right, SbS-OLER=Side by Side odd left even right
SbS-ELOR=Side by Side even left odd right, SbS-ELER=Side by Side even left even right
HDMI:EDID CEA (32) 1920x1080p 24Hz 3D supports: FP|TopBot
HDMI:EDID CEA (4) 1280x720p 60Hz 3D supports: FP|TopBot|SbS-HH
HDMI:EDID CEA (5) 1920x1080i 60Hz 3D supports: FP|TopBot|SbS-HH
HDMI:EDID CEA (19) 1280x720p 50Hz 3D supports: FP|TopBot|SbS-HH
HDMI:EDID CEA (20) 1920x1080i 50Hz 3D supports: FP|TopBot|SbS-HH
HDMI:EDID CEA (18) 720x576p 50Hz 3D supports: FP|TopBot|SbS-HH
HDMI:EDID CEA (3) 720x480p 60Hz 3D supports: FP|TopBot|SbS-HH
HDMI:EDID CEA (17) 720x576p 50Hz 3D supports: none
HDMI:EDID CEA (2) 720x480p 60Hz 3D supports: none
HDMI:EDID CEA (22) 1440x576i 50Hz 3D supports: FP|TopBot|SbS-HH
HDMI:EDID CEA (7) 1440x480i 60Hz 3D supports: FP|TopBot|SbS-HH
HDMI:EDID CEA (21) 1440x576i 50Hz 3D supports: FP|TopBot|SbS-HH
HDMI:EDID CEA (6) 1440x480i 60Hz 3D supports: none
HDMI:EDID CEA (1) 640x480p 60Hz 3D supports: none
HDMI:EDID CEA (31) 1920x1080p 50Hz 3D supports: none
HDMI:EDID CEA (16) 1920x1080p 60Hz 3D supports: none
HDMI:EDID CEA (38) 2880x576p 50Hz 3D supports: none
HDMI:EDID filtering formats with pixel clock unlimited MHz or h. blanking unlimited
HDMI:EDID best score mode initialised to CEA (1) 640x480p @ 60 Hz with pixel clock -1225683547 MHz (score 25)
HDMI:EDID best score mode is now CEA (1) 640x480p @ 60 Hz with pixel clock 25 MHz (score 61864)
HDMI:EDID best score mode is now CEA (2) 720x480p @ 60 Hz with pixel clock 27 MHz (score 66472)
HDMI:EDID CEA mode (3) 720x480p @ 60 Hz with pixel clock 27 MHz has a score of 66472
HDMI:EDID best score mode is now CEA (4) 1280x720p @ 60 Hz with pixel clock 74 MHz (score 135592)
HDMI:EDID DMT mode (4) 640x480p @ 60 Hz with pixel clock 25 MHz has a score of 18432
HDMI:EDID best score mode is now CEA (5) 1920x1080i @ 60 Hz with pixel clock 74 MHz (score 149416)
HDMI:EDID CEA mode (6) 1440x480i @ 60 Hz with pixel clock 27 MHz has a score of 45736
HDMI:EDID DMT mode (6) 640x480p @ 75 Hz with pixel clock 31 MHz has a score of 5760
HDMI:EDID CEA mode (7) 1440x480i @ 60 Hz with pixel clock 27 MHz has a score of 45736
HDMI:EDID DMT mode (8) 800x600p @ 56 Hz with pixel clock 36 MHz has a score of 26880
HDMI:EDID DMT mode (9) 800x600p @ 60 Hz with pixel clock 40 MHz has a score of 28800
HDMI:EDID DMT mode (10) 800x600p @ 72 Hz with pixel clock 50 MHz has a score of 8640
HDMI:EDID DMT mode (11) 800x600p @ 75 Hz with pixel clock 49 MHz has a score of 9000
HDMI:EDID best score mode is now CEA (16) 1920x1080p @ 60 Hz with pixel clock 148 MHz (score 398248)
HDMI:EDID DMT mode (16) 1024x768p @ 60 Hz with pixel clock 65 MHz has a score of 47185
HDMI:EDID CEA mode (17) 720x576p @ 50 Hz with pixel clock 27 MHz has a score of 66472
HDMI:EDID DMT mode (17) 1024x768p @ 70 Hz with pixel clock 75 MHz has a score of 13762
HDMI:EDID CEA mode (18) 720x576p @ 50 Hz with pixel clock 27 MHz has a score of 66472
HDMI:EDID DMT mode (18) 1024x768p @ 75 Hz with pixel clock 78 MHz has a score of 14745
HDMI:EDID CEA mode (19) 1280x720p @ 50 Hz with pixel clock 74 MHz has a score of 117160
HDMI:EDID CEA mode (20) 1920x1080i @ 50 Hz with pixel clock 74 MHz has a score of 128680
HDMI:EDID CEA mode (21) 1440x576i @ 50 Hz with pixel clock 27 MHz has a score of 45736
HDMI:EDID CEA mode (22) 1440x576i @ 50 Hz with pixel clock 27 MHz has a score of 45736
HDMI:EDID DMT mode (23) 1280x768p @ 60 Hz with pixel clock 79 MHz has a score of 58982
HDMI:EDID best score mode is now CEA (31) 1920x1080p @ 50 Hz with pixel clock 148 MHz (score 5336040)
HDMI:EDID CEA mode (32) 1920x1080p @ 24 Hz with pixel clock 74 MHz has a score of 124532
HDMI:EDID DMT mode (32) 1280x960p @ 60 Hz with pixel clock 108 MHz has a score of 98728
HDMI:EDID DMT mode (35) 1280x1024p @ 60 Hz with pixel clock 108 MHz has a score of 103643
HDMI:EDID DMT mode (36) 1280x1024p @ 75 Hz with pixel clock 135 MHz has a score of 49576
HDMI:EDID CEA mode (38) 2880x576p @ 50 Hz with pixel clock 108 MHz has a score of 66472
HDMI:EDID DMT mode (39) 1360x768p @ 60 Hz with pixel clock 85 MHz has a score of 62668
HDMI:EDID DMT mode (42) 1400x1050p @ 60 Hz with pixel clock 121 MHz has a score of 88200
HDMI:EDID DMT mode (47) 1440x900p @ 60 Hz with pixel clock 106 MHz has a score of 77760
HDMI:EDID DMT mode (51) 1600x1200p @ 60 Hz with pixel clock 162 MHz has a score of 140200
HDMI0:EDID preferred mode remained as CEA (31) 1920x1080p @ 50 Hz with pixel clock 148 MHz
HDMI:EDID has HDMI support and audio support
edidparser exited with code 0
root@kmxbilr2 ~ # tvservice -s
state 0x120009 [HDMI CEA (31) RGB lim 16:9], 1920x1080 @ 50.00Hz, progressive
root@kmxbilr2 ~ # 

HTH

mkreisl commented 2 years ago

Here again is the complete dmesg output with the current firmware from today. The problem already seems to start when the audio device is initialized: http://sprunge.us/hQ6VTS

root@kmxbilr2 /tmp # vcgencmd version
Nov  8 2021 18:47:30 
Copyright (c) 2012 Broadcom
version 4f73dcaefcfd5b20317e44a81d10e9d74fd3dffe (clean) (release) (start)
root@kmxbilr2 /tmp # 

The firmware that still works looks like this: http://sprunge.us/R86fCs, but the kernel warning puzzles me a little.

However, I see that the audio error messages are there too, but only this one.

popcornmix commented 2 years ago

I'd like the raw edid. e.g. paste output of:

base64 /sys/devices/platform/gpu/drm/card1/card1-HDMI-A-1/edid

mkreisl commented 2 years ago

No problem

AP///////wAeVUhEAQAAACUWAQOAeUR4KuxroFZHnCUQSEonzwABAQEBgUCBgIGPiwABAalAAjqA
0HI4LUAQLEWAoFoAAAAeAjqAGHE4LUBYLEUAoFoAAAAeAAAA/QAxTA9RCAAKICAgICAgAAAA/ABH
UlVORElHIFdVWEdBAcMCAytyUBQFEwQSAxECFgcVBgGfkCYmFQdQCQcHbgMMACAAgC0g0AQBQQc/
GyFQoFEAHjDQiGUEoFoAAAAcDh8AgFEAHjDAgEcEoFoAAAAcaCmg0FGEIjDomJYEoJAAAAAcjy94
0FEaJ0DokAQIoJAAAAAcAAAAAAAAAAAAAAAA9A==
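A base64 dump like the one above can be sanity-checked before anyone tries to reproduce with it. The following is a minimal Python sketch, not from the thread. The function name `inspect_edid` is hypothetical, and it checks only a few fields of the 128-byte base block: the fixed 8-byte header, the block checksum, the extension count at byte 126, and the monitor name descriptor (tag 0xFC).

```python
import base64

EDID_HEADER = bytes.fromhex("00ffffffffffff00")


def inspect_edid(b64_text):
    """Decode a base64 EDID dump and report a few basic fields."""
    # Remove line breaks and spaces before decoding.
    data = base64.b64decode("".join(b64_text.split()))
    base = data[:128]
    info = {
        "length": len(data),
        "header_ok": base[:8] == EDID_HEADER,
        # All 128 bytes of a block must sum to 0 modulo 256.
        "checksum_ok": len(base) == 128 and sum(base) % 256 == 0,
        "extensions": base[126] if len(base) == 128 else None,
        "monitor_name": None,
    }
    # Four 18-byte descriptors start at offset 54; tag 0xFC is the name.
    for offset in range(54, 126, 18):
        desc = base[offset:offset + 18]
        if len(desc) == 18 and desc[:3] == b"\x00\x00\x00" and desc[3] == 0xFC:
            text = desc[5:18].split(b"\n")[0]
            info["monitor_name"] = text.decode("ascii", "replace").strip()
    return info
```

An empty or truncated dump shows up as `header_ok` or `checksum_ok` being false, which separates "the EDID could not be read" from "the EDID was read but the mode selection went wrong".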

gradientskier commented 2 years ago

Can this be related?

https://beta.raspberrypi.org/forums/viewtopic.php?p=1938658#p1938658

pelwell commented 2 years ago

What effect does changing dtoverlay=vc4-kms-v3d to dtoverlay=vc4-fkms-v3d have?

gradientskier commented 2 years ago

What effect does changing dtoverlay=vc4-kms-v3d to dtoverlay=vc4-fkms-v3d have?

It works for me! Is this the official solution, or will there be an update fixing this issue?

popcornmix commented 2 years ago

We'd like to fix it but so far haven't reproduced it. @mkreisl, using your edid and config.txt (excluding the initramfs part), I don't see the issue with the latest firmware/kernel. I get 1920x1080@50.00 as the default mode with that edid. Can you post /proc/cmdline?

mkreisl commented 2 years ago

@popcornmix I tried to narrow the problem down further yesterday and found what I was looking for. I still use old commits in the Kodi sources that use the PI:HDMI audio device, and initializing it now fails. It is still serious that this breaks the HDMI output completely, so I would still rate it as a bug in the firmware.

This is my cmdline:

coherent_pool=1M 8250.nr_uarts=0 snd_bcm2835.enable_compat_alsa=0 snd_bcm2835.enable_hdmi=1 video=HDMI-A-1:1920x1080M@50 smsc95xx.macaddr=DC:A6:32:09:2D:5F vc_mem.mem_base=0x3ec00000 vc_mem.mem_size=0x40000000 telnet zswap.enabled=1 zswap.compressor=lz4 console=tty1 partswap cnet=bond0 root=iSCSI=iqn.2017-12.com.kmhome:kmxbilr2,192.168.1.6:3260,UUID=29e1091f-1092-4787-9748-271d9a670f41 rootflags=subvol=root/@,autodefrag,compress=none rootfstype=btrfs rootwait logo.nologo quiet noswap loglevel=0 startevent=mountall selinux=0 splash nohdparm net.ifnames=0 biosdevname=0 --startup-event mountall

popcornmix commented 2 years ago

Using PI:HDMI (i.e. firmware driving hdmi audio) together with vc4-kms-v3d (i.e. arm driving hdmi hardware) is not a supported configuration. Both ends can access hdmi registers without knowledge of what the other side is doing, and crashes are not unexpected.

We should perhaps try harder to reject use of the firmware hdmi audio driver when kms is enabled.

mkreisl commented 2 years ago

Using PI:HDMI (i.e. firmware driving hdmi audio) together with vc4-kms-v3d (i.e. arm driving hdmi hardware) is not a supported configuration. Both ends can access hdmi registers without knowledge of what the other side is doing, and crashes are not unexpected.

We should perhaps try harder to reject use of the firmware hdmi audio driver when kms is enabled.

I know that and I have also read several posts about how this should be prevented. But unfortunately that does not seem to work properly yet

Brunnis commented 2 years ago

Maybe this issue is related to the problems I am having: https://forums.raspberrypi.com/viewtopic.php?p=1944172#p1943655

I am seeing this on Lakka, which is based on LibreELEC. I did not have this issue on 5.10.39, but had it sporadically (maybe 40% of boots) on 5.10.63 and 100% of the time on 5.15.0. So far, I've only seen it happen on my rev 1.4 Pi 4 and only on my 4K TV (LG OLED65CX). Changing from KMS to FKMS makes the issue disappear. See the linked post for a full description and log printouts.

For me, reproducing it is as simple as putting the image below (Lakka development image with kernel 5.15.0) on an SD card and booting it on a Pi 4 4GB rev 1.4 board on the LG OLED65CX.

https://nightly.builds.lakka.tv/members/vudiq/Lakka-LE-master/RPi4.aarch64/Lakka-RPi4.aarch64-LE-master-devel-20211121222554-02470fc.img.gz

Brunnis commented 2 years ago

Sorry for slightly misleading information earlier. The sporadic nature of this issue and my lack of time for testing led me to make some incorrect conclusions. The rev 1.1 and the rev 1.4 boards seem about equally affected. Both have major issues with 4K when testing with the 5.10.63 and 5.15.0 kernels on Lakka. 5.10.39 has no issues on any of the boards, whether running 4K or 1080P. Below are the full test results. Each config was booted five times and the result recorded. Each boot was a cold boot.

[image: table of test results, not preserved]

popcornmix commented 2 years ago

@Brunnis are you able to narrow it down further? Are you purely changing the kernel between tests (and not firmware or other parts of system)?

Brunnis commented 2 years ago

@popcornmix Actually, I am just switching between the Lakka images, so not changing the system from defaults in any way so far. Can you suggest any steps/commands that would help in narrowing it down?

popcornmix commented 2 years ago

Easiest to reproduce with RPiOS bullseye (then we'll know what you are running and have a better chance of reproducing). If it's a kernel issue you should see the same issue on RPiOS.

From there you can identify the exact update which caused this. See: https://github.com/raspberrypi/rpi-firmware/commits/master

If you click on each commit the end of the url contains a git hash. Run sudo rpi-update <hash> to revert back to that version. Report the first version with the error.
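Finding the first bad version by hand goes faster as a binary search over the commit list. The following is a minimal Python sketch, not from the thread. The function name `first_bad` and the `is_bad` callback are hypothetical. In practice `is_bad` would be a manual step: run `sudo rpi-update <hash>`, reboot, and note whether the fault appears. The sketch assumes the list is ordered oldest to newest and that the fault, once introduced, persists in every later commit.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is true, or None.

    `commits` must be ordered oldest to newest, and the sketch assumes
    that once the fault appears it stays in every later commit.
    """
    lo, hi = 0, len(commits)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the fault is at mid or earlier
        else:
            lo = mid + 1  # the fault was introduced after mid
    return commits[lo] if lo < len(commits) else None
```

For a sporadic fault like the one reported here, each `is_bad` check should cover several cold boots, otherwise a lucky boot can send the search in the wrong direction.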

Brunnis commented 2 years ago

I've run a bunch of reboots using Bullseye lite 32-bit, but cannot reproduce the issue there. Lakka is 64-bit, but both use the same kernel (5.10.63). The output of vcgencmd version is:

Lakka 3.6 64-bit: Oct 29 2021 10:48:24 Copyright (c) 2012 Broadcom version b8a114e5a9877e91ca8f26d1a5ce904b2ad3cf13 (clean) (release) (start_x)

Bullseye 32-bit lite: Oct 29 2021 10:47:33 Copyright (c) 2012 Broadcom version b8a114e5a9877e91ca8f26d1a5ce904b2ad3cf13 (clean) (release) (start)

Last week I tried Bullseye 64-bit desktop and that also seemed to work fine. Don't really know how to proceed, to be honest.

popcornmix commented 2 years ago

It seems lakka is using start_x. You can add start_x=1 to bullseye config.txt just in case that makes a difference (I wouldn't expect it to).

Are they both using dtoverlay=vc4-kms-v3d driver?

But if the issue doesn't occur with RPiOS bullseye, then there's not much we can do. We don't really know what the differences are with lakka.

pelwell commented 2 years ago

It's probably worth checking that Lakka includes overlay_map.dtb in /boot (or wherever they mount the boot partition). You can also see if sudo vcdbg log msg shows any errors from the firmware (if the utility is installed).

Brunnis commented 2 years ago

@popcornmix start_x=1 made no difference. Both are indeed using vc4-kms-v3d.

@pelwell Lakka does not include overlay_map.dtb. Is that an issue?

pelwell commented 2 years ago

It would mean the usual config.txt file is loading the Pi 0-3 version of the kms overlay. Try changing the line to dtoverlay=vc4-kms-v3d-pi4.

pelwell commented 2 years ago

Ah - I should have said to check in the overlays subdirectory.

Brunnis commented 2 years ago

Ahh, Lakka is actually using dtoverlay=vc4-kms-v3d-pi4 by default. Sorry for overlooking that earlier.

Regarding the overlays subdirectory, that was actually where I looked, since I figured that would be the correct place. 😊

I'll let you know if I get anywhere with further testing.

Brunnis commented 2 years ago

I found a successful workaround for the no HDMI issue that at least works on Lakka 3.6 with the 5.10.63 kernel: Add "video=HDMI-A-1:1920x1080M@60" to cmdline.txt. So, it seems something sporadically fails (more than 50% of boots) when trying to use 4K 30Hz. I don't know yet if this particularly affects the LG OLED CX, as I don't have any more 4K TVs around. I actually believe I tested this workaround early on in my testing of kernel 5.15.0 and it didn't help there, but I need to confirm that.

Doubtful if this is related to the original issue described here, but I thought I'd let you know of this slight progress.

david-barbion commented 2 years ago

I also have the black screen problem. However, when using a plain micro hdmi to hdmi cable all is fine. When using a micro hdmi adapter and a standard hdmi cable, it fails with a black screen.

Tested on Raspbian bullseye and Recalbox 8.0.0 (both of them use kms by default)

Going back to fkms makes both cables work, so this does not look like a cable problem.

popcornmix commented 2 years ago

However, when using a plain micro hdmi to hdmi cable all is fine. When using a micro hdmi adapter and a standard hdmi cable, it fails with a black screen.

Sounds a bit like the adapter is breaking something (possibly reading the edid). In each case can you show output of one of:

base64 /sys/devices/platform/gpu/drm/card0/card0-HDMI-A-1/edid
base64 /sys/devices/platform/gpu/drm/card1/card1-HDMI-A-1/edid

(one will report missing file, the other should show something).

Do this when using the kms driver, from ssh if you have blank screen.
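Because the card number differs between systems, the check can be written to look under every card. The following is a minimal Python sketch, not from the thread. The function name `find_edids` is hypothetical, and the default root is taken from the paths quoted above. It reports how many bytes each HDMI connector's `edid` file returns.

```python
import glob
import os


def find_edids(drm_root="/sys/devices/platform/gpu/drm"):
    """Map each HDMI connector's edid file to the number of bytes read."""
    pattern = os.path.join(drm_root, "card*", "card*-HDMI-A-*", "edid")
    sizes = {}
    for path in sorted(glob.glob(pattern)):
        with open(path, "rb") as handle:
            sizes[path] = len(handle.read())
    return sizes
```

A size of 0 for the connector in use means the kernel has no EDID for it, which is the situation the base64 commands are meant to reveal.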

david-barbion commented 2 years ago

With the adapter plugged-in, edid is empty

# cat /sys/devices/platform/gpu/drm/card1/card1-HDMI-A-1/edid
# cat /sys/devices/platform/gpu/drm/card1/card1-HDMI-A-2/edid
# 

(card-0-HDMI-A-1 does not exist)

vcdbg log msg reports errors when reading edid. I've copied the log output here https://pastebin.com/gsEZUues

The issue appeared between kernel 5.10.13 and 5.10.79.

Also, at power-up, a signal is sent to the TV until vc4 (drm) is loaded. This is even more visible with Raspbian, where one can see the Linux system loading and at some point (when vc4 is loaded, I think) the screen goes black and no signal is sent anymore.

6by9 commented 2 years ago

I'd guess that the hotplug line isn't following the HDMI spec. The firmware tries reading the EDID regardless of the hotplug status. The kernel doesn't as it follows the HDMI spec more closely, and if hotplug isn't asserted then it won't read the EDID.

digitalLumberjack commented 2 years ago

Hello @david-barbion @popcornmix o/

We still have many users with the issue on @recalbox on the latest kernel + firmware. Could we help in any way by running other tests for you?

popcornmix commented 2 years ago

If you can't read the edid using a specific adapter then the adapter is faulty (most likely not connecting the hotplug detect line). You can try adding video=:D to the end of cmdline.txt, which may override it.

digitalLumberjack commented 2 years ago

Is there something that could explain why it works flawlessly on old kernels with the same adapter?

popcornmix commented 2 years ago

There was a bug in the kernel driver where we would read the edid without hotplug being asserted which is against spec (and caused issues with re-enabling scrambling after switching hdmi input).

david-barbion commented 2 years ago

Finally, the video option solved this issue. Thank you @popcornmix.

ModMike commented 2 years ago

So what was the solution for the Argon case?

digitalLumberjack commented 2 years ago

@popcornmix we just released the new Recalbox 8.0.1 with video=HDMI-A-1:D video=HDMI-A-2:D in cmdline.txt, which seems to fix the black screen issue people were having.

video=:D did not fix the issue, so we went with the variant that works.

But it causes other issues, as it seems to prevent the kernel from taking any settings from config.txt (hdmi_mode in this case).

Is there any documentation or source code that would help us understand how this video=XXX is parsed and used?

popcornmix commented 2 years ago

video=:D should not be applied universally as a default. It is generally only needed with faulty hardware (e.g. an hdmi cable that doesn't connect the hotplug line, or an Argon case, which cause a lot of unreliability issues for hdmi).

Info on the video= setting here: https://www.kernel.org/doc/html/v5.10/fb/modedb.html
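To see what the `video=` arguments used in this thread ask for, they can be split into their parts. The following is a minimal Python sketch, not from the thread. The function name `parse_video_args` is hypothetical. It handles only the subset of the modedb syntax that appears here: an optional connector name, an optional `<xres>x<yres>` mode with optional `M` and `@<refresh>`, and an optional trailing force flag. Per the linked documentation, `e` forces the output enabled, `D` forces it enabled as digital, and `d` disables it. Bit depth, `R`, `i` and `m` are deliberately not parsed.

```python
import re

VIDEO_ARG = re.compile(
    r"^video=(?:(?P<connector>[^:]*):)?"
    r"(?:(?P<xres>\d+)x(?P<yres>\d+)(?P<cvt>M)?(?:@(?P<refresh>\d+))?)?"
    r"(?P<force>[eDd])?$"
)


def parse_video_args(cmdline):
    """Return one dict per video= argument found in a kernel command line."""
    parsed = []
    for token in cmdline.split():
        match = VIDEO_ARG.match(token)
        if not match:
            continue  # not a video= argument, or a form this sketch skips
        fields = match.groupdict()
        parsed.append({
            # An empty connector (as in "video=:D") is reported as None.
            "connector": fields["connector"] or None,
            "mode": (int(fields["xres"]), int(fields["yres"]))
                    if fields["xres"] else None,
            "refresh": int(fields["refresh"]) if fields["refresh"] else None,
            "force": fields["force"],
        })
    return parsed
```

The output makes the difference between the variants visible: `video=:D` names no connector, while `video=HDMI-A-1:D` applies the force flag to one specific output.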

digitalLumberjack commented 2 years ago

Thank you @popcornmix for the information. I'm sorry but I must admit we don't know what to think about this issue.

Previously, HDMI worked in 100% of cases. Since the bug fix we talked about, many people have had a black screen.

Does that mean that a certain percentage of hdmi cables/TVs were working by chance? Could that not be considered a feature rather than a bug, then?

I may be wrong but I think several distributions are touched by this bug, and right now we are even thinking about rolling back to the old kernel and firmware version as we don't know what to do.

Please help me avoid doing this 😅

ModMike commented 2 years ago

@digitalLumberjack I tried it and it doesn't work. I did have a bridged solder point on HDMI, which I fixed and validated with my multimeter.

Am I supposed to put the whole statement video=HDMI-A-1:D video=HDMI-A-2:D in the cmdline.txt or just one or the other as in video=HDMI-A-1:D?

ModMike commented 2 years ago

@popcornmix I understand how much disdain you have for the Argon and others like it, but I do "love" mine, specifically for the SSD functionality. Software workarounds are par for the course in development.

digitalLumberjack commented 2 years ago

@digitalLumberjack I tried it and it doesn't work. I did have a bridged solder point on HDMI, which I fixed and validated with my multimeter.

Am I supposed to put the whole statement video=HDMI-A-1:D video=HDMI-A-2:D in the cmdline.txt or just one or the other?

We put that in cmdline.txt at the end of the line. Don't know if it fixes anything for the argon though.

ModMike commented 2 years ago

@digitalLumberjack Thanks, I appreciate you trying to help. Will try outside of case and report back.

pelwell commented 2 years ago

"disdain" is a strong word that doesn't apply here - saying that they "cause a lot of unreliability issues for hdmi" is just a statement of fact.

ModMike commented 2 years ago

Test results:

  1. Adding video=HDMI-A-1:D video=HDMI-A-2:D to cmdline.txt did not work for Argon
  2. As suggested by @popcornmix, system works fine when removed from Argon case
  3. Leaving in video=HDMI-A-1:D video=HDMI-A-2:D in cmdline.txt when out of Argon case causes washed out video and seems to affect audio interface choices

Notes:

I followed the trace and it is connected. There are no components; it's a straight pass-through, just like my micro HDMI adapter. What could be the issue?

bkg2k commented 2 years ago

"disdain" is a strong word that doesn't apply here - saying that they "cause a lot of unreliability issues for hdmi" is just a statement of fact.

None of our users reported a faulty Argon case until now. Maybe they were all lucky. Unfortunately, plenty of users are now reporting a black screen. The situation is even worse with video=...:D, as @digitalLumberjack said.

The situation for end users is a lot worse since this "fix". Our users are not technical people at all. They are mostly players. Average users don't care, or even know, whether they have "faulty hardware". The only thing they see is "It's no longer working". Asking average users to add some cmdline option only when it doesn't work is not an option. The firmware should work out of the box, as before, on as much hardware as possible, faulty or not.

pelwell commented 2 years ago

@popcornmix has spent years fixing the corner cases in the old firmware-based HDMI driver. It will take a while for the remaining issues with the new driver to be resolved, but they will be.

ModMike commented 2 years ago

I agree with @bkg2k's sentiment and also understand @popcornmix's desire to do things the right way. But, as I've said before, not everything is done the "right" way, and this change has trashed a lot of systems. Electronics are littered with corrections and/or errors that we usually find a way around.

I hope @popcornmix does find a way to fix things so LibreELEC is compatible with more devices. I think the Amlogic et al. boxes are getting expensive and hard to find, which is why I switched to the RPi 2 years ago.

This thread is getting dangerously close to a flame war, so I will let @popcornmix get on with it and no longer comment. Furthermore, I am offering a $40 donation bounty when this issue is fixed. I know it's not much, but I think we should ALL encourage the LibreELEC team. I would encourage others in this thread to add to the bounty, ESPECIALLY if you are a vendor. Fair is fair.

ModMike commented 2 years ago

Asking average users to add some cmdline option only when it doesn't work is not an option.

Did I miss something? What fix is this?

ModMike commented 2 years ago

What effect does changing dtoverlay=vc4-kms-v3d to dtoverlay=vc4-fkms-v3d have?

It works for me! Is this the official solution, or will there be an update fixing this issue?

So changing this in the flash/distro.config gives me video on the Argon but I lose audio outputs. Is there a way to fix it? I tried both HDMI ports.

This is what I have now in distro.cfg:

arm_64bit=1
kernel=kernel.img
dtoverlay=vc4-fkms-v3d,cma-512
dtoverlay=rpivid-v4l2
dtoverlay=
disable_overscan=1
disable_fw_kms_setup=1

and cmdline.txt:

boot=UUID=2801-4448 disk=UUID=6f37a636-034d-4f7d-9eeb-a1453c0cf6ba quiet snd-bcm2835.enable_compat_alsa=1

bkg2k commented 2 years ago

This thread is getting dangerously close to a flame war, so I will let @popcornmix get on with it and no longer comment. Furthermore, I am offering a $40 donation bounty when this issue is fixed. I know it's not much, but I think we should ALL encourage the LibreELEC team. I would encourage others in this thread to add to the bounty, ESPECIALLY if you are a vendor. Fair is fair.

No flame war here. This is a serious thread. The original fix seems legitimate, and I don't blame @popcornmix or anyone else for it. I'm just here to point out that the fix seems to have severe side effects that affect many people, most likely more people than were affected by the original bug. So for me, the balance of pros and cons is on the wrong side, and maybe a rollback should be considered pending further investigation.