raspberrypi / linux

Kernel source tree for Raspberry Pi-provided kernel builds. Issues unrelated to the linux kernel should be posted on the community forum at https://forums.raspberrypi.com/
Other
10.8k stars 4.84k forks source link

broken sudo vclog -m output #6178

Open qrp73 opened 2 weeks ago

qrp73 commented 2 weeks ago

Describe the bug

The output of the command sudo vclog -m very often is broken. See log for example

Steps to reproduce the behaviour

1) Open console 2) run command sudo vclog -m many times

Expected result: all attempt should give correct output

Actual result: in most cases the output is broken (see log for example)

Device (s)

Raspberry Pi 4 Mod. B

System

$ cat /etc/rpi-issue
Raspberry Pi reference 2023-09-22
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 40f37458ae7cadea1aec913ae10b5e7008ebce0a, stage4

$ vcgencmd version
Apr 26 2024 18:18:11 
Copyright (c) 2012 Broadcom
version f1a80576d5322dde60a2e7e9e3f4b56fd5fbe18d (clean) (release) (start)

$ uname -a
Linux raspi 6.6.30-v8+ #1 SMP PREEMPT Wed May 15 15:36:33 UTC 2024 aarch64 GNU/Linux

Logs

$ sudo vclog -m
1830825.008: ailed+�
1768383.858: de=7) failedn���
1869488.235: d)i鴡
1601463.137: (space=3291328, in_low_region=0, mode=7) failedr���
1767990.816: 
691486.053: .֡
1763716.152: gion=0, mode=7) failed�Uߡ
544108.320:  (buffer header callback not defined)��
809332.335: ) failed�*�
1701209.717:  callback not defined)����
000000.081: compact_internal(space=3291328, in_low_region=0, mode=7) failed��
678191.470: 91328, in_low_region=0, mode=7) failed8��
1634561.280: port_event_send: event lost on port 1,0 (buffer header callback not defined)ʣ�
544502.639: 1,0 (buffer header callback not defined)r!�

$ sudo vclog -m
1830825.008: ailed(�i�
1768383.858: de=7) failed�4�
1869488.235: d)�x�
1601463.137: (space=3291328, in_low_region=0, mode=7) failed(1��
1767990.816: 
pi@raspi:~ $ sudo vclog -m
1830825.008: ailed�훢
1768383.858: de=7) failed
                         �!�
1869488.235: d)��%�
1601463.137: (space=3291328, in_low_region=0, mode=7) failed��.�
1767990.816: 

$ sudo vclog -m
1830825.008: ailed*���
1768383.858: de=7) failed��?�
1869488.235: d)��C�
1601463.137: (space=3291328, in_low_region=0, mode=7) failed��L�
1767990.816: 
539768.882: egion=0, mode=7) failed��f�
926770.532: �pp�
1801675.106: ined)1�x�
1828716.800: l_port_event_send: event lost on port 1,0 (buffer header callback not defined)���
1700754.546: d: event lost on port 1,0 (buffer header callback not defined)���
1932029.025: 1328, in_low_region=0, mode=7) failed@+��
��01080.941: edf
1667330.157: al(space=3291328, in_low_region=0, mode=7) failedn�
1953392.991: ace=3291328, in_low_region=0, mode=7) failed�_��
1953722.220:  1,0 (buffer header callback not defined)&ů�

$ sudo vclog -m
1830825.008: ailed��Ӣ
1768383.858: de=7) failedj�Y�
1869488.235: d))�]�
1601463.137: (space=3291328, in_low_region=0, mode=7) failedT�f�
1767990.816: 
1767990.816: 
1701322.866: lback not defined)䛂�
540699.745: t_event_send: event lost on port 1,0 (buffer header callback not defined)����
1702257.952: on port 1,0 (buffer header callback not defined)o���
2672462.180: 
942814.001: _region=0, mode=7) failed����
1601071.392: on=0, mode=7) failed7���
000065.536: mpact_internal(space=3291328, in_low_region=0, mode=7) failed<׹�
1634561.280: port_event_send: event lost on port 1,0 (buffer header callback not defined)S�â

$ sudo vclog -m
2726388.521: alloc_compact_internal(space=3291328, in_low_region=0, mode=7) failed
2726388.563: mmal: mmal_port_event_send: event lost on port 1,0 (buffer header callback not defined)
2726397.969: alloc_compact_internal(space=3291328, in_low_region=0, mode=7) failed
2726398.010: mmal: mmal_port_event_send: event lost on port 1,0 (buffer header callback not defined)
2726398.219: alloc_compact_internal(space=3291328, in_low_region=0, mode=7) failed
2726398.259: mmal: mmal_port_event_send: event lost on port 1,0 (buffer header callback not defined)
2726407.994: alloc_compact_internal(space=3291328, in_low_region=0, mode=7) failed
2726408.035: mmal: mmal_port_event_send: event lost on port 1,0 (buffer header callback not defined)
2726408.291: alloc_compact_internal(space=3291328, in_low_region=0, mode=7) failed
<...>

Additional context

No response

popcornmix commented 2 weeks ago

The messages suggest there have been gpu allocation failures. What are you running prior to getting corrupt output?

Do you ever see the corrupt output soon after a reboot, without running vc side code (like video encode/decode)?

qrp73 commented 2 weeks ago

I running chromium-browser and firefox. Since some update it always full of such messages. It usually happens when playing videos on youtube and telegram.

But this issue about corrupted messages in console output

Here is how you can reproduce these errors: 1) reboot 2) start chromium-browser 3) open page https://t.me/s/wildlifen 4) play video from last message (about red bird) 5) try to play video from previous message (about elephant),it will not be played due to some issue 6) try to play video from previous message (about wolf) 7) start console and execute sudo vclog -m

The messages from the point when chromium-browser was started:

286046.273: video_decode:13:RIL: resolution changing
286078.564: alloc_compact_internal(space=3291328, in_low_region=0, mode=7) failed
286078.609: mmal: mmal_port_event_send: event lost on port 1,0 (buffer header callback not defined)
286078.877: alloc_compact_internal(space=3291328, in_low_region=0, mode=7) failed
286078.908: mmal: mmal_port_event_send: event lost on port 1,0 (buffer header callback not defined)
286096.409: alloc_compact_internal(space=3291328, in_low_region=0, mode=7) failed
286096.437: mmal: mmal_port_event_send: event lost on port 1,0 (buffer header callback not defined)
286117.028: alloc_compact_internal(space=3291328, in_low_region=0, mode=7) failed
286117.107: mmal: mmal_port_event_send: event lost on port 1,0 (buffer header callback not defined)
286136.729: alloc_compact_internal(space=3291328, in_low_region=0, mode=7) failed
286136.828: mmal: mmal_port_event_send: event lost on port 1,0 (buffer header callback not defined)
pelwell commented 2 weeks ago

The nature of this corruption makes me think that either we are reading from the wrong start point or that new messages are overwriting the old ones before we have a chance to read them, making use lose track of the message boundaries.

Looking at the last group of messages above, that's quite a high volume of data, and I suspect it was higher before, making the above scenarios more likely.

qrp73 commented 2 weeks ago

I think there is some issue to read log buffer, because when it shows log properly it shows all messages with no issue, even these which previously shown as broken text

Regarding to these errors in the log, there is definitely something is going wrong, because 30 sec video playback floods the log with a bunch of error messages. So I opened separate issue about it: https://github.com/raspberrypi/linux/issues/6179

pelwell commented 2 weeks ago

I think there is some issue to read log buffer, because when it shows log properly it shows all messages with no issue, even these which previously shown as broken text

Your opinion is noted.

qrp73 commented 2 weeks ago

Is there any chance to fix it?

alessandromrc commented 2 weeks ago

I've just tried on Pi4 watching the Telegram videos and can confirm one doesn't work but that seems just to be related to the video format not being supported... Other than that, I am having no such errors in the log or any corruption.

vclog output:

029715.317: arasan: arasan_emmc_open
029715.487: arasan: arasan_emmc_set_clock C0: 0x00800000 C1: 0x000e0047 emmc: 200000000 actual: 390625 div: 0x00000100 target: 400000 min: 400000 max: 400000 delay: 5
029820.259: arasan: arasan_emmc_set_clock C0: 0x00800000 C1: 0x000e0047 emmc: 200000000 actual: 390625 div: 0x00000100 target: 400000 min: 400000 max: 400000 delay: 5
029820.346: arasan: arasan_emmc_set_clock C0: 0x00800f00 C1: 0x000e0047 emmc: 200000000 actual: 390625 div: 0x00000100 target: 400000 min: 390000 max: 400000 delay: 5
029849.738: arasan: arasan_emmc_set_clock C0: 0x00800f06 C1: 0x000e0207 emmc: 200000000 actual: 50000000 div: 0x00000002 target: 50000000 min: 0 max: 50000000 delay: 1
029856.950: brfs: File read: /mfs/sd/config.txt
029857.687: brfs: File read: 1303 bytes
029881.406: HDMI0:EDID error reading EDID block 0 attempt 0
029882.423: HDMI0:EDID giving up on reading EDID block 0
029894.160: HDMI1:EDID error reading EDID block 0 attempt 0
029895.175: HDMI1:EDID giving up on reading EDID block 0
029901.067: brfs: File read: /mfs/sd/config.txt
030707.231: gpioman: gpioman_get_pin_num: pin DISPLAY_DSI_PORT not defined
030709.404: *** Restart logging
030709.424: brfs: File read: 1303 bytes
030717.255: hdmi: HDMI0:EDID error reading EDID block 0 attempt 0
030718.271: hdmi: HDMI0:EDID giving up on reading EDID block 0
030723.318: hdmi: HDMI0:EDID error reading EDID block 0 attempt 0
030724.334: hdmi: HDMI0:EDID giving up on reading EDID block 0
030724.351: hdmi: HDMI:hdmi_get_state is deprecated, use hdmi_get_display_state instead
030729.392: hdmi: HDMI1:EDID error reading EDID block 0 attempt 0
030730.409: hdmi: HDMI1:EDID giving up on reading EDID block 0
030735.450: hdmi: HDMI1:EDID error reading EDID block 0 attempt 0
030736.466: hdmi: HDMI1:EDID giving up on reading EDID block 0
030736.484: hdmi: HDMI:hdmi_get_state is deprecated, use hdmi_get_display_state instead
030736.499: HDMI0: hdmi_pixel_encoding: 300000000
030736.512: HDMI1: hdmi_pixel_encoding: 300000000
032225.212: brfs: File read: /mfs/sd/initramfs8
032225.226: Loaded 'initramfs8' to 0x0 size 0x10c8f2f
032240.354: initramfs loaded to 0x2df37000 (size 0x10c8f2f)
032249.799: dtb_file 'bcm2711-rpi-4-b.dtb'
032249.818: brfs: File read: 17600303 bytes
032255.717: brfs: File read: /mfs/sd/bcm2711-rpi-4-b.dtb
032255.732: Loaded 'bcm2711-rpi-4-b.dtb' to 0x100 size 0xd845
032269.223: brfs: File read: 55365 bytes
032284.347: brfs: File read: /mfs/sd/overlays/overlay_map.dtb
032307.468: brfs: File read: 5195 bytes
032309.847: brfs: File read: /mfs/sd/config.txt
032309.967: dtparam: audio=on
032315.135: brfs: File read: 1303 bytes
032335.332: brfs: File read: /mfs/sd/overlays/vc4-kms-v3d-pi4.dtbo
032398.492: Loaded overlay 'vc4-kms-v3d'
032398.727: dtparam: audio=on
032549.705: brfs: File read: 3913 bytes
032552.196: brfs: File read: /mfs/sd/overlays/allo-boss-dac-pcm512x-audio.dtbo
032567.628: Loaded overlay 'allo-boss-dac-pcm512x-audio'
032617.409: brfs: File read: 1657 bytes
032619.013: brfs: File read: /mfs/sd/cmdline.txt
032619.068: Read command line from file 'cmdline.txt':
032619.083: 'console=serial0,115200 console=tty1 root=PARTUUID=e421b187-02 rootfstype=ext4 fsck.repair=yes rootwait quiet splash plymouth.ignore-serial-consoles cfg80211.ieee80211_regdom=IT'
032742.974: brfs: File read: 176 bytes
033527.206: brfs: File read: /mfs/sd/kernel8.img
033527.228: Loaded 'kernel8.img' to 0x200000 size 0x8d7b2b
034823.389: Device tree loaded to 0x2df29200 (size 0xddff)
034830.067: gpioman: gpioman_get_pin_num: pin SDCARD_CONTROL_POWER not defined
034938.410: sdram: sdram refresh 1562->3124 (2)

037926.870: vchiq_core: vchiq_init_state: slot_zero = 0xce080000, is_master = 1
2819087.181: video_decode:8:RIL: resolution changing
2819136.151: video_decode:9:RIL: resolution changing
2825204.365: video_decode:10:RIL: resolution changing
2825206.053: mmal: mmal_port_event_send: event lost on port 1,0 (buffer header callback not defined)
2825251.406: video_decode:11:RIL: resolution changing
2828456.931: video_decode:12:RIL: resolution changing
2920142.705: video_decode:13:RIL: resolution changing
2931952.481: video_decode:14:RIL: resolution changing
2932347.546: video_decode:15:RIL: resolution changing
2934183.993: video_decode:16:RIL: resolution changing
2936025.211: video_decode:17:RIL: resolution changing
3050117.911: video_decode:18:RIL: resolution changing

Running on: Linux raspberrypi4 6.6.28+rpt-rpi-v8 #1 SMP PREEMPT Debian 1:6.6.28-1+rpt1 (2024-04-22) aarch64 GNU/Linux

qrp73 commented 1 week ago

Other than that, I am having no such errors in the log or any corruption. Running on: Linux raspberrypi4 6.6.28+rpt-rpi-v8 #1 SMP PREEMPT Debian 1:6.6.28-1+rpt1 (2024-04-22) aarch64 GNU/Linux

You're using old kernel 6.6.28 which don't support some video codecs and don't have imrovements and fixes for video codec hw acceleration which was done in 6.6.29 and 6.6.30.

If you read carefully my bug report, it is tested on kernel 6.6.30. You can update your old 6.6.28 to 6.6.30 with sudo rpi-update rpi-6.6.y

alessandromrc commented 1 week ago

Other than that, I am having no such errors in the log or any corruption.

Running on: Linux raspberrypi4 6.6.28+rpt-rpi-v8 #1 SMP PREEMPT Debian 1:6.6.28-1+rpt1 (2024-04-22) aarch64 GNU/Linux

You're using old kernel 6.6.28 which don't support some video codecs and don't have imrovements and fixes for video codec hw acceleration which was done in 6.6.29 and 6.6.30.

If you read carefully my bug report, it is tested on kernel 6.6.30.

You can update your old 6.6.28 to 6.6.30 with sudo rpi-update rpi-6.6.y

Oh, my bad, I misread the actual Kernel version. I will try later with the newer kernels and see what happens. Thanks!

qrp73 commented 1 week ago

I opened separate issue https://github.com/raspberrypi/linux/issues/6179 about video codec errors. This issue is about corrupted text in vclog output.

alessandromrc commented 1 week ago

vcgencmd version

Can confirm it doesn't happen even on this kernel:

Linux raspberrypi4 6.6.30-v8+ #1763 SMP PREEMPT Mon May 13 17:50:31 BST 2024 aarch64 GNU/Linux

vclog output:

037901.354: vchiq_core: vchiq_init_state: slot_zero = 0xce080000, is_master = 1
147026.672: video_decode:8:RIL: resolution changing
154141.536: video_decode:9:RIL: resolution changing
154145.273: mmal: mmal_port_event_send: event lost on port 1,0 (buffer header callback not defined)
154826.838: video_decode:10:RIL: resolution changing
157482.271: video_decode:11:RIL: resolution changing

Nevermind, it happens to cause error when starting multiple videos at once.

qrp73 commented 1 week ago

I notice that corrupted vclog output happens when there is opened web page with video, probably it happens when there is some calls to codec.

popcornmix commented 1 week ago

We are reading the log buffer without synchronisation(*). If you try to read while a message is being written you may get corruption. Typically log messages on VC side are infrequent so the chance of corruption is low.

But it seem the bug where the browser tries to hw decode too many streams causes many error messages. It's possible once that issue is resolved the corruption issue will be much less common.

(*) adding synchronisation between arm and VC would slow things down significantly so will not be done for a debug feature.

qrp73 commented 1 week ago

well, I agree that adding sync on VC side just to read log is not good, because it will slow down the main work of VC, but maybe there is some way to sync it on arm side with no changes on VC?

Maybe there is a sense to implement some non-blocking concurrency synchronization? It happens when VC adding log record, so in theory it should not affect it's performance during usual work which don't require write log records. Isn't it?