Open randyoo opened 8 years ago
If you get a GPU hang before the "ERROR Failed to allocate from CMA", it might be useful to get a GPU hang dump from it (https://github.com/anholt/vc4-gpu-tools). If it only hangs after the OOM errors, then we probably need to debug memory usage.
It seems to be hanging only after the "failure to allocate" errors. Actually just had another instance where, on a fresh-booted system, with >500MB free RAM, I got similar errors in the kern.log file, just by re-sizing a Terminal window:
Mar 14 20:40:43 pi3 kernel: [ 95.233915] vc4-drm soc:gpu@7e4c0000: failed to allocate buffer with size 1089536
Mar 14 20:40:45 pi3 kernel: [ 97.049983] [drm] Resetting GPU.
Mar 14 20:40:47 pi3 kernel: [ 99.050061] [drm] Resetting GPU.
Mar 14 20:40:49 pi3 kernel: [ 101.050085] [drm] Resetting GPU.
Mar 14 20:40:51 pi3 kernel: [ 103.050109] [drm] Resetting GPU.
Mar 14 20:40:53 pi3 kernel: [ 105.050122] [drm] Resetting GPU.
Mar 14 20:40:55 pi3 kernel: [ 107.058299] [drm] Resetting GPU.
Mar 14 20:41:18 pi3 kernel: [ 130.762654] vc4-drm soc:gpu@7e4c0000: failed to allocate buffer with size 1089536
Mar 14 20:41:18 pi3 kernel: [ 130.865088] vc4-drm soc:gpu@7e4c0000: failed to allocate buffer with size 1056768
Mar 14 20:41:18 pi3 kernel: [ 130.868299] vc4-drm soc:gpu@7e4c0000: failed to allocate buffer with size 1056768
Mar 14 20:41:18 pi3 kernel: [ 130.868357] vc4-drm soc:gpu@7e4c0000: failed to allocate buffer with size 1056768
Mar 14 20:41:18 pi3 kernel: [ 130.932494] [drm:vc4_validate_bin_cl [vc4]] *ERROR* 0x00000000: packet 112 (VC4_PACKET_TILE_BINNING_MODE_CONFIG)
If the "failed to allocate" wasn't followed by someone else complaining about allocation failure, then usually a cache got cleared and we managed to allocate.
Sorry, I shouldn't have left out this detail, but in that previous comment, the last line was followed by a complete system crash, including a line of gibberish in the kernel log.
If there's something I need to do to help debug memory use, let me know. Seems really easy to reproduce--just resizing a Terminal window consistently fills the logs with these kinds of errors, causes >10 second system freezes, sometimes followed by a complete crash.
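One way to help debug memory use here: the vc4 driver allocates its buffers from the CMA region, so general free RAM (the ">500MB" above) does not show how much room vc4 has left. On kernels built with CMA support, /proc/meminfo normally exposes `CmaTotal` and `CmaFree`. The sketch below assumes those two fields are present; the function names are made up for illustration and are not part of any tool mentioned in this thread.

```python
import re


def cma_usage(meminfo_text):
    """Return (total_kib, free_kib) parsed from /proc/meminfo-style text.

    Returns None when the CmaTotal/CmaFree fields are absent
    (for example, on a kernel built without CMA support).
    """
    fields = {}
    for line in meminfo_text.splitlines():
        # Lines look like "CmaFree:          1024 kB"
        m = re.match(r"(\w+):\s+(\d+)\s*kB", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    if "CmaTotal" not in fields or "CmaFree" not in fields:
        return None
    return fields["CmaTotal"], fields["CmaFree"]


def read_cma_usage(path="/proc/meminfo"):
    """Hypothetical convenience wrapper: read the file and parse it."""
    with open(path) as f:
        return cma_usage(f.read())
```

Calling `read_cma_usage()` just before and just after resizing the Terminal window would show whether `CmaFree` is close to zero when the "failed to allocate" messages appear.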
Greetings, I can reproduce this problem using a clean, non-updated image of Raspbian Jessie (2016-05-27, as installed by NOOBS 1.9.2).
firefox/firefox-esr produces the output below, while epiphany-browser doesn't. I do not even set CMA in config.txt:
root@raspberrypi:~# grep -Ev '^#|^$' /boot/config.txt
disable_overscan=1
framebuffer_width=1920
framebuffer_height=1080
dtparam=audio=on
hdmi_force_hotplug=1
dtoverlay=vc4-kms-v3d
gpu_mem=256
root@raspberrypi:~# vcgencmd get_config int
arm_freq=1200
audio_pwm_mode=1
config_hdmi_boost=5
core_freq=400
desired_osc_freq=0x36ee80
disable_commandline_tags=2
disable_l2cache=1
disable_splash=1
force_eeprom_read=1
force_pwm_open=1
framebuffer_height=1080
framebuffer_ignore_alpha=1
framebuffer_swap=1
framebuffer_width=1920
gpu_freq=300
hdmi_force_cec_address=65535
hdmi_force_hotplug=1
init_uart_clock=0x2dc6c00
lcd_framerate=60
mask_gpu_interrupt0=1024
mask_gpu_interrupt1=26370
over_voltage_avs=0x19f0a
pause_burst_frames=1
program_serial_random=1
sdram_freq=450
second_boot=1
temp_limit=85
root@raspberrypi:~# vcgencmd get_config str
device_tree=-
root@raspberrypi:~#
dmesg - vc4- and drm-related boot/startup output and kernel command line
root@raspberrypi:~# dmesg | grep -E 'drm|vc'
[ 0.000000] Kernel command line: 8250.nr_uarts=0 cma=256M@256M dma.dmachans=0x7f35 bcm2708_fb.fbwidth=1920 bcm2708_fb.fbheight=1080 bcm2709.boardrev=0xa02082 bcm2709.serial=0x3747200a smsc95xx.macaddr=B8:27:EB:47:20:0A bcm2708_fb.fbswap=1 bcm2709.uart_clock=48000000 vc_mem.mem_base=0x3dc00000 vc_mem.mem_size=0x3f000000 dwc_otg.lpm_enable=0 console=ttyS0,115200 console=tty1 root=/dev/mmcblk0p7 rootfstype=ext4 elevator=deadline fsck.repair=yes rootwait quiet acpi=off
[ 1.273678] vc-cma: Videocore CMA driver
[ 1.273689] vc-cma: vc_cma_base = 0x00000000
[ 1.273699] vc-cma: vc_cma_size = 0x00000000 (0 MiB)
[ 1.273707] vc-cma: vc_cma_initial = 0x00000000 (0 MiB)
[ 1.273931] vc-mem: phys_addr:0x00000000 mem_base=0x3dc00000 mem_size:0x3f000000(1008 MiB)
[ 1.298735] vchiq: vchiq_init_state: slot_zero = 0x90400000, is_master = 0
[ 1.841870] vc-sm: Videocore shared memory driver
[ 1.841884] [vc_sm_connected_init]: start
[ 1.842340] [vc_sm_connected_init]: end - returning 0
[ 5.466545] [drm] Initialized drm 1.1.0 20060810
[ 5.553105] vc4-drm soc:gpu: bound 3f902000.hdmi (ops vc4_hdmi_ops [vc4])
[ 5.558622] vc4-drm soc:gpu: bound 3f206000.pixelvalve (ops vc4_crtc_ops [vc4])
[ 5.558898] vc4-drm soc:gpu: bound 3f207000.pixelvalve (ops vc4_crtc_ops [vc4])
[ 5.559097] vc4-drm soc:gpu: bound 3f807000.pixelvalve (ops vc4_crtc_ops [vc4])
[ 5.559182] vc4-drm soc:gpu: bound 3f400000.hvs (ops vc4_hvs_ops [vc4])
[ 5.560703] vc4-drm soc:gpu: bound 3fc00000.v3d (ops vc4_v3d_ops [vc4])
[ 5.565043] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[ 5.565063] [drm] No driver support for vblank timestamp query.
[ 5.665355] vc4-drm soc:gpu: fb0: frame buffer device
[ 9.446963] [drm:drm_edid_block_valid [drm]] *ERROR* EDID checksum is invalid, remainder is 82
[ 9.483603] [drm:drm_edid_block_valid [drm]] *ERROR* EDID checksum is invalid, remainder is 25
root@raspberrypi:~#
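Note that the kernel command line in the dmesg output above contains `cma=256M@256M`, so a CMA region is being reserved even though config.txt does not mention CMA. As a rough sketch, the parameter can be pulled out of a command line like this. It assumes the usual `cma=size[@base]` form with K/M/G suffixes and does not handle hex values; `parse_cma` is a name invented for this example.

```python
import re

_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def _to_bytes(value):
    """Convert strings such as '256M' or '4096' to a byte count."""
    m = re.fullmatch(r"(\d+)([KMG]?)", value.upper())
    if not m:
        raise ValueError(f"unsupported size: {value!r}")
    return int(m.group(1)) * _UNITS[m.group(2)]


def parse_cma(cmdline):
    """Return (size_bytes, base_bytes or None) for cma=, or None if absent."""
    for token in cmdline.split():
        if token.startswith("cma="):
            size, _, base = token[len("cma="):].partition("@")
            # Simplification: drop an optional "-end" part of the base range.
            base = base.split("-")[0]
            return _to_bytes(size), (_to_bytes(base) if base else None)
    return None
```

For the command line shown above, this gives a size of 268435456 bytes (256 MiB) at a base of 256 MiB. On a running system the string could be read from /proc/cmdline.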
dmesg - error-related output
[ 625.909314] vc4-drm soc:gpu: failed to allocate buffer with size 1077248
[ 625.911516] [drm:vc4_validate_bin_cl [vc4]] *ERROR* 0x00000000: packet 112 (VC4_PACKET_TILE_BINNING_MODE_CONFIG) failed to validate
[ 625.912705] vc4-drm soc:gpu: failed to allocate buffer with size 3317760
[ 625.912784] vc4-drm soc:gpu: failed to allocate buffer with size 3317760
.... SAME OUTPUTS
[ 626.042469] vc4-drm soc:gpu: failed to allocate buffer with size 1069056
[ 626.042762] vc4-drm soc:gpu: failed to allocate buffer with size 1069056
[ 626.044928] [drm:vc4_validate_bin_cl [vc4]] *ERROR* 0x00000000: packet 112 (VC4_PACKET_TILE_BINNING_MODE_CONFIG) failed to validate
[ 626.046490] vc4-drm soc:gpu: failed to allocate buffer with size 1089536
[ 626.046957] vc4-drm soc:gpu: failed to allocate buffer with size 1089536
[ 628.001887] [drm] Resetting GPU.
[ 630.001909] [drm] Resetting GPU.
.... SAME OUTPUTS
[ 709.002437] [drm] Resetting GPU.
[ 710.002465] [drm] Resetting GPU.
[ 712.142637] [drm:vc4_validate_bin_cl [vc4]] *ERROR* 0x00000000: packet 112 (VC4_PACKET_TILE_BINNING_MODE_CONFIG) failed to validate
[ 713.002425] [drm] Resetting GPU.
[ 714.002439] [drm] Resetting GPU.
.... AND KEEP GOING UNTIL I RESET RPi3
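Since the log above repeats the same three messages many times, a small summary may be easier to compare between runs than the raw output. The sketch below counts allocation failures by buffer size, GPU resets, and validation failures. The regular expressions are based only on the message text quoted in this thread, and `summarize` is a hypothetical helper, not an existing tool.

```python
import re
from collections import Counter

# Patterns taken from the messages quoted in this thread.
ALLOC_RE = re.compile(r"failed to allocate buffer with size (\d+)")
RESET_RE = re.compile(r"\[drm\] Resetting GPU\.")
VALIDATE_RE = re.compile(r"\*ERROR\*.*failed to validate")


def summarize(log_text):
    """Count vc4 allocation failures, GPU resets and validation errors."""
    sizes = Counter()
    resets = 0
    validate_failures = 0
    for line in log_text.splitlines():
        m = ALLOC_RE.search(line)
        if m:
            sizes[int(m.group(1))] += 1
        elif RESET_RE.search(line):
            resets += 1
        elif VALIDATE_RE.search(line):
            validate_failures += 1
    return {
        "alloc_failures": dict(sizes),  # buffer size in bytes -> count
        "gpu_resets": resets,
        "validate_failures": validate_failures,
    }
```

Passing the saved output of `dmesg` (or the contents of kern.log) to `summarize` gives one dictionary per run, which makes it easier to see whether a kernel change reduces the number of resets.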
firefox's stderr
Performance warning: Async animation disabled because frame size (26600, 670) is bigger than the viewport (1620, 911) or the visual rectangle (26600, 670) is larger than the max allowable value (17895698) [ul]
Draw call returned Invalid argument. Expect corruption.
In another attempt I got the kernel panic below, in addition to the same output as above and the same hang:
Message from syslogd@raspberrypi at Oct 12 18:00:38 ...
kernel:[ 174.066702] Internal error: Oops: 5 [#1] SMP ARM
In addition, I also noticed the following output in the Xorg log file:
(EE) glamor0: GL error: FBO incomplete: driver marked FBO as incomplete [-1]
(EE) glamor0: GL error: FBO incomplete: driver marked FBO as incomplete [-1]
Hello, I'm also seeing this error in the Xorg logs:
(EE) glamor0: GL error: FBO incomplete: driver marked FBO as incomplete -1 glamor0: GL error: FBO incomplete: driver marked FBO as incomplete [-1]
@lromor That's not an error, please ignore it.
Hopefully https://github.com/raspberrypi/linux/pull/1835 fixes a bunch of instability around CMA OOMs. Could you test if you're still having trouble with that?
Visiting the following page in Chromium (Version 48.0.2564.82 Built on Ubuntu 15.04, running on Raspbian 8.0, with hardware acceleration for WebGL) and setting the renderer option to "WebGL" completely froze my entire system 2 out of 3 times: http://brm.io/matter-js/demo/
Unfortunately, there's nothing in /var/log/kern.log from the crash itself, although it's full of messages like the following: