mripard / sunxi-mali

GNU General Public License v2.0

H3 - Hard Lockup while testing. #87

Closed kjngineering closed 3 years ago

kjngineering commented 3 years ago

Hi

So I've been working on getting the Mali blob to run on a target board, an OrangePi Zero H3 Plus2. The goal is to get FBDEV acceleration using the existing binary blobs, currently on an 800x800 LCD running over HDMI.

The build environment is Buildroot, which is compiling kernel 5.4.13. The Mali kernel drivers are r6p2 per this repository, and userspace is the corresponding Bootlin blob for fbdev.

The CMA and DMA_CMA flags are enabled, CMA is set at 128 MB, and FBDEV_OVERALLOC is 200. The device tree is mainline and the device is probed successfully by mdev:

Starting mdev... OK
[    2.942554] mali: loading out-of-tree module taints kernel.
[    2.972766] Allwinner sunXi mali glue initialized
[    2.979874] Mali:
[    2.979882] Found Mali GPU Mali-400 MP r1p1
[    2.986758] Mali:
[    2.986762] 2+0 PP cores initialized
[    2.993136] Mali:
[    2.993141] Mali device driver loaded
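
The kernel options named above can be double-checked against the `.config` that Buildroot actually built. This is only a minimal sketch: the helper names `config_is_y` and `report_cma_options` are made up for illustration, and the `.config` path in the usage note is an assumption about the Buildroot output layout, not something taken from this report.

```shell
#!/bin/sh
# Return 0 if SYMBOL (given without the CONFIG_ prefix) is set to y
# in the kernel .config file passed as the first argument.
# (Hypothetical helper, for illustration only.)
config_is_y() {
    grep -q "^CONFIG_$2=y\$" "$1"
}

# Print the state of the two CMA-related options mentioned above.
report_cma_options() {
    for sym in CMA DMA_CMA; do
        if config_is_y "$1" "$sym"; then
            echo "CONFIG_$sym=y"
        else
            echo "CONFIG_$sym is not enabled"
        fi
    done
}
```

It could be called as `report_cma_options output/build/linux-5.4.13/.config`, assuming that is where the kernel build directory ends up. The anchored pattern keeps `CONFIG_CMA=y` from matching longer symbols such as `CONFIG_CMA_SIZE_MBYTES`.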

I forked the glmark2-es2-fbdev code to my GitHub and patched the waf libraries and config to compile under Python 3 so I could build a Buildroot package.

The package now compiles and is installed to the target file system. However, when I run it I get the following:

# glmark2-es2-fbdev
=======================================================
    glmark2 2014.03+git20150611.fa71af2d
=======================================================
    OpenGL Information
    GL_VENDOR:     ARM
    GL_RENDERER:   Mali-400 MP
    GL_VERSION:    OpenGL ES 2.0
=======================================================
[build] use-vbo=false:

Screen goes black. The system HARD locks, the chip gets hot, nothing works! Can't force quit. I tried again with nohup to see if we were at 100% CPU, but nohup hangs in the same way.

I also found test.c for a basic functional test, so I made a Buildroot package for it and compiled it.

Similarly, the screen clears ready to draw, then the board hard locks right in the middle of printing a string and the CPU starts to get hot, as with the benchmark test:

# malitest
EGL Version: "1.4 Linux-r6p2-01rel0"
EGL Vendor: "ARM"
EGL Extensions: "EGL_KHR_image EGL_KHR_image_base EGL_KHR_image_pixmap EGL_KHR_gl_texture_2D_image EGL_KHR_gl_texture_cubemap_image EGL_KHR_gl_renderbuffer_image EGL_KHR_reusable_sync EGL_KHR_fence_sync EGL_KHR_lock_surface EGL_KHR_lock_surface2 EGL_EXT_create_context_robustness EGL_ANDROID_blob_cache EGL_KHR_create_context EGL_KHR_partial_update EGL_KHR_create_context_no_error "
Surface size: 800x800
GL Vendor: "ARM"
GL Renderer: "Mali-400 MP"
GL Version: "OpenGL ES 2.0"
GL Extensions: "GL_OES_texture_npot GL_OES_vertex_array_object GL_OES_compressed_ETC1_RGB8_texture GL_EXT_compressed_ETC1_RGB8_sub_texture GL_OES_standard_derivatives GL_OES_EGL_image GL_OES_depth24 GL_ARM_rgba8 GL_ARM_mali_shader_binary GL_OES_depth_texture GL_OES_packed_depth_stencil GL_EXT_texture_format_BGRA8888 GL_OES_vertex_half_float GL_EXT_blend_minmax GL_OES_EGL_image_external GL_OES_EGL_sync GL_OES_rgb8_rgba8 GL_EXT_multisampled_render_to_texture GL_EXT_discard_framebuffer GL_OES_get_pr

I'm a bit lost now, any ideas on how I can diagnose this one? It's clear to me that the mali driver isn't working. But as for the reason I have no idea how to even troubleshoot it.

Thanks

mripard commented 3 years ago

Can you provide the entire boot logs?

kjngineering commented 3 years ago

Sure!

Here is the boot log: https://pastebin.com/4DGkzKZC

Also while doing the build I noticed a compiler message for the driver: Makefile:173: "You want to support DEVFREQ but kernel didn't support DEVFREQ."

Could it be a frequency control issue of the graphics core? Is there another kernel option I should enable?

I have attached the full build output of the driver here for you. https://pastebin.com/LxwXe8g8

mripard commented 3 years ago

Aside from the DEVFREQ comment, nothing really pops out. Can you try to enable it?

avafinger commented 3 years ago

@mripard Can you update the patch set to 5.10? If so I can have a look at fbdev and try to help.

kjngineering commented 3 years ago

> Aside from the DEVFREQ comment, nothing really pops out. Can you try to enable it?

I enabled: PM_DEVFREQ DEVFREQ_GOV_SIMPLE_ONDEMAND

and rebuilt the kernel and the driver. The DEVFREQ message at build time went away, but the hard lockup on testing remained.

I only have one H3 board with an HDMI out. I am wondering if I should buy another to test.

giuliobenetti commented 3 years ago

@avafinger do you have a build failure with Linux 5.10? I've just tried building sunxi-mali under Buildroot with Linux 5.10 and it built fine.

avafinger commented 3 years ago

Yes, I had some compiler errors in 5.9 and 5.10-rc1 that I could not solve on A64. I'm doing a build right now to check the error again on 5.10-rc7.

kjngineering commented 3 years ago

> @avafinger do you have a build failure with Linux 5.10? I've just tried building sunxi-mali under Buildroot with Linux 5.10 and it built fine.

Hi Giulio,

I also tried against 5.10, the driver has a build failure in "mali_kernel_linux.c". I had success at 5.7.19, I have not tried any 5.8 or 5.9 kernels.

The build log for the driver under Buildroot/5.10 is here

avafinger commented 3 years ago

Here is the error during the build process. Building for the A64, Kernel 5.10.0-rc7, ignore the directory names:

building...
/apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2 /apps/arm/friendlywrt-rk3328/sunxi-mali
Applying patch ../patches/0001-makefile-Add-install-target-and-build-the-module-by-.patch
patching file src/devicedrv/mali/Makefile
Hunk #1 succeeded at 193 (offset 16 lines).

Applying patch ../patches/0002-mali-Support-building-against-4.6.patch
patching file src/devicedrv/mali/linux/mali_memory_swap_alloc.c

Applying patch ../patches/0003-mali-Support-building-against-4.8.patch
patching file src/devicedrv/mali/linux/mali_memory_os_alloc.c
Hunk #2 succeeded at 515 (offset 7 lines).
Hunk #3 succeeded at 558 (offset 7 lines).
Hunk #4 succeeded at 618 (offset 7 lines).
Hunk #5 succeeded at 772 (offset 7 lines).

Applying patch ../patches/0004-mali-Print-the-mali-version-at-probe.patch
patching file src/devicedrv/mali/common/mali_kernel_core.c

Applying patch ../patches/0005-mali-Add-sunxi-platform.patch
patching file src/devicedrv/mali/platform/sunxi/sunxi.c

Applying patch ../patches/r6p2/0006-mali-Allow-devfreq-to-run-without-power-models.patch
patching file src/devicedrv/mali/linux/mali_devfreq.c

Applying patch ../patches/0007-mali-support-building-against-4.10.patch
patching file src/devicedrv/mali/linux/mali_memory.c

Applying patch ../patches/0008-mali-support-building-against-4.11.patch
patching file src/devicedrv/mali/linux/mali_memory.c

Applying patch ../patches/r6p2/0009-mali-Fix-user-memory-domain-fault.patch
patching file src/devicedrv/mali/common/mali_gp_job.c

Applying patch ../patches/0010-mali-support-building-against-4.12.patch
patching file src/devicedrv/mali/linux/mali_osk_specific.h

Applying patch ../patches/r6p2/0011-mali-support-building-against-4.13.patch
patching file src/devicedrv/mali/linux/mali_kernel_linux.h

Applying patch ../patches/0012-mali-support-building-against-4.14.patch
patching file src/devicedrv/mali/linux/mali_memory_swap_alloc.c

Applying patch ../patches/r6p2/0013-mali-support-building-against-4.15.patch
patching file src/devicedrv/mali/common/mali_control_timer.c
patching file src/devicedrv/mali/common/mali_group.c
patching file src/devicedrv/mali/common/mali_osk_types.h
patching file src/devicedrv/mali/linux/mali_memory_os_alloc.c
patching file src/devicedrv/mali/linux/mali_osk_timers.c

Applying patch ../patches/r6p2/0014-mali-Make-devfreq-optional.patch
patching file src/devicedrv/mali/linux/mali_devfreq.c

Applying patch ../patches/0015-Enable-parallel-building-passing-variable-to-Makefile.patch
patching file src/devicedrv/mali/Makefile

Applying patch ../patches/r6p2/0016-mali-support-building-against-4.16.patch
patching file src/devicedrv/mali/linux/mali_memory_secure.c

Applying patch ../patches/0018-mali-support-building-against-4.20.patch
patching file src/devicedrv/mali/linux/mali_kernel_linux.c
Hunk #1 succeeded at 1125 (offset 193 lines).
patching file src/devicedrv/mali/linux/mali_kernel_linux.h
Hunk #1 succeeded at 16 with fuzz 1.
Hunk #2 succeeded at 34 (offset 5 lines).
patching file src/devicedrv/mali/linux/mali_osk_time.c

Applying patch ../patches/0019-mali-support-building-against-5.0.patch
patching file src/devicedrv/mali/linux/mali_kernel_linux.h
Hunk #1 succeeded at 43 (offset 5 lines).
patching file src/devicedrv/mali/linux/mali_ukk_mem.c

Applying patch ../patches/0020-mali-support-building-against-4.17.patch
patching file src/devicedrv/mali/linux/mali_memory.c

Applying patch ../patches/0021-mali-support-building-against-5.3.patch
patching file src/devicedrv/mali/linux/mali_osk_time.c

Applying patch ../patches/0022-mali-support-building-against-5.6.patch
patching file src/devicedrv/mali/linux/mali_osk_time.c
patching file src/devicedrv/mali/linux/mali_osk_low_level_mem.c
patching file src/devicedrv/mali/linux/mali_memory_cow.c

Applying patch ../patches/0023-mali-support-building-against-5.7.patch
patching file src/devicedrv/mali/linux/mali_memory_dma_buf.c

Now at patch ../patches/0023-mali-support-building-against-5.7.patch
/apps/arm/friendlywrt-rk3328/sunxi-mali
make: Entering directory '/apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali'
make -j2 ARCH=arm64 -C /apps/arm/friendlywrt-rk3328/linux-5.10-rc1 M=/apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali modules
make[1]: Entering directory '/apps/arm/friendlywrt-rk3328/linux-5.10-rc1'
  CC [M]  /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_atomics.o
  CC [M]  /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_irq.o
  CC [M]  /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_wq.o
  CC [M]  /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_locks.o
  CC [M]  /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_wait_queue.o
  CC [M]  /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_low_level_mem.o
  CC [M]  /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_math.o
  CC [M]  /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_memory.o
  CC [M]  /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_misc.o
In file included from ./arch/arm64/include/asm/uaccess.h:11:0,
                 from /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_misc.c:16:
./arch/arm64/include/asm/kernel-pgtable.h:132:31: warning: "PUD_SHIFT" is not defined, evaluates to 0 [-Wundef]
 #define ARM64_MEMSTART_SHIFT  PUD_SHIFT
                               ^
./arch/arm64/include/asm/kernel-pgtable.h:145:42: note: in expansion of macro ‘ARM64_MEMSTART_SHIFT’
 #if defined(CONFIG_SPARSEMEM_VMEMMAP) && ARM64_MEMSTART_SHIFT < SECTION_SIZE_BITS
                                          ^~~~~~~~~~~~~~~~~~~~
In file included from ./arch/arm64/include/asm/uaccess.h:22:0,
                 from /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_misc.c:16:
./arch/arm64/include/asm/mmu.h:51:55: error: unknown type name ‘bp_hardening_data’; did you mean ‘bp_hardening_cb_t’?
 DECLARE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);
                                                       ^~~~~~~~~~~~~~~~~
                                                       bp_hardening_cb_t
./arch/arm64/include/asm/mmu.h: In function ‘arm64_get_bp_hardening_data’:
./arch/arm64/include/asm/mmu.h:55:9: error: implicit declaration of function ‘this_cpu_ptr’; did you mean ‘this_cpu_has_cap’? [-Werror=implicit-function-declaration]
  return this_cpu_ptr(&bp_hardening_data);
         ^~~~~~~~~~~~
         this_cpu_has_cap
./arch/arm64/include/asm/mmu.h:55:23: error: ‘bp_hardening_data’ undeclared (first use in this function)
  return this_cpu_ptr(&bp_hardening_data);
                       ^~~~~~~~~~~~~~~~~
./arch/arm64/include/asm/mmu.h:55:23: note: each undeclared identifier is reported only once for each function it appears in
./arch/arm64/include/asm/mmu.h: At top level:
./arch/arm64/include/asm/mmu.h:77:11: error: unknown type name ‘pgprot_t’; did you mean ‘pgoff_t’?
           pgprot_t prot, bool page_mappings_only);
           ^~~~~~~~
           pgoff_t
./arch/arm64/include/asm/mmu.h:78:63: error: unknown type name ‘pgprot_t’; did you mean ‘pgoff_t’?
 extern void *fixmap_remap_fdt(phys_addr_t dt_phys, int *size, pgprot_t prot);
                                                               ^~~~~~~~
                                                               pgoff_t
In file included from /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_misc.c:16:0:
./arch/arm64/include/asm/uaccess.h:29:27: error: unknown type name ‘mm_segment_t’; did you mean ‘mm_context_t’?
 static inline void set_fs(mm_segment_t fs)
                           ^~~~~~~~~~~~
                           mm_context_t
./arch/arm64/include/asm/uaccess.h: In function ‘__range_ok’:
./arch/arm64/include/asm/uaccess.h:64:29: error: implicit declaration of function ‘current_thread_info’ [-Werror=implicit-function-declaration]
  unsigned long ret, limit = current_thread_info()->addr_limit;
                             ^~~~~~~~~~~~~~~~~~~
./arch/arm64/include/asm/uaccess.h:64:50: error: invalid type argument of ‘->’ (have ‘int’)
  unsigned long ret, limit = current_thread_info()->addr_limit;
                                                  ^~
./arch/arm64/include/asm/uaccess.h:72:7: error: ‘current’ undeclared (first use in this function)
      (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
       ^~~~~~~
./arch/arm64/include/asm/uaccess.h:72:24: error: ‘PF_KTHREAD’ undeclared (first use in this function); did you mean ‘__HEAD’?
      (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
                        ^~~~~~~~~~
                        __HEAD
./arch/arm64/include/asm/uaccess.h:72:38: error: implicit declaration of function ‘test_thread_flag’ [-Werror=implicit-function-declaration]
      (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
                                      ^~~~~~~~~~~~~~~~
./arch/arm64/include/asm/uaccess.h:72:55: error: ‘TIF_TAGGED_ADDR’ undeclared (first use in this function); did you mean ‘KIMAGE_VADDR’?
      (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
                                                       ^~~~~~~~~~~~~~~
                                                       KIMAGE_VADDR
./arch/arm64/include/asm/uaccess.h: In function ‘__uaccess_mask_ptr’:
./arch/arm64/include/asm/uaccess.h:240:41: error: invalid type argument of ‘->’ (have ‘int’)
  : "r" (ptr), "r" (current_thread_info()->addr_limit),
                                         ^~
  CC [M]  /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_mali.o
In file included from ./include/linux/sched/task.h:11:0,
                 from ./include/linux/sched/signal.h:9,
                 from ./include/linux/rcuwait.h:6,
                 from ./include/linux/percpu-rwsem.h:7,
                 from ./include/linux/fs.h:33,
                 from ./include/linux/huge_mm.h:8,
                 from ./include/linux/mm.h:687,
                 from ./include/linux/kallsyms.h:12,
                 from ./include/linux/ftrace.h:11,
                 from ./include/linux/kprobes.h:29,
                 from ./include/linux/kgdb.h:19,
                 from ./arch/arm64/include/asm/cacheflush.h:11,
                 from /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_misc.c:17:
./include/linux/uaccess.h: In function ‘force_uaccess_begin’:
./include/linux/uaccess.h:23:2: error: implicit declaration of function ‘set_fs’; did you mean ‘get_fs’? [-Werror=implicit-function-declaration]
  set_fs(USER_DS);
  ^~~~~~
  get_fs
In file included from ./arch/arm64/include/asm/uaccess.h:11:0,
                 from /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_mali.c:16:
./arch/arm64/include/asm/kernel-pgtable.h:132:31: warning: "PUD_SHIFT" is not defined, evaluates to 0 [-Wundef]
 #define ARM64_MEMSTART_SHIFT  PUD_SHIFT
                               ^
./arch/arm64/include/asm/kernel-pgtable.h:145:42: note: in expansion of macro ‘ARM64_MEMSTART_SHIFT’
 #if defined(CONFIG_SPARSEMEM_VMEMMAP) && ARM64_MEMSTART_SHIFT < SECTION_SIZE_BITS
                                          ^~~~~~~~~~~~~~~~~~~~
In file included from ./arch/arm64/include/asm/uaccess.h:22:0,
                 from /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_mali.c:16:
./arch/arm64/include/asm/mmu.h:51:55: error: unknown type name ‘bp_hardening_data’; did you mean ‘bp_hardening_cb_t’?
 DECLARE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);
                                                       ^~~~~~~~~~~~~~~~~
                                                       bp_hardening_cb_t
./arch/arm64/include/asm/mmu.h: In function ‘arm64_get_bp_hardening_data’:
./arch/arm64/include/asm/mmu.h:55:9: error: implicit declaration of function ‘this_cpu_ptr’; did you mean ‘this_cpu_has_cap’? [-Werror=implicit-function-declaration]
  return this_cpu_ptr(&bp_hardening_data);
         ^~~~~~~~~~~~
         this_cpu_has_cap
./arch/arm64/include/asm/mmu.h:55:23: error: ‘bp_hardening_data’ undeclared (first use in this function)
  return this_cpu_ptr(&bp_hardening_data);
                       ^~~~~~~~~~~~~~~~~
./arch/arm64/include/asm/mmu.h:55:23: note: each undeclared identifier is reported only once for each function it appears in
./arch/arm64/include/asm/mmu.h: At top level:
./arch/arm64/include/asm/mmu.h:77:11: error: unknown type name ‘pgprot_t’; did you mean ‘pgoff_t’?
           pgprot_t prot, bool page_mappings_only);
           ^~~~~~~~
           pgoff_t
./arch/arm64/include/asm/mmu.h:78:63: error: unknown type name ‘pgprot_t’; did you mean ‘pgoff_t’?
 extern void *fixmap_remap_fdt(phys_addr_t dt_phys, int *size, pgprot_t prot);
                                                               ^~~~~~~~
                                                               pgoff_t
In file included from /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_mali.c:16:0:
./arch/arm64/include/asm/uaccess.h:29:27: error: unknown type name ‘mm_segment_t’; did you mean ‘mm_context_t’?
 static inline void set_fs(mm_segment_t fs)
                           ^~~~~~~~~~~~
                           mm_context_t
./arch/arm64/include/asm/uaccess.h: In function ‘__range_ok’:
./arch/arm64/include/asm/uaccess.h:64:29: error: implicit declaration of function ‘current_thread_info’ [-Werror=implicit-function-declaration]
  unsigned long ret, limit = current_thread_info()->addr_limit;
                             ^~~~~~~~~~~~~~~~~~~
./arch/arm64/include/asm/uaccess.h:64:50: error: invalid type argument of ‘->’ (have ‘int’)
  unsigned long ret, limit = current_thread_info()->addr_limit;
                                                  ^~
./arch/arm64/include/asm/uaccess.h:72:7: error: ‘current’ undeclared (first use in this function)
      (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
       ^~~~~~~
./arch/arm64/include/asm/uaccess.h:72:24: error: ‘PF_KTHREAD’ undeclared (first use in this function); did you mean ‘__HEAD’?
      (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
                        ^~~~~~~~~~
                        __HEAD
./arch/arm64/include/asm/uaccess.h:72:38: error: implicit declaration of function ‘test_thread_flag’ [-Werror=implicit-function-declaration]
      (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
                                      ^~~~~~~~~~~~~~~~
./arch/arm64/include/asm/uaccess.h:72:55: error: ‘TIF_TAGGED_ADDR’ undeclared (first use in this function); did you mean ‘KIMAGE_VADDR’?
      (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
                                                       ^~~~~~~~~~~~~~~
                                                       KIMAGE_VADDR
./arch/arm64/include/asm/uaccess.h: In function ‘__uaccess_mask_ptr’:
./arch/arm64/include/asm/uaccess.h:240:41: error: invalid type argument of ‘->’ (have ‘int’)
  : "r" (ptr), "r" (current_thread_info()->addr_limit),
                                         ^~
cc1: some warnings being treated as errors
scripts/Makefile.build:279: recipe for target '/apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_misc.o' failed
make[2]: *** [/apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_misc.o] Error 1
make[2]: *** Waiting for unfinished jobs....
In file included from ./include/linux/sched/task.h:11:0,
                 from ./include/linux/sched/signal.h:9,
                 from ./include/linux/rcuwait.h:6,
                 from ./include/linux/percpu-rwsem.h:7,
                 from ./include/linux/fs.h:33,
                 from ./include/linux/seq_file.h:11,
                 from /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/common/mali_osk.h:19,
                 from /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/common/mali_group.h:14,
                 from /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/common/mali_pm_metrics.h:16,
                 from /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/include/linux/mali/mali_utgard.h:22,
                 from /apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_mali.c:18:
./include/linux/uaccess.h: In function ‘force_uaccess_begin’:
./include/linux/uaccess.h:23:2: error: implicit declaration of function ‘set_fs’; did you mean ‘get_fs’? [-Werror=implicit-function-declaration]
  set_fs(USER_DS);
  ^~~~~~
  get_fs
cc1: some warnings being treated as errors
scripts/Makefile.build:279: recipe for target '/apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_mali.o' failed
make[2]: *** [/apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali/linux/mali_osk_mali.o] Error 1
Makefile:1805: recipe for target '/apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali' failed
make[1]: *** [/apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali] Error 2
make[1]: Leaving directory '/apps/arm/friendlywrt-rk3328/linux-5.10-rc1'
Makefile:197: recipe for target 'modules' failed
make: *** [modules] Error 2
make: Leaving directory '/apps/arm/friendlywrt-rk3328/sunxi-mali/r6p2/src/devicedrv/mali'
Error building the driver

giuliobenetti commented 3 years ago

@avafinger @kjngineering I've opened a new Issue for this: https://github.com/mripard/sunxi-mali/issues/88

@mripard do you mind if I take care of solving the Linux 5.10 build failure?

kjngineering commented 3 years ago

> Aside from the DEVFREQ comment, nothing really pops out. Can you try to enable it?

Hi Maxime. After reading through several other issues, this seems to be the same problem:

https://github.com/mripard/sunxi-mali/issues/54#issuecomment-583716509
https://github.com/mripard/sunxi-mali/issues/56#issuecomment-449844912
https://github.com/mripard/sunxi-mali/issues/56#issuecomment-449845590

The notes indicate a regression in 4.19, fixed in 4.20. Could we be seeing something similar again?

Edit: as per some of those comments I checked the stability of the power supply: 3v3 was at 3340mV and 5v0 was at 4890mV

5v0 drops a little as soon as I start any Mali app (malitest or glmark2), just before the crash, but this is not surprising as the chip is basically idle until then. I think my supply is pretty stable and not the issue. I will double check with a bench supply later.

avafinger commented 3 years ago

@giuliobenetti Looks like Buildroot has some patches applied against 5.10, for example for the set_fs(USER_DS) case. I vaguely remember doing something about this on 5.3, 5.4, and 5.7. I can fix my kernel, so it's not a big deal.

avafinger commented 3 years ago

@kjngineering You need to set DRM_FBDEV_LEAK_PHYS_SMEM=y

giuliobenetti commented 3 years ago

@avafinger I've made it build without that macro. Here is a macro-patch to be rebased: https://pastebin.com/u2PnCvM7

AND this: https://pastebin.com/Ppparc32

Can you try applying those 2 and check if it builds correctly?

I'm going to finish tomorrow.

Thank you

kjngineering commented 3 years ago

> DRM_FBDEV_LEAK_PHYS_SMEM=y

AVAFINGER - SO CLOSE!

This option is hidden in Buildroot but can be enabled with: General Setup > Configure standard kernel features (expert users) (i.e. EXPERT=y). It then appears under the FBDEV_OVERALLOC Kconfig menu option.

This by itself is not enough: there is a broken part of the code in drm_fb_helper.c that must be changed manually at build time. Many thanks to the users over at whycan.cn who documented this.

/*
 * In order to keep user-space compatibility, we want in certain use-cases
 * to keep leaking the fbdev physical address to the user-space program
 * handling the fbdev buffer.
 * This is a bad habit essentially kept into closed source opengl driver
 * that should really be moved into open-source upstream projects instead
 * of using legacy physical addresses in user space to communicate with
 * other out-of-tree kernel modules.
 *
 * This module_param *should* be removed as soon as possible and be
 * considered as a broken and legacy behaviour from a modern fbdev device.
 */
#if IS_ENABLED(CONFIG_DRM_FBDEV_LEAK_PHYS_SMEM)
static bool drm_leak_fbdev_smem = false;
module_param_unsafe(drm_leak_fbdev_smem, bool, 0600);
MODULE_PARM_DESC(drm_leak_fbdev_smem,
         "Allow unsafe leaking fbdev physical smem address [default=false]");
#endif

Change:

static bool drm_leak_fbdev_smem = false;

to:

static bool drm_leak_fbdev_smem = true;

and recompile the kernel. I now have fully functioning Mali on mainline in FBDEV:

# malitest
EGL Version: "1.4 Linux-r6p2-01rel0"
EGL Vendor: "ARM"
EGL Extensions: "EGL_KHR_image EGL_KHR_image_base EGL_KHR_image_pixmap EGL_KHR_gl_texture_2D_image EGL_KHR_gl_texture_cubemap_image EGL_KHR_gl_renderbuffer_image EGL_KHR_reusable_sync EGL_KHR_fence_sync EGL_KHR_lock_surface EGL_KHR_lock_surface2 EGL_EXT_create_context_robustness EGL_ANDROID_blob_cache EGL_KHR_create_context EGL_KHR_partial_update EGL_KHR_create_context_no_error "
Surface size: 800x800
GL Vendor: "ARM"
GL Renderer: "Mali-400 MP"
GL Version: "OpenGL ES 2.0"
GL Extensions: "GL_OES_texture_npot GL_OES_vertex_array_object GL_OES_compressed_ETC1_RGB8_texture GL_EXT_compressed_ETC1_RGB8_sub_texture GL_OES_standard_derivatives GL_OES_EGL_image GL_OES_depth24 GL_ARM_rgba8 GL_ARM_mali_shader_binary GL_OES_depth_texture GL_OES_packed_depth_stencil GL_EXT_texture_format_BGRA8888 GL_OES_vertex_half_float GL_EXT_blend_minmax GL_OES_EGL_image_external GL_OES_EGL_sync GL_OES_rgb8_rgba8 GL_EXT_multisampled_render_to_texture GL_EXT_discard_framebuffer GL_OES_get_program_binary GL_ARM_mali_program_binary GL_EXT_shader_texture_lod GL_EXT_robustness GL_OES_depth_texture_cube_map GL_KHR_debug GL_ARM_shader_framebuffer_fetch GL_ARM_shader_framebuffer_fetch_depth_stencil GL_OES_mapbuffer GL_KHR_no_error"

glmark2 was limited to 60 fps (the HDMI refresh rate, it looks like).

=======================================================
                                  glmark2 Score: 55
=======================================================

The full glmark2-fbdev result is here

So there is a patch that needs to be pushed back to mainline to fix the offending DRM code for this to function -and- some notes should probably be made on the README for this repository.

giuliobenetti commented 3 years ago

@kjngineering I'm not sure that's a bug: it is documented as unsafe when enabled, which is (I think) why they added it as a module_param. That way you can enable it when loading the module or at boot. So I'd rather try passing drm_leak_fbdev_smem=1 when loading fbdev (rare) or in bootargs/dts.

kjngineering commented 3 years ago

> @kjngineering I'm not sure that's a bug: it is documented as unsafe when enabled, which is (I think) why they added it as a module_param. That way you can enable it when loading the module or at boot. So I'd rather try passing drm_leak_fbdev_smem=1 when loading fbdev (rare) or in bootargs/dts.

The actual build-time option, CONFIG_DRM_FBDEV_LEAK_PHYS_SMEM, is hard enough to enable and provides sufficient warning. But I will try with a bootarg and see what happens.

kjngineering commented 3 years ago

I attempted to leave drm_fb_helper.c unmodified and instead pass drm_leak_fbdev_smem=1 as a boot argument (in this case in the bootscript).

[    0.000000] Kernel command line: console=ttyS0,115200 earlyprintk root=/dev/mmcblk0p2 drm_leak_fbdev_smem=1 rootwait
[    0.000000] Dentry cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
[    0.000000] Inode-cache hash table entries: 32768 (order: 5, 131072 bytes, linear)
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 410280K/491520K available (7168K kernel code, 491K rwdata, 1800K rodata, 1024K init, 245K bss, 15704K reserved, 65536K cma-reserved, 0K highmem)

Unfortunately this resulted in the same hanging behavior as what started this thread. I am happy to try anything else @giuliobenetti

Edit: drm_kms_helper.drm_leak_fbdev_smem=1 Success! This must be set as a bootarg, and CONFIG_DRM_FBDEV_LEAK_PHYS_SMEM must be set at build time. The flag lives under the drm_kms_helper module, so it needs that prefix on the command line; that was the missing piece of the puzzle.
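
For anyone checking their own board, the difference between the failing and the working command line can be tested with a few lines of shell. This is just a rough sketch: the function name `has_leak_param` is made up for illustration, and it only looks at the command-line text, so it cannot tell whether CONFIG_DRM_FBDEV_LEAK_PHYS_SMEM was enabled at build time.

```shell
#!/bin/sh
# Return 0 if the given kernel command line enables the fbdev smem leak
# with the drm_kms_helper module prefix, 1 otherwise.
# (Hypothetical helper, for illustration only.)
has_leak_param() {
    for arg in $1; do
        case "$arg" in
            drm_kms_helper.drm_leak_fbdev_smem=1|\
            drm_kms_helper.drm_leak_fbdev_smem=y|\
            drm_kms_helper.drm_leak_fbdev_smem=Y)
                return 0 ;;
        esac
    done
    return 1
}

# Check the running system when /proc/cmdline is readable.
if [ -r /proc/cmdline ]; then
    if has_leak_param "$(cat /proc/cmdline)"; then
        echo "bootarg present"
    else
        echo "bootarg missing: add drm_kms_helper.drm_leak_fbdev_smem=1"
    fi
fi
```

With the command line from the log above (bare `drm_leak_fbdev_smem=1`, no module prefix) the function returns 1, which matches the hang I saw; with the prefixed form it returns 0.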

mripard commented 3 years ago

Thanks, I've added it to the README so it's clear for the next users

avafinger commented 3 years ago

Well, it was mentioned here: https://github.com/mripard/sunxi-mali/issues/56#issuecomment-450351976