directfb2 / DirectFB2

Core DirectFB library
GNU Lesser General Public License v2.1

DRM/KMS: Core/Parts: Could not initialize 'system_core' core! #108

Closed: kjngineering closed this issue 1 year ago

kjngineering commented 1 year ago

Hi,

Thanks for reviving this project. I used DirectFB a few years ago to glue SDL to a cheap SPI display on an embedded system; it wasn't the most performant thing in the world, but it got the job done. The revival and the improvements made so far make it worth considering again. Keep up the good work!

I've just run my first test using DRM/KMS directly (e.g. card0), but I am getting a segmentation fault and it is not clear what the issue is.

# df_andi

   ~~~~~~~~~~~~~~~~~~~~~~~~~~| DirectFB 2.0.0  |~~~~~~~~~~~~~~~~~~~~~~~~~~
        (c) 2017-2023  DirectFB2 Open Source Project (fork of DirectFB)
        (c) 2012-2016  DirectFB integrated media GmbH
        (c) 2001-2016  The world wide DirectFB Open Source Community
        (c) 2000-2004  Convergence (integrated media) GmbH
      ----------------------------------------------------------------

(*) DirectFB/Core: Single Application Core. (2023-04-08 11:49)
(*) DRMKMS/System: Using device /dev/dri/card0 (default)
(*) DRMKMS/System: Found 1 connectors, 1 encoders, 3 planes
(!) DRMKMS/System: No supported format!
(!) Core/Parts: Could not initialize 'system_core' core!
    --> A general initialization error occurred
Segmentation fault
#

The display is an RGB666 (18-bit) "MIPI-DPI" panel driven by the Allwinner sun4i-drm driver (DE2 interface).

The above test works fine if DirectFB2 is compiled with the fbdev option and pointed at /dev/fb0 (using DRM_FBDEV_EMULATION), with almost 2x the performance of DirectFB(1).

caramelli commented 1 year ago

Thank you for your consideration on DirectFB.

What output do you get when running the modetest tool provided by the libdrm library?

kjngineering commented 1 year ago

Sorry, I did run modetest previously and should have included it.

But I didn't look specifically at the colour modes. Would I be right in understanding that none of the modes listed for the planes supports an RGB colour mode (even if they do)?

# modetest
trying to open device 'i915'...failed
trying to open device 'amdgpu'...failed
trying to open device 'radeon'...failed
trying to open device 'nouveau'...failed
trying to open device 'vmwgfx'...failed
trying to open device 'omapdrm'...failed
trying to open device 'exynos'...failed
trying to open device 'tilcdc'...failed
trying to open device 'msm'...failed
trying to open device 'sti'...failed
trying to open device 'tegra'...failed
trying to open device 'imx-drm'...failed
trying to open device 'rockchip'...failed
trying to open device 'atmel-hlcdc'...failed
trying to open device 'fsl-dcu-drm'...failed
trying to open device 'vc4'...failed
trying to open device 'virtio_gpu'...failed
trying to open device 'mediatek'...failed
trying to open device 'meson'...failed
trying to open device 'pl111'...failed
trying to open device 'stm'...failed
trying to open device 'sun4i-drm'...done
Encoders:
id      crtc    type    possible crtcs  possible clones
46      0       none    0x00000001      0x00000001

Connectors:
id      encoder status          name            size (mm)       modes   encoders
47      0       connected       Unknown-1       154x86          1       46
  modes:
        index name refresh (Hz) hdisp hss hse htot vdisp vss vse vtot
  #0 800x480 60.06 800 1010 1012 1056 480 502 504 525 33300 flags: nhsync, nvsync; type: preferred, driver
  props:
        1 EDID:
                flags: immutable blob
                blobs:

                value:
        2 DPMS:
                flags: enum
                enums: On=0 Standby=1 Suspend=2 Off=3
                value: 0
        5 link-status:
                flags: enum
                enums: Good=0 Bad=1
                value: 0
        6 non-desktop:
                flags: immutable range
                values: 0 1
                value: 0
        4 TILE:
                flags: immutable blob
                blobs:

                value:

CRTCs:
id      fb      pos     size
45      0       (0,0)   (0x0)
  #0  nan 0 0 0 0 0 0 0 0 0 flags: ; type:
  props:
        24 VRR_ENABLED:
                flags: range
                values: 0 1
                value: 0

Planes:
id      crtc    fb      CRTC x,y        x,y     gamma size      possible crtcs
31      0       0       0,0             0,0     0               0x00000001
  formats: BG16 BG24 BX12 BX15 BX24 RG16 RG24 RX12 RX15 RX24 XB15 XB12 XB24 XR15 XR12 XR24 NV16 NV12 NV21 NV61 UYVY VYUY YUYV YVYU YU11 YU12 YU16 YV11 YV12 YV16
  props:
        8 type:
                flags: immutable enum
                enums: Overlay=0 Primary=1 Cursor=2
                value: 0
        30 IN_FORMATS:
                flags: immutable blob
                blobs:

                value:
                        01000000000000001e00000018000000
                        01000000900000004247313642473234
                        42583132425831354258323452473136
                        52473234525831325258313552583234
                        58423135584231325842323458523135
                        58523132585232344e5631364e563132
                        4e5632314e5636315559565956595559
                        59555956595659555955313159553132
                        59553136595631315956313259563136
                        ffffff3f000000000000000000000000
                        0000000000000000
                in_formats blob decoded:
                         BG16:  LINEAR
                         BG24:  LINEAR
                         BX12:  LINEAR
                         BX15:  LINEAR
                         BX24:  LINEAR
                         RG16:  LINEAR
                         RG24:  LINEAR
                         RX12:  LINEAR
                         RX15:  LINEAR
                         RX24:  LINEAR
                         XB15:  LINEAR
                         XB12:  LINEAR
                         XB24:  LINEAR
                         XR15:  LINEAR
                         XR12:  LINEAR
                         XR24:  LINEAR
                         NV16:  LINEAR
                         NV12:  LINEAR
                         NV21:  LINEAR
                         NV61:  LINEAR
                         UYVY:  LINEAR
                         VYUY:  LINEAR
                         YUYV:  LINEAR
                         YVYU:  LINEAR
                         YU11:  LINEAR
                         YU12:  LINEAR
                         YU16:  LINEAR
                         YV11:  LINEAR
                         YV12:  LINEAR
                         YV16:  LINEAR
        33 zpos:
                flags: range
                values: 0 2
                value: 0
        34 COLOR_ENCODING:
                flags: enum
                enums: ITU-R BT.601 YCbCr=0 ITU-R BT.709 YCbCr=1
                value: 1
        35 COLOR_RANGE:
                flags: enum
                enums: YCbCr limited range=0 YCbCr full range=1
                value: 0
36      0       0       0,0             0,0     0               0x00000001
  formats: BG16 BG24 BX12 BX15 BX24 RG16 RG24 RX12 RX15 RX24 XB15 XB12 XB24 XR15 XR12 XR24 NV16 NV12 NV21 NV61 UYVY VYUY YUYV YVYU YU11 YU12 YU16 YV11 YV12 YV16
  props:
        8 type:
                flags: immutable enum
                enums: Overlay=0 Primary=1 Cursor=2
                value: 0
        30 IN_FORMATS:
                flags: immutable blob
                blobs:

                value:
                        01000000000000001e00000018000000
                        01000000900000004247313642473234
                        42583132425831354258323452473136
                        52473234525831325258313552583234
                        58423135584231325842323458523135
                        58523132585232344e5631364e563132
                        4e5632314e5636315559565956595559
                        59555956595659555955313159553132
                        59553136595631315956313259563136
                        ffffff3f000000000000000000000000
                        0000000000000000
                in_formats blob decoded:
                         BG16:  LINEAR
                         BG24:  LINEAR
                         BX12:  LINEAR
                         BX15:  LINEAR
                         BX24:  LINEAR
                         RG16:  LINEAR
                         RG24:  LINEAR
                         RX12:  LINEAR
                         RX15:  LINEAR
                         RX24:  LINEAR
                         XB15:  LINEAR
                         XB12:  LINEAR
                         XB24:  LINEAR
                         XR15:  LINEAR
                         XR12:  LINEAR
                         XR24:  LINEAR
                         NV16:  LINEAR
                         NV12:  LINEAR
                         NV21:  LINEAR
                         NV61:  LINEAR
                         UYVY:  LINEAR
                         VYUY:  LINEAR
                         YUYV:  LINEAR
                         YVYU:  LINEAR
                         YU11:  LINEAR
                         YU12:  LINEAR
                         YU16:  LINEAR
                         YV11:  LINEAR
                         YV12:  LINEAR
                         YV16:  LINEAR
        38 zpos:
                flags: range
                values: 0 2
                value: 1
        39 COLOR_ENCODING:
                flags: enum
                enums: ITU-R BT.601 YCbCr=0 ITU-R BT.709 YCbCr=1
                value: 1
        40 COLOR_RANGE:
                flags: enum
                enums: YCbCr limited range=0 YCbCr full range=1
                value: 0
41      0       0       0,0             0,0     0               0x00000001
  formats: AB15 AB12 AB24 AR15 AR12 AR24 BG16 BG24 BA15 BA12 BA24 BX24 RG16 RG24 RA12 RA15 RA24 RX24 XB24 XR24
  props:
        8 type:
                flags: immutable enum
                enums: Overlay=0 Primary=1 Cursor=2
                value: 1
        30 IN_FORMATS:
                flags: immutable blob
                blobs:

                value:
                        01000000000000001400000018000000
                        01000000680000004142313541423132
                        41423234415231354152313241523234
                        42473136424732344241313542413132
                        42413234425832345247313652473234
                        52413132524131355241323452583234
                        5842323458523234ffff0f0000000000
                        00000000000000000000000000000000
                in_formats blob decoded:
                         AB15:  LINEAR
                         AB12:  LINEAR
                         AB24:  LINEAR
                         AR15:  LINEAR
                         AR12:  LINEAR
                         AR24:  LINEAR
                         BG16:  LINEAR
                         BG24:  LINEAR
                         BA15:  LINEAR
                         BA12:  LINEAR
                         BA24:  LINEAR
                         BX24:  LINEAR
                         RG16:  LINEAR
                         RG24:  LINEAR
                         RA12:  LINEAR
                         RA15:  LINEAR
                         RA24:  LINEAR
                         RX24:  LINEAR
                         XB24:  LINEAR
                         XR24:  LINEAR
        43 alpha:
                flags: range
                values: 0 65535
                value: 65535
        44 zpos:
                flags: range
                values: 0 2
                value: 2

Frame buffers:
id      size    pitch
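
For readers following along: the four-character names in the `formats:` lines are DRM fourcc codes, and the `IN_FORMATS` hex dump carries the same list in binary form. The sketch below is a minimal, illustrative decoder (not part of DirectFB2 or libdrm) that unpacks the blob reported for the primary plane (id 41), assuming the little-endian layout of the kernel's `struct drm_format_modifier_blob`. The helper name `decode_in_formats` and the small `KNOWN_RGB` table are made up for this example.

```python
import struct

# IN_FORMATS blob of the primary plane (id 41), copied from the modetest output above.
BLOB_HEX = (
    "01000000000000001400000018000000"
    "01000000680000004142313541423132"
    "41423234415231354152313241523234"
    "42473136424732344241313542413132"
    "42413234425832345247313652473234"
    "52413132524131355241323452583234"
    "5842323458523234ffff0f0000000000"
    "00000000000000000000000000000000"
)

# A few fourcc codes and their DRM_FORMAT_* names (illustrative subset only).
KNOWN_RGB = {
    "RG16": "DRM_FORMAT_RGB565",
    "RG24": "DRM_FORMAT_RGB888",
    "XR24": "DRM_FORMAT_XRGB8888",
    "AR24": "DRM_FORMAT_ARGB8888",
}


def decode_in_formats(blob):
    """Return (fourcc list, [(format bitmask, first format index, modifier)])."""
    # Header: version, flags, count_formats, formats_offset,
    # count_modifiers, modifiers_offset (six little-endian u32 values).
    _version, _flags, n_fmt, fmt_off, n_mod, mod_off = struct.unpack_from("<6I", blob, 0)
    fourccs = [
        blob[fmt_off + 4 * i:fmt_off + 4 * i + 4].decode("ascii")
        for i in range(n_fmt)
    ]
    modifiers = []
    for i in range(n_mod):
        # Each entry: u64 format bitmask, u32 offset, u32 padding, u64 modifier.
        mask, offset, _pad, modifier = struct.unpack_from("<QIIQ", blob, mod_off + 24 * i)
        modifiers.append((mask, offset, modifier))
    return fourccs, modifiers


fourccs, modifiers = decode_in_formats(bytes.fromhex(BLOB_HEX))
for code in fourccs:
    # Print each fourcc, with its DRM_FORMAT_* name when it is in the table.
    print(code, KNOWN_RGB.get(code, ""))
```

Read this way, the primary plane does advertise RGB formats such as RG16 (RGB565) and XR24 (XRGB8888), all with modifier 0 (LINEAR), which agrees with the decoded list that modetest prints.
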
caramelli commented 1 year ago

OK, you can try with this change https://github.com/directfb2/DirectFB2/pull/109

kjngineering commented 1 year ago

That commit fixed it! Good work!

# df_andi

   ~~~~~~~~~~~~~~~~~~~~~~~~~~| DirectFB 2.0.0  |~~~~~~~~~~~~~~~~~~~~~~~~~~
        (c) 2017-2023  DirectFB2 Open Source Project (fork of DirectFB)
        (c) 2012-2016  DirectFB integrated media GmbH
        (c) 2001-2016  The world wide DirectFB Open Source Community
        (c) 2000-2004  Convergence (integrated media) GmbH
      ----------------------------------------------------------------

(*) DirectFB/Core: Single Application Core. (2023-04-10 01:02)
(*) DRMKMS/System: Using device /dev/dri/card0 (default)
(*) DRMKMS/System: Found 1 connectors, 1 encoders, 3 planes
(*) DirectFB/Input: Hot-plug detection enabled with Linux Input
(*) DirectFB/Genefx: NEON enabled
(*) DirectFB/Graphics: Genefx Software Rasterizer 0.7 (DirectFB)
(*) DRMKMS/Screen: Default mode is 800x480 (1 modes in total)
(*) DRMKMS/Layer: Supported properties for layer id 36
(*)      zpos
(*) DRMKMS/Layer: Supported properties for layer id 41
(*)      alpha
(*)      zpos
(*) DirectFB/Core/WM: Default 0.3 (DirectFB)
(*) Direct/Interface: Loaded 'DGIFF' implementation of 'IDirectFBFont'
(*) Direct/Interface: Loaded 'DFIFF' implementation of 'IDirectFBImageProvider'

Performance is not great: df_andi gets ~9.2 FPS with 0.0% CPU idle on DRM/KMS.

The FBDev implementation (using the DRM FBDev emulation layer) gets ~43 FPS with 0.0% CPU idle.

DirectFB (1.7) using FBDev (again via the DRM emulation layer) gets ~27 FPS with 30.0% CPU idle.
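
To put those three df_andi figures side by side, here is a small sketch that only divides the frame rates quoted above. The dictionary keys and the `speedup` helper are labels invented for this example; the numbers are single readings and take no account of the different CPU idle figures.

```python
# Frame rates reported above for df_andi, in frames per second.
fps = {
    "DirectFB2 DRM/KMS": 9.2,
    "DirectFB2 FBDev": 43.0,
    "DirectFB 1.7 FBDev": 27.0,
}


def speedup(faster, slower):
    """Ratio of two of the reported frame rates."""
    return fps[faster] / fps[slower]


print(f"DirectFB2 FBDev vs DirectFB 1.7 FBDev: {speedup('DirectFB2 FBDev', 'DirectFB 1.7 FBDev'):.2f}x")
print(f"DirectFB2 FBDev vs DirectFB2 DRM/KMS:  {speedup('DirectFB2 FBDev', 'DirectFB2 DRM/KMS'):.2f}x")
```

On these figures, DirectFB2 on FBDev is roughly 1.6x DirectFB 1.7 on FBDev and roughly 4.7x DirectFB2 on DRM/KMS.
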

kjngineering commented 1 year ago

Benchmarks using df_dok: no changes other than one build compiled for fbdev and one for DRM/KMS.

# df_dok

   ~~~~~~~~~~~~~~~~~~~~~~~~~~| DirectFB 2.0.0  |~~~~~~~~~~~~~~~~~~~~~~~~~~
        (c) 2017-2023  DirectFB2 Open Source Project (fork of DirectFB)
        (c) 2012-2016  DirectFB integrated media GmbH
        (c) 2001-2016  The world wide DirectFB Open Source Community
        (c) 2000-2004  Convergence (integrated media) GmbH
      ----------------------------------------------------------------

(*) DirectFB/Core: Single Application Core. (2023-04-10 01:02)
(*) DRMKMS/System: Using device /dev/dri/card0 (default)
(*) DRMKMS/System: Found 1 connectors, 1 encoders, 3 planes
(*) DirectFB/Input: Hot-plug detection enabled with Linux Input
(*) DirectFB/Genefx: NEON enabled
(*) DirectFB/Graphics: Genefx Software Rasterizer 0.7 (DirectFB)
(*) DRMKMS/Screen: Default mode is 800x480 (1 modes in total)
(*) DRMKMS/Layer: Supported properties for layer id 36
(*)      zpos
(*) DRMKMS/Layer: Supported properties for layer id 41
(*)      alpha
(*)      zpos
(*) DirectFB/Core/WM: Default 0.3 (DirectFB)
(*) Direct/Interface: Loaded 'DGIFF' implementation of 'IDirectFBFont'
(*) Direct/Interface: Loaded 'DFIFF' implementation of 'IDirectFBImageProvider'
Benchmarking 256x256 on 800x459 RGB16 (16bit)...
Anti-aliased Text                              3.091 secs (   29.116 KChars/sec) [100.0%]
Anti-aliased Text (blend)                      3.155 secs (   20.538 KChars/sec) [100.0%]
Fill Rectangle                                 3.000 secs (  443.460 MPixel/sec) [100.0%]
Fill Rectangle (blend)                         3.330 secs (    7.872 MPixel/sec) [100.0%]
Fill Rectangles [10]                           3.018 secs (  456.015 MPixel/sec) [100.3%]
Fill Rectangles [10] (blend)                   8.257 secs (    7.937 MPixel/sec) [100.0%]
Fill Triangles                                 3.004 secs (  278.157 MPixel/sec) [100.3%]
Fill Triangles (blend)                         3.054 secs (    7.510 MPixel/sec) [100.0%]
Draw Rectangle                                 3.002 secs (   30.046 KRects/sec) [100.0%]
Draw Rectangle (blend)                         3.008 secs (    2.227 KRects/sec) [100.3%]
Draw Lines [10]                                3.008 secs (  104.720 KLines/sec) [100.0%]
Draw Lines [10] (blend)                        3.019 secs (    9.274 KLines/sec) [100.0%]
Fill Spans                                     3.009 secs (  413.819 MPixel/sec) [100.0%]
Fill Spans (blend)                             3.385 secs (    7.744 MPixel/sec) [100.0%]
Fill Trapezoids [10]                           3.104 secs (  380.041 MPixel/sec) [100.3%]
Blit                                           3.353 secs (   11.727 MPixel/sec) [100.0%]
Blit 180                                       4.365 secs (    4.504 MPixel/sec) [100.2%]
Blit colorkeyed                                3.381 secs (    5.815 MPixel/sec) [100.0%]
Blit with format conversion                    3.247 secs (    8.073 MPixel/sec) [100.0%]
Blit with colorizing                           3.475 secs (   11.315 MPixel/sec) [100.2%]
Blit from 32bit (blend)                        5.627 secs (    2.329 MPixel/sec) [100.0%]
Blit from 32bit (blend) with colorizing        3.533 secs (    3.709 MPixel/sec) [100.0%]
Blit SrcOver (premultiplied source)            3.291 secs (    3.982 MPixel/sec) [100.0%]
Blit SrcOver (premultiply source)              3.594 secs (    3.646 MPixel/sec) [100.2%]
Stretch Blit                                   7.160 secs (    4.487 MPixel/sec) [100.0%]
Stretch Blit colorkeyed                        7.234 secs (    4.441 MPixel/sec) [100.0%]
# df_dok

   ~~~~~~~~~~~~~~~~~~~~~~~~~~| DirectFB 2.0.0  |~~~~~~~~~~~~~~~~~~~~~~~~~~
        (c) 2017-2023  DirectFB2 Open Source Project (fork of DirectFB)
        (c) 2012-2016  DirectFB integrated media GmbH
        (c) 2001-2016  The world wide DirectFB Open Source Community
        (c) 2000-2004  Convergence (integrated media) GmbH
      ----------------------------------------------------------------

(*) DirectFB/Core: Single Application Core. (2023-04-09 22:23)
(*) FBDev/System: Using device /dev/fb0 (default)
(*) FBDev/System: Found 'sun4i-drmdrmfb' (ID 0) with framebuffer at 0x00000000, 1500k (MMIO 0x00000000, 0k)
(*) DirectFB/Input: Hot-plug detection enabled with Linux Input
(*) DirectFB/Genefx: NEON enabled
(*) DirectFB/Graphics: Generic Software Rasterizer 0.7 (DirectFB)
(*) FBDev/Screen: Default mode is 800x480 (0 modes in total)
(*) DirectFB/Core/WM: Default 0.3 (DirectFB)
(*) FBDev/Mode: Setting 800x480 RGB32
(*) FBDev/Mode: Switched to 800x480 (virtual 800x480) at 32 bits (RGB32), pitch 3200
(*) Direct/Interface: Loaded 'DGIFF' implementation of 'IDirectFBFont'
(*) Direct/Interface: Loaded 'DFIFF' implementation of 'IDirectFBImageProvider'
(*) FBDev/Mode: Setting 800x480 RGB32
(*) FBDev/Mode: Switched to 800x480 (virtual 800x480) at 32 bits (RGB32), pitch 3200
Benchmarking 256x256 on 800x459 RGB32 (32bit)...
Anti-aliased Text                              3.000 secs (   92.400 KChars/sec) [100.0%]
Anti-aliased Text (blend)                      3.159 secs (   20.512 KChars/sec) [100.3%]
Fill Rectangle                                 3.007 secs (  217.944 MPixel/sec) [100.3%]
Fill Rectangle (blend)                         3.511 secs (    3.733 MPixel/sec) [100.0%]
Fill Rectangles [10]                           3.254 secs (  221.541 MPixel/sec) [100.3%]
Fill Rectangles [10] (blend)                  17.525 secs (    3.739 MPixel/sec) [100.0%]
Fill Triangles                                 3.017 secs (  170.519 MPixel/sec) [100.3%]
Fill Triangles (blend)                         3.539 secs (    3.703 MPixel/sec) [100.2%]
Draw Rectangle                                 3.003 secs (   28.837 KRects/sec) [100.0%]
Draw Rectangle (blend)                         3.046 secs (    2.035 KRects/sec) [100.0%]
Draw Lines [10]                                3.006 secs (   99.800 KLines/sec) [100.3%]
Draw Lines [10] (blend)                        3.038 secs (    9.874 KLines/sec) [100.3%]
Fill Spans                                     3.006 secs (  252.900 MPixel/sec) [100.3%]
Fill Spans (blend)                             3.525 secs (    3.718 MPixel/sec) [100.0%]
Fill Trapezoids [10]                           3.218 secs (  203.654 MPixel/sec) [100.0%]
Blit                                           3.040 secs (  107.789 MPixel/sec) [100.0%]
Blit 180                                       3.001 secs (   96.087 MPixel/sec) [100.0%]
Blit colorkeyed                                3.049 secs (   90.275 MPixel/sec) [100.0%]
Blit with format conversion                    3.062 secs (   34.244 MPixel/sec) [100.0%]
Blit with colorizing                           3.283 secs (   19.962 MPixel/sec) [100.0%]
Blit from 32bit (blend)                        3.229 secs (    4.059 MPixel/sec) [100.3%]
Blit from 32bit (blend) with colorizing        4.110 secs (    3.189 MPixel/sec) [100.0%]
Blit SrcOver (premultiplied source)            3.600 secs (    7.281 MPixel/sec) [100.0%]
Blit SrcOver (premultiply source)              3.016 secs (   21.729 MPixel/sec) [100.0%]
Stretch Blit                                   3.000 secs (  112.313 MPixel/sec) [100.0%]
Stretch Blit colorkeyed                        3.007 secs (   82.687 MPixel/sec) [100.0%]

Blit and alpha performance seem to really struggle in DRM/KMS.

FBDev is a huge performance increase over DirectFB (1.7).
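
For anyone wanting to compare two df_dok runs line by line, the following is a rough sketch that parses result lines and prints the FBDev/DRM-KMS throughput ratio per test. The regular expression, the function names and the four sample lines per run (copied from the logs above) are choices made for this example, not a df_dok feature. Note that, according to the "Benchmarking" lines, the DRM/KMS run used RGB16 and the FBDev run used RGB32, so the ratios are indicative rather than like-for-like.

```python
import re

# Matches e.g. "Blit        3.353 secs (   11.727 MPixel/sec) [100.0%]".
LINE_RE = re.compile(r"^(.+?)\s+(\d+\.\d+) secs \(\s*(\d+\.\d+) ([^)\s]+)\)")

# A few result lines copied from the two df_dok runs above.
DRMKMS_LOG = """
Fill Rectangle                                 3.000 secs (  443.460 MPixel/sec) [100.0%]
Fill Rectangle (blend)                         3.330 secs (    7.872 MPixel/sec) [100.0%]
Blit                                           3.353 secs (   11.727 MPixel/sec) [100.0%]
Stretch Blit                                   7.160 secs (    4.487 MPixel/sec) [100.0%]
"""

FBDEV_LOG = """
Fill Rectangle                                 3.007 secs (  217.944 MPixel/sec) [100.3%]
Fill Rectangle (blend)                         3.511 secs (    3.733 MPixel/sec) [100.0%]
Blit                                           3.040 secs (  107.789 MPixel/sec) [100.0%]
Stretch Blit                                   3.000 secs (  112.313 MPixel/sec) [100.0%]
"""


def parse_df_dok(text):
    """Map benchmark name -> (rate, unit) for every result line in text."""
    results = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            name, _secs, rate, unit = match.groups()
            results[name] = (float(rate), unit)
    return results


def compare(fbdev_text, drmkms_text):
    """Return name -> fbdev rate / drmkms rate for tests present in both runs."""
    fbdev = parse_df_dok(fbdev_text)
    drmkms = parse_df_dok(drmkms_text)
    return {
        name: fbdev[name][0] / drmkms[name][0]
        for name in fbdev
        if name in drmkms and fbdev[name][1] == drmkms[name][1]
    }


ratios = compare(FBDEV_LOG, DRMKMS_LOG)
for name, ratio in ratios.items():
    print(f"{name:<28} fbdev/drmkms = {ratio:.2f}")
```

A ratio above 1 means the FBDev run was faster for that test. In this sample the blits come out well above 1, while the fills come out below 1.
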

Let me know if you would like any more tests run, I can build and test quite quickly now.

Also massive shoutout to @fifteenhex for the Buildroot packages for DFB2 - I would highly recommend switching to official sources and committing them to the buildroot repository.

caramelli commented 1 year ago

Interesting, these differences are consistent with results I get.

I guess we can get good performance if the drmkms system module is combined with a GFX driver module for hardware acceleration. But comparing to the fbdev system module like you did, I'm also convinced that this drmkms implementation could be improved when no GFX driver module is available. It's on my TODO list, and any help making it is welcome.

Note that if you can access a 3D GPU via OpenGL ES, the DirectFB2-eglgbm system module combined with the DirectFB2-gles2 GFX driver provide great performance with the df_andi and df_dok examples.

kjngineering commented 1 year ago

@caramelli

> But comparing to the fbdev system module like you did, I'm also convinced that this drmkms implementation could be improved when no GFX driver module is available. It's on my TODO list, and any help making it is welcome.

Well, in this test case the fbdev implementation, as I understand it, is actually an FBDEV emulation shim (DRM_FBDEV_EMULATION) provided by DRM, so I had expected DRM/KMS to be quicker on the basis that it is theoretically a shorter data path with less data manipulation.

/dev/fb0 ----> [emulation shim] ----> /dev/dri/card0 ----> [display pipeline/mixer] ----> panel etc.

Either way, fbdev is pretty quick regardless, so it's at least a means to an end. At the moment it's the only practical option for using SDL2 on small unaccelerated devices without running a full Mesa stack with the software rasteriser enabled, since they deprecated the fbdev endpoint.

This particular processor doesn't have any GPU, but I have another devkit coming which has a Mali 400 MP2 (LIMA) which I will test in due course and try and provide some results.

fifteenhex commented 1 year ago

> Also massive shoutout to @fifteenhex for the Buildroot packages for DFB2 - I would highly recommend switching to official sources and committing them to the buildroot repository.

Yeah, that would make my life easier, as getting SDL/SDL2 to link against DFB2 with DFB2 being external is a pain. If you want to have a go at submitting to buildroot, be my guest. If I do it, we're talking a few months until I have time.

BTW.. I think most Allwinner chips have a 2D blitter? You could write a driver for that like I did for the MStar/Sigmastar GE and that should help a lot.

Also from what I remember of the code there seems to be code paths for handling stereo displays etc. I think there might be some gains in making all of that stuff configurable or maybe just dropping it as I doubt anyone will step up to test it works.

kjngineering commented 1 year ago

> BTW.. I think most Allwinner chips have a 2D blitter? You could write a driver for that like I did for the MStar/Sigmastar GE and that should help a lot.

The Display Engine 2.0 in my test chip has some mixer support in mainline (but not the full capability of the peripheral). Based on the modetest output above, it supports the primary plane plus two overlays. I assume these are the hardware blit layers? I would love to have a crack at expanding the driver, but I am most definitely a hardware guy, not a software one.

kjngineering commented 1 year ago

Resolved with pull #109.

Issue #110 created regarding DRM/KMS vs FBDev performance issues.