ardera / flutter-pi

A light-weight Flutter Engine Embedder for Linux Embedded that runs without X11 or Wayland.
MIT License

Release mode app is slow and simple button touch is sluggish #125

Open GHCRajan opened 3 years ago

GHCRajan commented 3 years ago

Hello,

First of all, sincere appreciation for your efforts with this flutter-pi renderer. Out of the various renderers I have tried on my AM4372 GP EVM (ARMv7, 1GHz, PVR SGX GPU, 1GB RAM, 7" Display), this one worked like a charm.

I am able to build and run apps in both DEBUG and RELEASE modes (locally created ones as well as particle_background, gallery, etc.). For the past few days, however, I have not been able to make progress in improving the application performance.

The default button application takes more than 1.3 seconds between the touch event and the display update of the counter. I added trace messages to flutter-pi and observed the following:

{
[event #1605890907099] count: 1, evt0: kind: 2, phase: 2
[event #1605890907137] count: 1, evt0: kind: 2, phase: 1
[present in #1605890908313] count: 1
[present out #1605890908319]
[present in #1605890908334] count: 1
[present out #1605890908340]
}
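For anyone who wants to quantify the delay in a trace like this, here is a minimal Python sketch. It assumes the numbers after `#` are timestamps in milliseconds (the post does not state the unit), and the helper names `parse_trace` and `touch_to_present_latency` are made up for illustration; they are not part of flutter-pi.

```python
import re

# Trace lines copied from the post above. Assumption: the value after '#'
# is a timestamp in milliseconds.
TRACE = """\
[event #1605890907099] count: 1, evt0: kind: 2, phase: 2
[event #1605890907137] count: 1, evt0: kind: 2, phase: 1
[present in #1605890908313] count: 1
[present out #1605890908319]
[present in #1605890908334] count: 1
[present out #1605890908340]
"""

LINE_RE = re.compile(r"\[(event|present in|present out) #(\d+)\]")


def parse_trace(text):
    """Return (tag, timestamp) tuples in the order they appear in the trace."""
    return [(m.group(1), int(m.group(2))) for m in LINE_RE.finditer(text)]


def touch_to_present_latency(records):
    """Return (delay from first touch event, delay from last touch event)
    to the first 'present in' record, in the trace's time unit."""
    presents = [ts for tag, ts in records if tag == "present in"]
    if not presents:
        raise ValueError("trace contains no 'present in' record")
    first_present = presents[0]
    events = [ts for tag, ts in records if tag == "event" and ts <= first_present]
    if not events:
        raise ValueError("trace contains no touch event before the first present")
    return first_present - events[0], first_present - events[-1]


from_first, from_last = touch_to_present_latency(parse_trace(TRACE))
print(f"first event -> first present: {from_first} ms")  # 1214 ms
print(f"last event  -> first present: {from_last} ms")   # 1176 ms
```

Under that assumption, each present call itself only takes about 6 ms (present in to present out), so the bulk of the delay sits between the touch event and the first present.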

Touch and GPU appear to be instantaneous (glmark2_es2_wayland score around 50), whereas the Flutter engine appears to take more time (CPU load > 40% for a simple touch). I have tried various options (the release engine from your repo, a locally built one, forcing the processor to 1 GHz from userland instead of on-demand, a faster uSD card, etc.) without much improvement. The engine is built and running in release mode (no DEBUG ribbon on the GUI, binary size around 7 MB, etc.). There are no errors in generating app.so / the bundle.

Looking forward to your expert suggestions on how to analyze the problematic area and get this running smoothly. Let me know if more details are needed.

Thanks

ardera commented 3 years ago

glmark2_es2_wayland score around 50

50 is actually very low, though you need to make sure you're not limited by vsync. If your FPS and FrameTime are the same for most tests, you probably are. I recommend using glmark2-es2-drm with the --off-screen argument.
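The vsync check described above can be roughly automated. The sketch below is only an illustration: it assumes glmark2 prints one result line per scene containing `FPS: <n> FrameTime: <t> ms`, and the function name, the 5% tolerance and the 80% majority threshold are arbitrary choices, not anything defined by glmark2.

```python
import re

# Assumption: each glmark2 scene result line contains "FPS: <int> FrameTime: <float> ms".
RESULT_RE = re.compile(r"FPS:\s*(\d+)\s+FrameTime:\s*([\d.]+)\s*ms")


def parse_results(output):
    """Extract (fps, frame_time_ms) pairs from captured glmark2 output."""
    return [(int(fps), float(ms)) for fps, ms in RESULT_RE.findall(output)]


def looks_vsync_limited(output, tolerance=0.05, majority=0.8):
    """Heuristic: True if most scenes report (nearly) the same FPS.

    tolerance: allowed relative deviation from the median FPS (arbitrary).
    majority:  fraction of scenes that must sit within that tolerance (arbitrary).
    """
    fps_values = sorted(fps for fps, _ in parse_results(output))
    if not fps_values:
        return False
    median = fps_values[len(fps_values) // 2]
    if median == 0:
        return False
    close = sum(1 for fps in fps_values if abs(fps - median) <= tolerance * median)
    return close / len(fps_values) >= majority
```

One way to use it is to save the benchmark output to a file and pass the file contents to `looks_vsync_limited`. If it returns True, the score probably reflects the display refresh rate rather than the GPU.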

The glmark2-es2-drm score with --off-screen for my Pi 4 is 461.

PVR SGX GPU

PowerVR has a wide range of GPUs. Without a specific model name, it's hard to tell how capable your GPU is. I googled a bit, and found the datasheet for AM437x Sitara™ Processors (Rev. E) and, at least in that datasheet, it says the GPU is a PowerVR SGX 530, which was released in 2005. Your specific board / chip may be different though.

Can you give me the complete output of flutter-pi? I'm especially interested in the EGL / GLES info that's printed out.

You can also enable the performance overlay in flutter. The easiest way to do that is by passing showPerformanceOverlay: true to the MaterialApp or WidgetsApp constructor.

GHCRajan commented 3 years ago

Thank you for your reply.

Please find below the output of flutter-pi around EGL / GLES info.

EGL information:
  version: 1.5
  vendor: "Mesa Project"
  client extensions: "EGL_EXT_client_extensions EGL_EXT_device_base EGL_EXT_device_enumeration EGL_EXT_device_query EGL_EXT_platform_base EGL_KHR_client_get_all_proc_addresses EGL_KHR_debug EGL_EXT_platform_wayland EGL_MESA_platform_gbm EGL_MESA_platform_surfaceless"
  display extensions: "EGL_EXT_buffer_age EGL_EXT_create_context_robustness EGL_EXT_image_dma_buf_import EGL_IMG_cl_image EGL_KHR_config_attribs EGL_KHR_create_context EGL_KHR_fence_sync EGL_KHR_get_all_proc_addresses EGL_KHR_gl_renderbuffer_image EGL_KHR_gl_texture_2D_image EGL_KHR_gl_texture_cubemap_image EGL_KHR_image EGL_KHR_image_base EGL_KHR_image_pixmap EGL_KHR_no_config_context EGL_KHR_reusable_sync EGL_KHR_surfaceless_context EGL_EXT_pixel_format_float EGL_KHR_wait_sync EGL_MESA_configless_context EGL_MESA_drm_image EGL_WL_bind_wayland_display "

OpenGL ES information:
  version: "OpenGL ES 2.0 build 1.17@4948957"
  shading language version: "OpenGL ES GLSL ES 1.00 build 1.17@4948957"
  vendor: "Imagination Technologies"
  renderer: "PowerVR SGX 530"
  extensions: "GL_OES_compressed_ETC1_RGB8_texture GL_OES_depth24 GL_OES_depth_texture GL_OES_egl_sync GL_OES_element_index_uint GL_OES_EGL_image GL_OES_EGL_image_external GL_OES_fbo_render_mipmap GL_OES_fragment_precision_high GL_OES_get_program_binary GL_OES_mapbuffer GL_OES_packed_depth_stencil GL_OES_required_internalformat GL_OES_rgb8_rgba8 GL_OES_standard_derivatives GL_OES_surfaceless_context GL_OES_texture_float GL_OES_texture_half_float GL_OES_vertex_array_object GL_OES_vertex_half_float GL_EXT_blend_minmax GL_EXT_discard_framebuffer GL_EXT_multi_draw_arrays GL_EXT_multisampled_render_to_texture GL_EXT_shader_texture_lod GL_EXT_texture_format_BGRA8888 GL_EXT_texture_rg GL_IMG_multisampled_render_to_texture GL_IMG_program_binary GL_IMG_read_format GL_IMG_shader_binary GL_IMG_texture_compression_pvrtc GL_IMG_texture_format_BGRA8888 GL_IMG_texture_npot GL_IMG_uniform_buffer_object GL_KHR_debug GL_EXT_texture_storage"

Thank you for your suggestion. I have enabled the performance overlay with the default application. Please find below the numbers after a few button touches.

Raster max: 1480 ms/frame, avg 213.1 ms/frame
UI max: 298 ms/frame, avg 5 ms/frame

The above values are based on the engine binaries 1.22.4.

When I ran glmark2-es2-drm --off-screen, the final result was "glmark2 Score: 326", ignoring the repeated message: PVR:(Error): SGXKickTA: TA went out of Mem and SPM occurred during last TA kick [0, ]

Though the scores are comparable (326 versus 461), raster thread seems to be taking more time. Is there any suggestion to improve this raster thread?

Thanks

ardera commented 3 years ago

The raster thread time is basically GPU time, i.e. the time it takes the GPU to render the image. UI thread time is the time spent inside the Dart VM.
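To put the overlay numbers reported earlier in the thread into perspective, here is a small Python sketch that converts a per-frame time into an effective frame rate and compares it against the frame budget. The 60 Hz refresh rate is an assumption (the display's actual rate is not stated in this thread), and `frame_stats` is just an illustrative helper name.

```python
def frame_stats(frame_time_ms, refresh_hz=60.0):
    """Relate a per-frame time to the frame budget of a display.

    refresh_hz defaults to 60, which is an assumption about the panel.
    """
    budget_ms = 1000.0 / refresh_hz
    return {
        "budget_ms": budget_ms,                       # time available per vblank
        "effective_fps": 1000.0 / frame_time_ms,      # frames per second at this frame time
        "vblanks_per_frame": frame_time_ms / budget_ms,  # how many refreshes one frame spans
    }


# Frame times taken from the performance overlay numbers posted above.
reported = [
    ("raster avg", 213.1),
    ("raster max", 1480.0),
    ("UI avg", 5.0),
    ("UI max", 298.0),
]

for label, ms in reported:
    s = frame_stats(ms)
    print(f"{label:>10}: {ms:7.1f} ms/frame, "
          f"{s['effective_fps']:6.2f} fps, "
          f"{s['vblanks_per_frame']:6.1f} vblanks per frame")
```

With a 60 Hz budget of roughly 16.7 ms, the average raster time of 213.1 ms works out to fewer than 5 frames per second, while the average UI time of 5 ms fits comfortably within one refresh.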

Though the scores are comparable (326 versus 461), raster thread seems to be taking more time. Is there any suggestion to improve this raster thread?

I agree, the raster thread time seems to be way higher than what you'd expect given the glmark score. There are some tools from PowerVR you can use to profile it; maybe you can find out what takes the GPU so long: https://www.imaginationtech.com/developers/powervr-sdk-tools/pvrtune/ You probably need to install PVRPerfServer on your AM4372 board and the PVRTune GUI on your development machine.

There's also PVRTune Complete, which sadly is only available under NDA.

GHCRajan commented 3 years ago

Thank you for your reply. Will reach out to the vendor for any possible support around this.

In the meantime, is there any possibility to configure this renderer for 30 fps or less? In other words, can the flutter-pi renderer report a "slow" GPU to the engine so that it expects a lower FPS?

Thanks

ardera commented 3 years ago

In the meantime, is there any possibility to configure this renderer for 30 fps or less? In other words, can the flutter-pi renderer report a "slow" GPU to the engine so that it expects a lower FPS?

It can't report a slow GPU, but it can somewhat emulate a slow screen and only actually report vsync to the engine every 2nd vblank or so. Why would that be helpful though? Flutter (IIRC) will just skip any frames it didn't have time to render, so emulating a slow screen wouldn't change anything. It'd just introduce additional latency.
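The point about latency can be illustrated with a toy model in Python. This is not how flutter-pi or the Flutter engine actually schedules frames. It simply assumes that a new frame may only start on a reported vsync, that vsync is reported on every `vsync_divisor`-th vblank, and that the display runs at 60 Hz. All names and parameters here are hypothetical.

```python
import math


def frame_interval_ms(render_ms, refresh_hz=60.0, vsync_divisor=1):
    """Time between consecutive frame starts in a toy model.

    Assumptions (not taken from flutter-pi): a frame starts on a reported
    vsync, takes render_ms to produce, and the next frame starts on the
    first reported vsync at or after the previous frame has finished.
    """
    reported_period_ms = vsync_divisor * 1000.0 / refresh_hz
    periods_needed = max(1, math.ceil(render_ms / reported_period_ms))
    return periods_needed * reported_period_ms


# 213.1 ms is the average raster time reported earlier in this thread;
# 5.0 ms stands in for a frame that easily fits into one refresh.
for render_ms in (213.1, 5.0):
    for divisor in (1, 2):
        interval = frame_interval_ms(render_ms, vsync_divisor=divisor)
        print(f"render {render_ms:6.1f} ms, vsync every {divisor}. vblank: "
              f"one frame per {interval:6.1f} ms ({1000.0 / interval:5.1f} fps)")
```

In this model, a 213.1 ms frame is delivered roughly every 216.7 ms with every vblank reported and roughly every 233.3 ms with every 2nd vblank reported, so halving the reported vsync rate makes a slow renderer slightly slower rather than smoother.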