anholt / linux


vc4 and slow screen updates at higher resolution videos/streams #136

Open nkichukov opened 6 years ago

nkichukov commented 6 years ago

Hello,

When using the vc4 open-source driver on a Raspberry Pi 3B, playing a movie at anything above 720x576 makes the screen slow to refresh, which causes a slow-motion effect. For reference, the stream is hls,applehttp and Input #0 is reported as: Video: h264 (Main) ([27][0][0][0] / 0x001B), yuv420p, 720x576 [SAR 16:11 DAR 20:11], 50 fps, 50 tbr, 90k tbn, 100 tbc. 720p is even worse, not to mention 1080p videos. The player is Kodi 17.6 running on the Xorg server.

At 720p, the 4 CPU cores are at 75-80% utilization.

config.txt is configured with: dtoverlay=vc4-kms-v3d,cma-256

Thank you! -N

nkichukov commented 6 years ago

Exporting VC4_DEBUG=perf and then starting Kodi shows:

...
Fallback conversion for 4 PIPE_PRIM_QUADS vertices
Fallback conversion for 28 PIPE_PRIM_QUADS vertices
Fallback conversion for 52 PIPE_PRIM_QUADS vertices
Fallback conversion for 48 PIPE_PRIM_QUADS vertices
Fallback conversion for 472 PIPE_PRIM_QUADS vertices
Fallback conversion for 472 PIPE_PRIM_QUADS vertices
...
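To get a rough idea of how much geometry goes through this fallback path, the vertex counts in the perf output can be totalled. The sketch below assumes the messages have exactly the form shown above; `tally_fallbacks` is a hypothetical helper written for this illustration and is not part of Mesa or Kodi.

```python
import re

# Matches the VC4_DEBUG=perf message format quoted in this report.
FALLBACK_RE = re.compile(r"Fallback conversion for (\d+) (PIPE_PRIM_\w+) vertices")

def tally_fallbacks(log_text):
    """Return {primitive: (number_of_draws, total_vertices)} for a perf log."""
    totals = {}
    for count, prim in FALLBACK_RE.findall(log_text):
        draws, vertices = totals.get(prim, (0, 0))
        totals[prim] = (draws + 1, vertices + int(count))
    return totals

# The six messages quoted in the report.
sample = "\n".join(
    f"Fallback conversion for {n} PIPE_PRIM_QUADS vertices"
    for n in (4, 28, 52, 48, 472, 472)
)

print(tally_fallbacks(sample))  # {'PIPE_PRIM_QUADS': (6, 1076)}
```

Running this over a longer capture (for example, stderr redirected to a file while a video plays) would show whether the fallback happens on every frame or only occasionally.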

Kodi's debug information reports that the 720p, 50 fps video plays at only 12 fps when the TV screen resolution is 1080p.

glxgears -info -fullscreen reports 60 FPS:

xinit /usr/bin/glxgears -- :0 -nolisten tcp vt7
...
307 frames in 5.0 seconds = 61.209 FPS
300 frames in 5.0 seconds = 60.000 FPS
301 frames in 5.0 seconds = 60.000 FPS
^Cxinit: connection to X server lost

glxinfo:

direct rendering: Yes
server glx vendor string: SGI
server glx version string: 1.4
...
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: Broadcom (0x14e4)
    Device: VC4 V3D 2.1 (0xffffffff)
    Version: 17.3.9
    Accelerated: yes
    Video memory: 977MB
    Unified memory: yes
    Preferred profile: compat (0x2)
    Max core profile version: 0.0
    Max compat profile version: 2.1
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 2.0
OpenGL vendor string: Broadcom
OpenGL renderer string: VC4 V3D 2.1
OpenGL version string: 2.1 Mesa 17.3.9
OpenGL shading language version string: 1.20
...
OpenGL ES profile version string: OpenGL ES 2.0 Mesa 17.3.9
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 1.0.16

720p video at 50FPS available here: https://kichukov.home.xs4all.nl/2min.mp4 (2 minute duration)

Kodi:

01:53:48.363 T:1941131280 INFO: GL: Maximum texture width: 2048
01:53:48.363 T:1941131280 DEBUG: GLX_EXTENSIONS: GLX_ARB_create_context GLX_ARB_create_context_profile GLX_ARB_fbconfig_float GLX_ARB_framebuffer_sRGB GLX_ARB_get_proc_address GLX_ARB_multisample GLX_EXT_buffer_age GLX_EXT_create_context_es2_profile GLX_EXT_create_context_es_profile GLX_EXT_fbconfig_packed_float GLX_EXT_framebuffer_sRGB GLX_EXT_import_context GLX_EXT_texture_from_pixmap GLX_EXT_visual_info GLX_EXT_visual_rating GLX_INTEL_swap_event GLX_MESA_copy_sub_buffer GLX_MESA_multithread_makecurrent GLX_MESA_query_renderer GLX_MESA_swap_control GLX_OML_swap_method GLX_OML_sync_control GLX_SGIS_multisample GLX_SGIX_fbconfig GLX_SGIX_pbuffer GLX_SGIX_visual_select_group GLX_SGI_make_current_read GLX_SGI_swap_control GLX_SGI_video_sync
01:53:48.363 T:1941131280 NOTICE: GL_VENDOR = Broadcom
01:53:48.363 T:1941131280 NOTICE: GL_RENDERER = VC4 V3D 2.1
01:53:48.363 T:1941131280 NOTICE: GL_VERSION = 2.1 Mesa 17.3.9
01:53:48.363 T:1941131280 NOTICE: GL_SHADING_LANGUAGE_VERSION = 1.20
01:53:48.363 T:1941131280 NOTICE: GL_EXTENSIONS = GL_ARB_multisample GL_EXT_abgr GL_EXT_bgra GL_EXT_blend_color GL_EXT_blend_minmax GL_EXT_blend_subtract GL_EXT_copy_texture GL_EXT_polygon_offset GL_EXT_subtexture GL_EXT_texture_object GL_EXT_vertex_array GL_EXT_compiled_vertex_array GL_EXT_texture GL_EXT_texture3D GL_IBM_rasterpos_clip GL_ARB_point_parameters GL_EXT_draw_range_elements GL_EXT_packed_pixels GL_EXT_point_parameters GL_EXT_rescale_normal GL_EXT_separate_specular_color GL_EXT_texture_edge_clamp GL_SGIS_generate_mipmap GL_SGIS_texture_border_clamp GL_SGIS_texture_edge_clamp GL_SGIS_texture_lod GL_ARB_framebuffer_sRGB GL_ARB_multitexture GL_EXT_framebuffer_sRGB GL_IBM_multimode_draw_arrays GL_IBM_texture_mirrored_repeat GL_ARB_texture_cube_map GL_ARB_texture_env_add GL_ARB_transpose_matrix GL_EXT_blend_func_separate GL_EXT_fog_coord GL_EXT_multi_draw_arrays GL_EXT_secondary_color GL_EXT_texture_env_add GL_EXT_texture_lod_bias GL_INGR_blend_func_separate GL_NV_blend_square GL_NV_light_max_exponent GL_NV_texgen_reflection GL_NV_texture_env_combine4 GL_SUN_multi_draw_arrays GL_ARB_texture_border_clamp GL_ARB_texture_compression GL_EXT_framebuffer_object GL_EXT_texture_env_combine GL_EXT_texture_env_dot3 GL_MESA_window_pos GL_NV_packed_depth_stencil GL_NV_texture_rectangle GL_ARB_depth_texture GL_ARB_occlusion_query GL_ARB_shadow GL_ARB_texture_env_combine GL_ARB_texture_env_crossbar GL_ARB_texture_env_dot3 GL_ARB_texture_mirrored_repeat GL_ARB_window_pos GL_ATI_fragment_shader GL_EXT_stencil_two_side GL_EXT_texture_cube_map GL_NV_fog_distance GL_APPLE_packed_pixels GL_ARB_draw_buffers GL_ARB_fragment_program GL_ARB_fragment_shader GL_ARB_shader_objects GL_ARB_vertex_program GL_ARB_vertex_shader GL_ATI_draw_buffers GL_ATI_texture_env_combine3 GL_EXT_shadow_funcs GL_EXT_stencil_wrap GL_MESA_pack_invert GL_ARB_fragment_program_shadow GL_ARB_half_float_pixel GL_ARB_occlusion_query2 GL_ARB_point_sprite GL_ARB_shading_language_100 GL_ARB_sync GL_ARB_texture_non_power_of_two GL_ARB_vertex_buffer_object GL_ATI_blend_equation_separate GL_EXT_blend_equation_separate GL_OES_read_format GL_ARB_color_buffer_float GL_ARB_pixel_buffer_object GL_ARB_texture_rectangle GL_EXT_pixel_buffer_object GL_EXT_texture_rectangle GL_EXT_texture_sRGB GL_ARB_framebuffer_object GL_EXT_framebuffer_blit GL_EXT_framebuffer_multisample GL_EXT_packed_depth_stencil GL_ARB_vertex_array_object GL_ATI_separate_stencil GL_EXT_gpu_program_parameters GL_EXT_texture_sRGB_decode GL_OES_EGL_image GL_ARB_copy_buffer GL_ARB_half_float_vertex GL_ARB_map_buffer_range GL_ARB_texture_swizzle GL_EXT_texture_swizzle GL_ARB_ES2_compatibility GL_ARB_debug_output GL_ARB_draw_elements_base_vertex GL_ARB_explicit_attrib_location GL_ARB_fragment_coord_conventions GL_ARB_provoking_vertex GL_ARB_sampler_objects GL_ARB_texture_multisample GL_EXT_provoking_vertex GL_NV_texture_barrier GL_ARB_get_program_binary GL_ARB_robustness GL_ARB_separate_shader_objects GL_ARB_compressed_texture_pixel_storage GL_ARB_internalformat_query GL_ARB_map_buffer_alignment GL_ARB_texture_storage GL_EXT_framebuffer_multisample_blit_scaled GL_AMD_shader_trinary_minmax GL_ARB_clear_buffer_object GL_ARB_explicit_uniform_location GL_ARB_invalidate_subdata GL_ARB_program_interface_query GL_ARB_texture_storage_multisample GL_ARB_vertex_attrib_binding GL_KHR_debug GL_ARB_buffer_storage GL_ARB_internalformat_query2 GL_ARB_multi_bind GL_EXT_shader_integer_mix GL_ARB_get_texture_sub_image GL_ARB_texture_barrier GL_KHR_context_flush_control GL_KHR_no_error GL_MESA_tile_raster_order
01:53:48.364 T:1941131280 INFO: GL: Maximum texture width: 2048
01:53:48.732 T:1941131280 INFO: GL: Enabling VSYNC

The same issues with frame drops and high CPU utilization are observed with mpv:

mpv 2min.mp4 -vo=opengl
Playing: 2min.mp4
 (+) Video --vid=1 () (h264 1280x720 50.000fps)
 (+) Audio --aid=1 --alang=und () (aac 2ch 48000Hz)
[vo/opengl] VT_GETMODE failed: Inappropriate ioctl for device
[vo/opengl] Failed to set up VT switcher. Terminal switching will be unavailable.
AO: [alsa] 48000Hz stereo 2ch float
VO: [opengl] 1280x720 yuv420p
AV: 00:00:24 / 00:02:00 (20%) A-V: 0.000 Dropped: 722

In the first 24 seconds of playback, mpv had to drop 722 frames.
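To put that number in perspective, the drop count can be compared with the frames a 50 fps source delivers in the same time. This is only back-of-the-envelope arithmetic: it assumes the source rate is a constant 50 fps and that every dropped frame counts against it; `drop_stats` is a hypothetical helper, not an mpv API.

```python
def drop_stats(dropped, seconds, source_fps):
    """Return (fraction of frames dropped, effective displayed fps)."""
    expected = seconds * source_fps   # frames the source delivered
    shown = expected - dropped        # frames that reached the screen
    return dropped / expected, shown / seconds

# Figures from the mpv status line: 722 dropped after 24 s of a 50 fps video.
ratio, effective_fps = drop_stats(722, 24, 50.0)
print(f"{ratio:.1%} dropped, ~{effective_fps:.1f} fps shown")
# 60.2% dropped, ~19.9 fps shown
```

Under those assumptions roughly three out of every five frames never reach the screen.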

System-wide mesa driver in use:

eselect mesa show
sw gallium

eselect mesa list
i915 (Intel 915, 945)
i965 (Intel GMA 965, G/Q3x, G/Q4x, HD)
r300 (Radeon R300-R500)
r600 (Radeon R600-R700, Evergreen, Northern Islands)
sw (Software renderer)
  [1] classic
  [2] gallium *

Mesa is compiled with support for classic, dri3, egl, gallium, gbm, gles1, gles2, llvm, and the vc4 video card.

It seems that h264 decoding is not offloaded to the GPU, which would explain the CPU overload and the high frame-drop rate. The kernel in use is 4.16.10 on 32-bit Gentoo.

Thanks, -N

nkichukov commented 6 years ago

Right, this seems related to: https://github.com/anholt/linux/issues/13

Once this code makes it into raspberrypi-sources, I will be eager to test again.