VirtualGL / virtualgl

Main VirtualGL repository
https://VirtualGL.org

Latest Google Chrome 112 not working with EGL #229

Closed m1k1o closed 7 months ago

m1k1o commented 1 year ago

Hi,

Google Chrome 111.0.5563.146 works with VirtualGL 3.1. In the same environment, after upgrading to 112.0.5615.49, it no longer works.

Here is the Dockerfile with the full environment, and here is the installation of Google Chrome.

This is the error log. The "open /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq: No such file or directory (2)" errors are usually not fatal and were also displayed with earlier versions.

[88:88:0407/174925.830625:ERROR:vaapi_wrapper.cc(843)] Could not get a valid VA display
[88:88:0407/174925.830995:FATAL:gpu_init.cc(542)] Passthrough is not supported, GL is egl, ANGLE is 
[0407/174925.843119:ERROR:file_io_posix.cc(144)] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq: No such file or directory (2)
[0407/174925.843187:ERROR:file_io_posix.cc(144)] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq: No such file or directory (2)
[14:14:0407/174926.103314:ERROR:gpu_process_host.cc(942)] GPU process exited unexpectedly: exit_code=133
[320:320:0407/174926.228027:ERROR:vaapi_wrapper.cc(843)] Could not get a valid VA display
[320:320:0407/174926.228198:FATAL:gpu_init.cc(542)] Passthrough is not supported, GL is egl, ANGLE is 
[0407/174926.238762:ERROR:file_io_posix.cc(144)] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq: No such file or directory (2)
[0407/174926.238802:ERROR:file_io_posix.cc(144)] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq: No such file or directory (2)
[14:14:0407/174926.452258:ERROR:gpu_process_host.cc(942)] GPU process exited unexpectedly: exit_code=133
[415:415:0407/174926.579138:ERROR:vaapi_wrapper.cc(843)] Could not get a valid VA display
[415:415:0407/174926.579314:FATAL:gpu_init.cc(542)] Passthrough is not supported, GL is egl, ANGLE is 
[0407/174926.591117:ERROR:file_io_posix.cc(144)] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq: No such file or directory (2)
[0407/174926.591153:ERROR:file_io_posix.cc(144)] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq: No such file or directory (2)
[14:14:0407/174926.809999:ERROR:gpu_process_host.cc(942)] GPU process exited unexpectedly: exit_code=133
[434:434:0407/174926.940594:ERROR:vaapi_wrapper.cc(843)] Could not get a valid VA display
[434:434:0407/174926.940802:FATAL:gpu_init.cc(542)] Passthrough is not supported, GL is egl, ANGLE is 
[0407/174926.950920:ERROR:file_io_posix.cc(144)] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq: No such file or directory (2)
[0407/174926.950963:ERROR:file_io_posix.cc(144)] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq: No such file or directory (2)
[14:14:0407/174927.165950:ERROR:gpu_process_host.cc(942)] GPU process exited unexpectedly: exit_code=133
[436:436:0407/174927.379391:ERROR:vaapi_wrapper.cc(843)] Could not get a valid VA display
[436:436:0407/174927.379623:FATAL:gpu_init.cc(542)] Passthrough is not supported, GL is egl, ANGLE is 
[0407/174927.389678:ERROR:file_io_posix.cc(144)] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq: No such file or directory (2)
[0407/174927.389717:ERROR:file_io_posix.cc(144)] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq: No such file or directory (2)
[14:14:0407/174927.604582:ERROR:gpu_process_host.cc(942)] GPU process exited unexpectedly: exit_code=133
[444:444:0407/174927.812792:ERROR:vaapi_wrapper.cc(843)] Could not get a valid VA display
[444:444:0407/174927.813030:FATAL:gpu_init.cc(542)] Passthrough is not supported, GL is egl, ANGLE is 
[0407/174927.825447:ERROR:file_io_posix.cc(144)] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq: No such file or directory (2)
[0407/174927.825517:ERROR:file_io_posix.cc(144)] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq: No such file or directory (2)
[14:14:0407/174928.048472:ERROR:gpu_process_host.cc(942)] GPU process exited unexpectedly: exit_code=133

Related issue: https://github.com/m1k1o/neko/issues/279

dcommander commented 1 year ago

Are you using the Chrome application recipe listed in the VirtualGL User's Guide?

m1k1o commented 1 year ago

Yes, using --disable-seccomp-filter-sandbox --use-gl=egl here.

Is there something more that I might be missing?

dcommander commented 1 year ago

I can reproduce the failure, but I have no clue why it is occurring. The proximal cause is that Chromium appears to pass EGL_ROBUST_RESOURCE_INITIALIZATION_ANGLE in the eglCreateContext() attributes array without first checking whether the EGL_ANGLE_robust_resource_initialization extension is available. That is a clear application bug. I can hack around that bug in VirtualGL by filtering out the errant attribute:

--- a/server/faker-egl.cpp
+++ b/server/faker-egl.cpp
@@ -359,8 +359,23 @@ EGLBoolean eglCopyBuffers(EGLDisplay display, EGLSurface surface,
 EGLContext eglCreateContext(EGLDisplay display, EGLConfig config,
   EGLContext share_context, const EGLint *attrib_list)
 {
+  EGLint attribs[MAX_ATTRIBS + 1], j = 0;
+
+  if(attrib_list)
+  {
+    for(int i = 0; attrib_list[i] != EGL_NONE && i < MAX_ATTRIBS - 2; i += 2)
+    {
+      if(attrib_list[i] != 0x3453)
+      {
+        attribs[j++] = attrib_list[i];  attribs[j++] = attrib_list[i + 1];
+        vglout.println("0x%.4x=0x%.4x", attrib_list[i], attrib_list[i + 1]);
+      }
+    }
+  }
+  attribs[j] = EGL_NONE;
+
   WRAP_DISPLAY_INIT(EGL_NOT_INITIALIZED);
-  return _eglCreateContext(display, config, share_context, attrib_list);
+  return _eglCreateContext(display, config, share_context, attribs);
   bailout:
   return EGL_NO_CONTEXT;
 }

but hardware acceleration is still not enabled in Chrome. I spent many hours of uncompensated labor prior to VGL 3.1 figuring out how to accommodate Chrome's quirks with VirtualGL, and it's really disheartening that Google immediately invalidated all of my work. Really wish they would play more nicely with the open source community and stop expecting me to spend my free time cleaning up their messes here and in my other open source projects. (I say this in the context of also having spent about two days of uncompensated labor this week fixing a bug in libjpeg-turbo that was introduced by a feature Google submitted many years ago in support of Android, a feature that has already required me to spend many hours of uncompensated labor on probably half a dozen occasions over the years.) I'm done with this mess. If someone wants to pay for my time to track down all of this Chrome BS and file appropriate bug reports, then fine. Otherwise, it's not my job.
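
The filtering step in the patch above can also be looked at in isolation. What follows is a minimal, self-contained sketch in C, assuming only what the patch shows: the attribute list is a sequence of name/value pairs terminated by EGL_NONE, and 0x3453 is the value of the errant attribute. filter_attribs() is a hypothetical helper written for this example, not part of VirtualGL, and the constants are redefined locally so that no EGL headers are needed. Unlike the patch, it takes the output capacity as a parameter rather than relying on MAX_ATTRIBS.

```c
#include <assert.h>
#include <stddef.h>

/* Values copied from the EGL headers so that the sketch compiles without
   them.  0x3453 is the literal used in the patch; the descriptive name comes
   from the EGL_ANGLE_robust_resource_initialization extension. */
#define EGL_NONE 0x3038
#define EGL_ROBUST_RESOURCE_INITIALIZATION_ANGLE 0x3453

/* Hypothetical helper (not a VirtualGL function).  Copies name/value pairs
   from the EGL_NONE-terminated list `in` into `out`, skipping every pair
   whose name equals `unwanted`.  `out` must have room for `out_len` ints.
   The result is always EGL_NONE-terminated.  Returns the number of ints
   written before the terminator. */
size_t filter_attribs(const int *in, int unwanted, int *out, size_t out_len)
{
  size_t j = 0;

  assert(out != NULL && out_len >= 1);
  if(in)
  {
    for(size_t i = 0; in[i] != EGL_NONE; i += 2)
    {
      if(in[i] == unwanted) continue;
      /* Stop if there is no room left for this pair plus the terminator. */
      if(j + 3 > out_len) break;
      out[j++] = in[i];
      out[j++] = in[i + 1];
    }
  }
  out[j] = EGL_NONE;
  return j;
}
```

A caller would pass the filtered array, instead of the application's original list, to the real eglCreateContext(). As noted above, removing the attribute avoids the immediate error but does not by itself restore hardware acceleration in Chrome.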

mbattista commented 1 year ago

Hi,

I looked into the git diff between 111 and 112, and I want to point out a change that could be the root of this: https://chromium.googlesource.com/chromium/src/+blame/refs/tags/112.0.5615.134/gpu/ipc/service/gpu_init.cc#547 They introduced the "feature" that passthrough is no longer determined by the dedicated function. Instead, if GL is enabled, passthrough is enabled.

One of the checks for whether passthrough is available is a check for robust resource initialization: https://chromium.googlesource.com/chromium/src/+/refs/tags/112.0.5615.134/ui/gl/gl_utils.cc#110 This probably leads to the expectation that this extension is available.
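
To illustrate the kind of check under discussion: EGL reports its extensions as a space-separated string (in a real application, the string returned by eglQueryString(display, EGL_EXTENSIONS)), and an attribute that belongs to an extension should only be requested if that extension's name appears in the string as a whole token. The following is a minimal sketch in C. has_extension() is a hypothetical helper written for this example, not a function from Chromium or VirtualGL, and the extension string is passed in directly so that the example needs no EGL headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper.  Returns true if `name` appears in the space-separated
   list `ext_list` as a complete token.  A plain strstr() is not sufficient,
   because one extension name can be a prefix of another. */
bool has_extension(const char *ext_list, const char *name)
{
  assert(name != NULL);
  size_t len = strlen(name);

  if(!ext_list || len == 0) return false;

  for(const char *p = strstr(ext_list, name); p; p = strstr(p + 1, name))
  {
    bool starts_token = (p == ext_list || p[-1] == ' ');
    bool ends_token = (p[len] == ' ' || p[len] == '\0');
    if(starts_token && ends_token) return true;
  }
  return false;
}
```

With a check like this, the caller would add the robust-resource-initialization attribute to the eglCreateContext() attribute list only when has_extension() returns true for EGL_ANGLE_robust_resource_initialization.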

This seems to be the new intended behavior, since ANGLE support is widespread enough for Google's purposes. https://bugs.chromium.org/p/chromium/issues/detail?id=976283 https://bugs.chromium.org/p/chromium/issues/detail?id=1406585

It would be interesting to see whether changing that default of "true" would bring back the old behavior, or whether even more things would have to be changed.

Would it be possible to virtualize ANGLE in a similar fashion to what has been done with EGL?

dcommander commented 1 year ago

ANGLE is supposed to basically translate OpenGL ES calls on the front end to various types of back-end calls, including desktop OpenGL, Direct3D, and Vulkan. Thus, I'm not sure if there is an actual ANGLE API to interpose or if it would make sense to do so rather than to interpose the lower-level API, as VirtualGL already does. However, as previously mentioned, Chrome's OpenGL ES interface (which I presume is related to ANGLE somehow) does something really stupid vis-a-vis unnecessarily enforcing a 1:1 relationship between EGLConfigs and X visuals. That's why you have to pass --use-gl=egl to make prior versions of Chrome work with VirtualGL. It may be that 112 would work fine with VGL if it weren't for that issue.

Regardless, unless someone wants to do most of the legwork to figure out how to solve the problem, or unless someone wants to pay me to do that legwork, the chances of it happening any time soon are slim. I've spent a great deal of unpaid hours on this problem already, and Google just instantly invalidated all of that work. If they aren't going to at least try to play nice, then I see no compelling reason to spend my free time supporting their application.

greg321321 commented 1 year ago

I also verified that VirtualGL 3.1 + Google Chrome 113 + TurboVNC does not work.

I can see that the GPU is used (verified with the intel_gpu_top utility from the GPU manufacturer) when I run:

export DISPLAY=:101
/opt/TurboVNC/bin/vncserver $DISPLAY -geometry 1920x1080
vglrun -d egl /opt/VirtualGL/bin/eglxspheres64

So I think I have the basic VNC + VirtualGL + GPU setup working. I will revert Google Chrome to v111, enable vglrun +tr, and try again. I saw some comments in another post about experimenting with a vglrun ANGLE option?

Setup:
OS: Ubuntu 20.04
VirtualGL: 3.1
Google Chrome: 113
Hardware: Intel UHD Graphics 630 (i915 driver)
TurboVNC: 3.0.3

Error details:

root@216:/opt/google/chrome# vglrun -d egl ./chrome --disable-seccomp-filter-sandbox --use-gl=egl --no-sandbox
[10580:10580:0520/111632.563579:FATAL:gpu_init.cc(497)] Passthrough is not supported, GL is egl, ANGLE is
[10540:10540:0520/111632.653508:ERROR:gpu_process_host.cc(953)] GPU process exited unexpectedly: exit_code=133
[10694:10694:0520/111632.699338:FATAL:gpu_init.cc(497)] Passthrough is not supported, GL is egl, ANGLE is
[10540:10540:0520/111632.780520:ERROR:gpu_process_host.cc(953)] GPU process exited unexpectedly: exit_code=133
[10709:10709:0520/111632.825763:FATAL:gpu_init.cc(497)] Passthrough is not supported, GL is egl, ANGLE is
[10540:10540:0520/111632.905103:ERROR:gpu_process_host.cc(953)] GPU process exited unexpectedly: exit_code=133
[10724:10724:0520/111632.947800:ERROR:gl_display.cc(504)] EGL Driver message (Error) eglCreateContext: eglCreateContext
[10724:10724:0520/111632.947902:ERROR:gl_context_egl.cc(383)] eglCreateContext failed with error EGL_SUCCESS
[10724:10724:0520/111632.948409:FATAL:gpu_init.cc(497)] Passthrough is not supported, GL is angle, ANGLE is swiftshader
[10540:10540:0520/111633.027215:ERROR:gpu_process_host.cc(953)] GPU process exited unexpectedly: exit_code=133
[10739:10739:0520/111633.070309:ERROR:gl_display.cc(504)] EGL Driver message (Error) eglCreateContext: eglCreateContext
[10739:10739:0520/111633.070409:ERROR:gl_context_egl.cc(383)] eglCreateContext failed with error EGL_SUCCESS
[10739:10739:0520/111633.071029:FATAL:gpu_init.cc(497)] Passthrough is not supported, GL is angle, ANGLE is swiftshader
[10540:10540:0520/111633.149538:ERROR:gpu_process_host.cc(953)] GPU process exited unexpectedly: exit_code=133
[10754:10754:0520/111633.192040:ERROR:gl_display.cc(504)] EGL Driver message (Error) eglCreateContext: eglCreateContext
[10754:10754:0520/111633.192139:ERROR:gl_context_egl.cc(383)] eglCreateContext failed with error EGL_SUCCESS
[10754:10754:0520/111633.192967:FATAL:gpu_init.cc(497)] Passthrough is not supported, GL is angle, ANGLE is swiftshader
[10540:10540:0520/111633.272997:ERROR:gpu_process_host.cc(953)] GPU process exited unexpectedly: exit_code=133
[10540:10564:0520/111633.393787:ERROR:object_proxy.cc(623)] Failed to call method: org.freedesktop.DBus.StartServiceByName: object_path= /org/freedesktop/DBus: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.

greg321321 commented 1 year ago

Reverting to Google Chrome v111 works a little better. I am now able to use Chrome with a 3D game (us.forgeofempires.com), which previously would not load. When the game is running in the browser, Chrome and the gameplay become very laggy and slow. I will look into it a bit more. It is encouraging, though, to see that the CPU is hardly used to run the game, when previously the game was taking 95% of the CPU!

At a minimum, could the VirtualGL application recipe for Google Chrome have a note attached? Something simple like 'use version 111 of Chrome if having problems with EGL mode'.

Details: Helpful instructions for reverting Chrome on Ubuntu:

Install Chrome

Instructions: https://makandracards.com/makandra/486433-how-to-downgrade-google-chrome-in-ubuntu

VERSION_STRING="111.0.5563.64-1" # Replace this value with the one you copied earlier

wget "https://dl.google.com/linux/chrome/deb/pool/main/g/google-chrome-stable/google-chrome-stable_${VERSION_STRING}_amd64.deb"

dpkg -i "google-chrome-stable_${VERSION_STRING}_amd64.deb"

Command output:

root@216:/opt/google/chrome# vglrun -d egl ./chrome --disable-seccomp-filter-sandbox --use-gl=egl --no-sandbox
[12129:12129:0520/113018.067313:ERROR:gpu_init.cc(525)] Passthrough is not supported, GL is egl, ANGLE is
[12129:12129:0520/113018.074374:ERROR:gpu_memory_buffer_support_x11.cc(49)] dri3 extension not supported.
[12087:12111:0520/113018.894573:ERROR:object_proxy.cc(623)] Failed to call method: org.freedesktop.DBus.StartServiceByName: object_path= /org/freedesktop/DBus: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
[12087:12087:0520/113041.779784:ERROR:gpu_process_host.cc(952)] GPU process exited unexpectedly: exit_code=134
[12346:12346:0520/113041.847830:ERROR:gpu_init.cc(525)] Passthrough is not supported, GL is egl, ANGLE is
[12346:12346:0520/113041.853172:ERROR:gpu_memory_buffer_support_x11.cc(49)] dri3 extension not supported.
[12329:12339:0520/113100.076883:ERROR:command_buffer_proxy_impl.cc(128)] ContextResult::kTransientFailure: Failed to send GpuControl.CreateCommandBuffer.
[12087:12087:0520/113224.991446:ERROR:page_load_metrics_update_dispatcher.cc(178)] Invalid first_paint (unset) for first_image_paint 0.899 s
[12087:12087:0520/113225.074695:ERROR:page_load_metrics_update_dispatcher.cc(178)] Invalid first_paint (unset) for first_image_paint 0.899 s
[12087:12087:0520/113225.586749:ERROR:page_load_metrics_update_dispatcher.cc(178)] Invalid first_paint (unset) for first_image_paint 0.899 s
[12087:12087:0520/113225.794688:ERROR:page_load_metrics_update_dispatcher.cc(178)] Invalid first_paint (unset) for first_image_paint 0.899 s
[12087:12087:0520/113226.066635:ERROR:page_load_metrics_update_dispatcher.cc(178)] Invalid first_paint (unset) for first_image_paint 0.899 s
[12087:12087:0520/113229.362638:ERROR:page_load_metrics_update_dispatcher.cc(178)] Invalid first_paint (unset) for first_image_paint 0.899 s

My Chrome v111 chrome://gpu output:

Graphics Feature Status
Canvas: Hardware accelerated
Canvas out-of-process rasterization: Disabled
Direct Rendering Display Compositor: Disabled
Compositing: Hardware accelerated
Multiple Raster Threads: Enabled
OpenGL: Enabled
Rasterization: Hardware accelerated
Raw Draw: Disabled
Video Decode: Hardware accelerated
Video Encode: Software only. Hardware acceleration disabled
Vulkan: Disabled
WebGL: Hardware accelerated
WebGL2: Hardware accelerated
WebGPU: Disabled

Driver Bug Workarounds
adjust_src_dst_region_for_blitframebuffer
clear_uniforms_before_first_program_use
count_all_in_varyings_packing
enable_webgl_timer_query_extensions
exit_on_context_lost
msaa_is_slow
msaa_is_slow_2
rely_on_implicit_sync_for_swap_buffers
disabled_extension_GL_KHR_blend_equation_advanced
disabled_extension_GL_KHR_blend_equation_advanced_coherent

Problems Detected
WebGPU has been disabled via blocklist or the command line.
Disabled Features: webgpu
Accelerated video encode has been disabled, either via blocklist, about:flags or the command line.
Disabled Features: video_encode
Clear uniforms before first program use on all platforms: 124764, 349137
Applied Workarounds: clear_uniforms_before_first_program_use
Mesa drivers in Linux handle varyings without static use incorrectly: 333885
Applied Workarounds: count_all_in_varyings_packing
On Intel GPUs MSAA performance is not acceptable for GPU rasterization: 527565, 1298585
Applied Workarounds: msaa_is_slow
adjust src/dst region if blitting pixels outside framebuffer on Linux Intel: 664740
Applied Workarounds: adjust_src_dst_region_for_blitframebuffer
Disable KHR_blend_equation_advanced until cc shaders are updated: 661715
Applied Workarounds: disable(GL_KHR_blend_equation_advanced), disable(GL_KHR_blend_equation_advanced_coherent)
Expose WebGL's disjoint_timer_query extensions on platforms with site isolation: 808744, 870491
Applied Workarounds: enable_webgl_timer_query_extensions
Some drivers can't recover after OUT_OF_MEM and context lost: 893177
Applied Workarounds: exit_on_context_lost
Avoid waiting on a egl fence before swapping buffers and rely on implicit sync on Intel GPUs: 938286
Applied Workarounds: rely_on_implicit_sync_for_swap_buffers
On pre-Ice Lake Intel GPUs MSAA performance is not acceptable for GPU rasterization: 527565, 1298585, 1341830
Applied Workarounds: msaa_is_slow_2

Version Information
Data exported: 2023-05-20T15:40:36.630Z
Chrome version: Chrome/111.0.5563.64
Operating system: Linux 5.4.0-26-generic
Software rendering list URL: https://chromium.googlesource.com/chromium/src/+/c710e93d5b63b7095afe8c2c17df34408078439d/gpu/config/software_rendering_list.json
Driver bug list URL: https://chromium.googlesource.com/chromium/src/+/c710e93d5b63b7095afe8c2c17df34408078439d/gpu/config/gpu_driver_bug_list.json
ANGLE commit id: cd45d155bf4c
2D graphics backend: Skia/111 59932b057f281ddaeb0926ecfac55486270f8c51
Command Line: ./chrome --disable-seccomp-filter-sandbox --use-gl=egl --no-sandbox --flag-switches-begin --flag-switches-end

Driver Information
Initialization time: 67
In-process GPU: false
Passthrough Command Decoder: false
Sandboxed: false
GPU0: VENDOR= 0x8086 [Intel], DEVICE=0x3e92 [Mesa Intel(R) UHD Graphics 630 (CFL GT2)], DRIVER_VENDOR=Mesa, DRIVER_VERSION=20.0.4 ACTIVE
Optimus: false
AMD switchable: false
GPU CUDA compute capability major version: 0
Pixel shader version: 3.20
Vertex shader version: 3.20
Max. MSAA samples: 16
Machine model name:
Machine model version:
GL_VENDOR: Intel
GL_RENDERER: Mesa Intel(R) UHD Graphics 630 (CFL GT2)
GL_VERSION: OpenGL ES 3.2 Mesa 20.0.4
GL_EXTENSIONS: GL_EXT_blend_minmax GL_EXT_multi_draw_arrays GL_EXT_texture_filter_anisotropic GL_EXT_texture_compression_s3tc GL_EXT_texture_compression_dxt1 GL_EXT_texture_compression_rgtc GL_EXT_texture_format_BGRA8888 GL_OES_compressed_ETC1_RGB8_texture GL_OES_depth24 GL_OES_element_index_uint GL_OES_fbo_render_mipmap GL_OES_mapbuffer GL_OES_rgb8_rgba8 GL_OES_standard_derivatives GL_OES_stencil8 GL_OES_texture_3D GL_OES_texture_float GL_OES_texture_float_linear GL_OES_texture_half_float GL_OES_texture_half_float_linear GL_OES_texture_npot GL_OES_vertex_half_float GL_EXT_texture_sRGB_decode GL_OES_EGL_image GL_OES_depth_texture GL_AMD_performance_monitor GL_OES_packed_depth_stencil GL_EXT_texture_type_2_10_10_10_REV GL_NV_conditional_render GL_OES_get_program_binary GL_APPLE_texture_max_level GL_EXT_discard_framebuffer GL_EXT_read_format_bgra GL_EXT_frag_depth GL_NV_fbo_color_attachments GL_OES_EGL_image_external GL_OES_EGL_sync GL_OES_vertex_array_object GL_OES_viewport_array GL_ANGLE_texture_compression_dxt3 GL_ANGLE_texture_compression_dxt5 GL_EXT_occlusion_query_boolean GL_EXT_robustness GL_EXT_texture_rg GL_EXT_unpack_subimage GL_NV_draw_buffers GL_NV_read_buffer GL_NV_read_depth GL_NV_read_depth_stencil GL_NV_read_stencil GL_EXT_draw_buffers GL_EXT_map_buffer_range GL_KHR_debug GL_KHR_robustness GL_KHR_texture_compression_astc_ldr GL_OES_depth_texture_cube_map GL_OES_required_internalformat GL_OES_surfaceless_context GL_EXT_color_buffer_float GL_EXT_sRGB_write_control GL_EXT_separate_shader_objects GL_EXT_shader_framebuffer_fetch GL_EXT_shader_implicit_conversions GL_EXT_shader_integer_mix GL_EXT_tessellation_point_size GL_EXT_tessellation_shader GL_INTEL_conservative_rasterization GL_INTEL_performance_query GL_ANDROID_extension_pack_es31a GL_EXT_base_instance GL_EXT_compressed_ETC1_RGB8_sub_texture GL_EXT_copy_image GL_EXT_draw_buffers_indexed GL_EXT_draw_elements_base_vertex GL_EXT_gpu_shader5 GL_EXT_polygon_offset_clamp GL_EXT_primitive_bounding_box GL_EXT_render_snorm GL_EXT_shader_io_blocks GL_EXT_texture_border_clamp GL_EXT_texture_buffer GL_EXT_texture_cube_map_array GL_EXT_texture_norm16 GL_EXT_texture_view GL_KHR_blend_equation_advanced GL_KHR_blend_equation_advanced_coherent GL_KHR_context_flush_control GL_KHR_robust_buffer_access_behavior GL_NV_image_formats GL_OES_copy_image GL_OES_draw_buffers_indexed GL_OES_draw_elements_base_vertex GL_OES_gpu_shader5 GL_OES_primitive_bounding_box GL_OES_sample_shading GL_OES_sample_variables GL_OES_shader_io_blocks GL_OES_shader_multisample_interpolation GL_OES_tessellation_point_size GL_OES_tessellation_shader GL_OES_texture_border_clamp GL_OES_texture_buffer GL_OES_texture_cube_map_array GL_OES_texture_stencil8 GL_OES_texture_storage_multisample_2d_array GL_OES_texture_view GL_EXT_blend_func_extended GL_EXT_buffer_storage GL_EXT_float_blend GL_EXT_geometry_point_size GL_EXT_geometry_shader GL_EXT_shader_samples_identical GL_KHR_no_error GL_KHR_texture_compression_astc_sliced_3d GL_NV_fragment_shader_interlock GL_OES_EGL_image_external_essl3 GL_OES_geometry_point_size GL_OES_geometry_shader GL_OES_shader_image_atomic GL_EXT_clip_cull_distance GL_EXT_disjoint_timer_query GL_EXT_texture_compression_s3tc_srgb GL_MESA_shader_integer_functions GL_EXT_clip_control GL_EXT_texture_compression_bptc GL_KHR_parallel_shader_compile GL_EXT_EGL_image_storage GL_EXT_shader_framebuffer_fetch_non_coherent GL_EXT_texture_sRGB_R8 GL_EXT_texture_shadow_lod GL_MESA_framebuffer_flip_y GL_NV_compute_shader_derivatives GL_EXT_demote_to_helper_invocation GL_EXT_depth_clamp GL_EXT_texture_query_lod
Disabled Extensions: GL_KHR_blend_equation_advanced GL_KHR_blend_equation_advanced_coherent
Disabled WebGL Extensions:
Window system binding vendor: VirtualGL
Window system binding version: 1.5
Window system binding extensions: EGL_EXT_create_context_robustness EGL_EXT_image_dma_buf_import EGL_EXT_pixel_format_float EGL_IMG_context_priority EGL_KHR_cl_event EGL_KHR_cl_event2 EGL_KHR_config_attribs EGL_KHR_create_context EGL_KHR_create_context_no_error EGL_KHR_fence_sync EGL_KHR_get_all_proc_addresses EGL_KHR_gl_colorspace EGL_KHR_gl_renderbuffer_image EGL_KHR_gl_texture_2D_image EGL_KHR_gl_texture_3D_image EGL_KHR_gl_texture_cubemap_image EGL_KHR_image EGL_KHR_image_base EGL_KHR_no_config_context EGL_KHR_reusable_sync EGL_KHR_surfaceless_context EGL_KHR_wait_sync
XDG_SESSION_TYPE: tty
Ozone platform: x11
Direct rendering version: unknown
Reset notification strategy: 0x8252
GPU process crash count: 1
gfx::BufferFormats supported for allocation and texturing: R_8: not supported, R_16: not supported, RG_88: not supported, RG_1616: not supported, BGR_565: not supported, RGBA_4444: not supported, RGBX_8888: not supported, RGBA_8888: not supported, BGRX_8888: not supported, BGRA_1010102: not supported, RGBA_1010102: not supported, BGRA_8888: not supported, RGBA_F16: not supported, YVU_420: not supported, YUV_420_BIPLANAR: not supported, YUVA_420_TRIPLANAR: not supported, P010: not supported

Compositor Information
Tile Update Mode: One-copy
Partial Raster: Enabled

GpuMemoryBuffers Status
R_8: Software only
R_16: Software only
RG_88: Software only
RG_1616: Software only
BGR_565: Software only
RGBA_4444: Software only
RGBX_8888: Software only
RGBA_8888: Software only
BGRX_8888: Software only
BGRA_1010102: Software only
RGBA_1010102: Software only
BGRA_8888: Software only
RGBA_F16: Software only
YVU_420: Software only
YUV_420_BIPLANAR: Software only
YUVA_420_TRIPLANAR: Software only
P010: Software only

Display(s) Information
Info: Display[64] bounds=[0,0 1920x1080], workarea=[72,27 1848x1053], scale=1, rotation=0, panel_rotation=0 external.
Color space (all): {primaries:BT709, transfer:SRGB, matrix:RGB, range:FULL}
Buffer format (all): BGRA_8888
Color volume: {name:'srgb', r:[0.6400, 0.3300], g:[0.3000, 0.6000], b:[0.1500, 0.3300], w:[0.3127, 0.3290]}
SDR white level in nits: 203
HDR relative maximum luminance: 1
Bits per color component: 8
Bits per pixel: 24
Refresh Rate in Hz: 60

Video Acceleration Information
Decoding
Encoding

Vulkan Information

Device Performance Information

Log Messages
[12129:12129:0520/113018.067313:ERROR:gpu_init.cc(525)] : Passthrough is not supported, GL is egl, ANGLE is
[12129:12129:0520/113018.070599:WARNING:sandbox_linux.cc(393)] : InitializeSandbox() called with multiple threads in process gpu-process.
[12129:12129:0520/113018.074374:ERROR:gpu_memory_buffer_support_x11.cc(49)] : dri3 extension not supported.
GpuProcessHost: The GPU process crashed!
[12346:12346:0520/113041.847830:ERROR:gpu_init.cc(525)] : Passthrough is not supported, GL is egl, ANGLE is
[12346:12346:0520/113041.850345:WARNING:sandbox_linux.cc(393)] : InitializeSandbox() called with multiple threads in process gpu-process.
[12346:12346:0520/113041.853172:ERROR:gpu_memory_buffer_support_x11.cc(49)] : dri3 extension not supported.

dcommander commented 1 year ago

I have been trying, without success, to diagnose this issue. I have gone so far as to build ANGLE from source, in an attempt to reproduce the issue at that level, but the issue does not reproduce. As near as I can figure:

I don't know what else to try at this point.

ehfd commented 8 months ago

https://developer.chrome.com/blog/supercharge-web-ai-testing?hl=en#enable-webgpu https://chromium.googlesource.com/angle/angle/+/HEAD/src/libANGLE/renderer/vulkan/README.md

Possible workaround methods (known as ANGLE through Vulkan).

dcommander commented 8 months ago

@ehfd That may work with nVidia's Vulkan drivers, which do something VirtualGL-like if they detect an X proxy, but it won't work with multiple GPUs or other GPU drivers. So this still needs to be fixed somehow for the general case.

ehfd commented 8 months ago

True. But at least there's something. I've been using Vulkan with VirtualGL on NVIDIA GPUs with Xvfb, via the container toolkit, for several years.

dcommander commented 7 months ago

There are multiple issues at work here:

  1. Contrary to what I said above, --use-gl=egl is still a valid option. It's just that the GPU process now crashes when you use that option, for unknown reasons, and that causes Chrome to fall back to using ANGLE.
  2. ANGLE defaults to using its Vulkan back end, which may or may not be GPU-accelerated in a remote display environment. Thus, it is necessary to pass --use-angle=gl-egl rather than --use-gl=egl to Chrome. That causes ANGLE to use its EGL/desktop OpenGL back end.
  3. When using ANGLE, VirtualGL's dlopen() interposer interferes with the loading of ANGLE's EGL interposer. Thus, it is necessary to modify VirtualGL's dlopen() interposer to be more strict about matching EGL libraries:
    --- a/server/dlfaker.c
    +++ b/server/dlfaker.c
    @@ -129,7 +129,7 @@ void *dlopen(const char *filename, int flag)
             || (strstr(filename, "/libOpenCL.") && fakeOpenCL)
           #endif
           #ifdef EGLBACKEND
    -      || !strncmp(filename, "libEGL.", 7) || strstr(filename, "/libEGL.")
    +      || !strncmp(filename, "libEGL.so.1", 11) || strstr(filename, "/libEGL.so.1")
           #endif
           || !strncmp(filename, "libX11.", 7) || strstr(filename, "/libX11.")
           || (flag & RTLD_LAZY

    This would have to somehow be made optional, perhaps through a new environment variable (VGL_ANGLE.)

  4. When using ANGLE, Chrome uses GLX to pick an ARGB visual for the Chrome window, even though it is not using GLX for anything else. It tries to find a monographic double-buffered depth=32 visual with an 8-bit alpha channel, no depth buffer, no stencil buffer, and no multisampling. However, Chrome uses raw GLX protocol requests/replies to obtain the OpenGL properties for 2D X server visuals. (By "raw", I don't mean XCB. I mean manually constructing/sending the X11 requests and manually constructing/receiving the X11 replies.) Thus, there is nothing for VirtualGL to interpose, and VirtualGL has no way of controlling which visual Chrome picks. That visual is almost certainly going to be different from the default depth=32 visual returned by XGetVisualInfo(). Chrome then searches for an EGLConfig for which eglGetConfigAttrib(..., EGL_NATIVE_VISUAL_ID, ...) returns the visual that it found using GLX.
  5. I was able to modify VirtualGL and make it use a similar visual picking algorithm when associating a 2D X server visual with a 32-bit EGLConfig. With that modification, I can make GPU acceleration work in Chrome by using the documented application recipe with the argument substitution mentioned above. However, it only works if VGL_PROBEGLX is set to 1, which isn't the default when using TurboVNC or another X proxy.
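
The effect of the stricter filename matching in item 3 can be shown with a small stand-alone sketch in C. The two helpers below are hypothetical and written only for this example; they reproduce the strncmp()/strstr() logic of the dlfaker.c patch with the prefix passed in as a parameter. The path /opt/google/chrome/libEGL.so is an assumption about where Chrome's bundled ANGLE library lives and what it is called.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper.  True if `filename` starts with `prefix`, or if any
   path component of `filename` starts with `prefix`.  This is equivalent to
   !strncmp(filename, prefix, n) || strstr(filename, "/" prefix). */
bool matches_lib(const char *filename, const char *prefix)
{
  assert(filename != NULL && prefix != NULL);
  size_t n = strlen(prefix);

  if(!strncmp(filename, prefix, n)) return true;
  for(const char *p = strchr(filename, '/'); p; p = strchr(p + 1, '/'))
  {
    if(!strncmp(p + 1, prefix, n)) return true;
  }
  return false;
}

/* Hypothetical helper.  With strict == false, this mimics the original
   "libEGL." match.  With strict == true, it mimics the patched "libEGL.so.1"
   match, which no longer catches a library named plain libEGL.so. */
bool is_system_egl(const char *filename, bool strict)
{
  return matches_lib(filename, strict ? "libEGL.so.1" : "libEGL.");
}
```

Under these assumptions, the loose match treats both the system library and the bundled one as EGL libraries to be interposed, whereas the strict match only accepts names such as libEGL.so.1 and leaves the bundled library alone.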

tl;dr: If I can't find a way to make --use-gl=egl work again, then the alternative is going to be really messy. At this point, I've probably spent well over 50 hours on this, none of them compensated. I can't spend much more time on it without funding or assistance.
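
The visual criteria listed in item 4 can be written down as a simple predicate. The following C sketch is illustrative only: the VisualProps structure and pick_argb_visual() are hypothetical, the field names are not taken from Chrome or VirtualGL, and returning the first match is an assumption, since the thread does not say how Chrome orders candidate visuals.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of the OpenGL properties of one 2D X server visual. */
typedef struct
{
  unsigned long visual_id;
  int depth;          /* X visual depth in bits */
  int alpha_size;     /* bits in the alpha channel */
  int double_buffer;  /* nonzero if double-buffered */
  int stereo;         /* nonzero if stereographic */
  int depth_size;     /* depth buffer bits */
  int stencil_size;   /* stencil buffer bits */
  int samples;        /* multisampling sample count */
} VisualProps;

/* Returns the first visual that is monographic, double-buffered, depth=32,
   with an 8-bit alpha channel and no depth buffer, stencil buffer, or
   multisampling.  Returns NULL if no visual qualifies. */
const VisualProps *pick_argb_visual(const VisualProps *v, size_t n)
{
  assert(v != NULL || n == 0);
  for(size_t i = 0; i < n; i++)
  {
    if(v[i].depth == 32 && v[i].alpha_size == 8 && v[i].double_buffer
      && !v[i].stereo && v[i].depth_size == 0 && v[i].stencil_size == 0
      && v[i].samples == 0)
      return &v[i];
  }
  return NULL;
}
```

The point of the sketch is that the result depends entirely on per-visual OpenGL properties. If the application obtains those properties without going through any function that VirtualGL interposes, VirtualGL cannot steer which visual ID comes out, so the EGLConfig side has to be made to report that same visual ID instead.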

dcommander commented 7 months ago

If --use-gl=egl is specified, then Chrome v112+ fails in the same way on the local display without VirtualGL, and the various ANGLE options are only partially accelerated on the local display without VirtualGL. Thus, --use-gl=egl appears to be a dead end.

I am attaching the hacks that I made to VirtualGL in order to make it work with Chrome/ANGLE. These should apply cleanly against d950b031bc66bca3a47d569c385d9d1b89d4140d.

When I use the aforementioned hacks, I can make

VGL_PROBEGLX=1 vglrun google-chrome --disable-seccomp-filter-sandbox --use-angle=gl

work with VirtualGL's GLX back end (but not with VirtualGL's EGL back end, perhaps because ANGLE needs some GLX extensions that the EGL back end doesn't currently emulate.)

When I use the aforementioned hacks, I can make

VGL_PROBEGLX=1 vglrun -d egl google-chrome --disable-seccomp-filter-sandbox --use-angle=gl-egl

work with VirtualGL's EGL/X11 front end and EGL back end, but the WebGL performance is awful, and WebGL sometimes freezes the browser.

I feel like I've done more than due diligence on this issue and am prepared to say that this is something that needs to be resolved in Chrome.

dcommander commented 7 months ago

Further notes:

Conclusion: Google needs to

  1. fix either --use-gl=egl or --use-angle=gl-egl, and
  2. obtain their main window visual using a less brain-dead technique (or at least a technique that VirtualGL can interpose.)

I can work around (2) but not (1).

dcommander commented 7 months ago

Apparently --use-gl=egl is no longer supported, per Google, because Chrome uses ANGLE for core UI functionality these days, not just for WebGL.

dcommander commented 7 months ago

I went ahead and committed the aforementioned hacks, so you should now be able to run

VGL_CHROMEHACK=1 vglrun [-d egl] google-chrome --disable-seccomp-filter-sandbox --use-angle=gl-egl

I still observe poor performance or rendering pipeline stalls intermittently on certain configurations, particularly with nVidia GPUs, so I am not documenting this as a solution yet. Those performance issues also occur on the local display without VirtualGL.

dcommander commented 7 months ago

https://issues.chromium.org/issues/326752458
https://issues.chromium.org/issues/326752457

ehfd commented 7 months ago

Thank you for your work on profiling everything. I hope Chromium fixes ANGLE to make it more friendly to VirtualGL.

sairuk commented 7 months ago

I've just installed VirtualGL and am messing around with it.

In my x2go session with virtualgl 3.1-20230315, chromium v120, and an Intel UHD 620 rev 07, I am currently running with VGL_LOGO=1 vglrun -d egl /bin/chromium --use-gl=egl --in-process-gpu. Results are below.

Canvas: Hardware accelerated
Canvas out-of-process rasterization: Disabled
Direct Rendering Display Compositor: Disabled
Compositing: Hardware accelerated
Multiple Raster Threads: Enabled
OpenGL: Enabled
Rasterization: Hardware accelerated on all pages
Raw Draw: Disabled
Skia Graphite: Disabled
Video Decode: Hardware accelerated
Video Encode: Software only. Hardware acceleration disabled
Vulkan: Disabled
WebGL: Hardware accelerated
WebGL2: Hardware accelerated
WebGPU: Disabled

Notes:

dcommander commented 7 months ago

Refer to comment above. You need the latest pre-release build of VirtualGL in order to use --use-angle=gl-egl, and you have to do more than just passing that argument. I have not tried --in-process-gpu. Will give that a try tomorrow and see how (or if) it changes the story, but my understanding from Google was that --use-gl=egl is no longer supported.

sairuk commented 7 months ago

Yes, good point. I should've stipulated that, since I'm not running a build from after commit 5a91ca0, the ANGLE failure was expected.

I haven't found much documentation on --in-process-gpu specifically, although from some comments it appears that, at a minimum, it disables the sandbox by default, I assume because the GPU code runs in the main process.

This is a fairly decent cli reference however https://peter.sh/experiments/chromium-command-line-switches/#in-process-gpu

dcommander commented 7 months ago

That's probably a nicer solution than --disable-seccomp-filter-sandbox, since it will apply only to the GPU and do exactly what VirtualGL expects and nothing else. I will test it and let you know how it compares to my ANGLE hack above.

dcommander commented 7 months ago

I can confirm that --in-process-gpu allows --use-gl=egl to work with Chrome v112 and later. Google is making noises like --use-gl=egl will eventually go away, but hopefully this at least buys us time for the ANGLE issues to be resolved on their end.
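As a rough sketch of the combination confirmed here, the following shell function assembles a launch command that passes --use-gl=egl together with --in-process-gpu. The function name chrome_vgl_cmd and its two arguments are hypothetical, chosen only for illustration; the vglrun and Chrome flags are the ones quoted in this thread. It prints the command rather than running it, so the result can be checked first.

```shell
#!/bin/sh
# Hypothetical helper (not part of VirtualGL or Chrome): prints a launch
# command using the flags discussed in this thread instead of executing it.
chrome_vgl_cmd() {
    # $1: browser binary (assumed default for this sketch: google-chrome)
    # $2: optional device for "vglrun -d" (e.g. egl for the EGL back end)
    browser="${1:-google-chrome}"
    device="${2:-}"
    if [ -n "$device" ]; then
        printf 'vglrun -d %s %s --use-gl=egl --in-process-gpu\n' \
            "$device" "$browser"
    else
        printf 'vglrun %s --use-gl=egl --in-process-gpu\n' "$browser"
    fi
}

# Example: show the command for the EGL back end.
chrome_vgl_cmd google-chrome egl
```

Printing the command keeps the sketch side-effect free; once the output looks right, it can be pasted into a terminal or run from a wrapper script.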

ehfd commented 3 months ago

It seems that Chrome 126 with --in-process-gpu but without --use-gl=egl starts working?

VirtualGL: about-gpu.txt

Direct X11 GLX and EGL: about-gpu.txt

dcommander commented 3 months ago

I literally said exactly that in the comment above, and the Chrome application recipe now indicates as such.

ehfd commented 3 months ago

Seems like it doesn't even need --use-gl=egl.

dcommander commented 3 months ago

It does in the general case. Not specifying --use-gl=egl causes Chrome to use ANGLE. I encourage you to read the comments above in which I go into great detail about the current issues with VGL and ANGLE. There is an active bug report with the Chrome developers regarding addressing some of those issues.

dcommander commented 3 months ago

Correction: I'm refreshing my memory as I type. If you don't specify --use-gl=egl, then Chrome uses Vulkan. Vulkan may be GPU-accelerated in a remote display environment if you're using an nVidia GPU, but there is no facility for GPU selection if you have multiple GPUs. Vulkan will not work in a remote display environment if you're using a GPU with Mesa-based drivers (e.g. AMD.)

I spent a great deal of time dissecting this issue. Please do not attempt to override my conclusions unless you have spent the same amount of time.
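The conditions in the correction above can be written down as a small decision sketch. The function name vulkan_default_ok and the vendor labels "nvidia" and "mesa" are assumptions made for this illustration, not values reported by any tool; the logic only restates the comment: without --use-gl=egl Chrome uses Vulkan, which may be accelerated remotely on a single nVidia GPU, offers no GPU selection with multiple GPUs, and does not work remotely with Mesa-based drivers.

```shell
#!/bin/sh
# Hypothetical helper restating the conditions from the comment above.
# Exit status 0: Chrome's Vulkan default may be GPU-accelerated remotely.
# Exit status 1: pass --use-gl=egl explicitly instead.
vulkan_default_ok() {
    vendor="$1"      # assumed labels: "nvidia" or "mesa" (e.g. AMD)
    gpu_count="$2"   # number of GPUs in the machine
    case "$vendor" in
        nvidia)
            # Vulkan offers no facility for GPU selection, so only a
            # single-GPU machine is treated as safe here.
            [ "$gpu_count" -eq 1 ]
            ;;
        *)
            # Mesa-based drivers: Vulkan will not work in a remote
            # display environment.
            return 1
            ;;
    esac
}

if vulkan_default_ok nvidia 2; then
    echo "Vulkan default may be acceptable"
else
    echo "use --use-gl=egl"
fi
```

Even when the function succeeds, the comment only says that Vulkan "may be" accelerated, so specifying --use-gl=egl remains the general-case recommendation.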

ehfd commented 3 months ago

Chrome uses Vulkan. Vulkan may be GPU-accelerated in a remote display environment if you're using an nVidia GPU, but there is no facility for GPU selection if you have multiple GPUs.

This clears up a lot of things that I was puzzled about, and is probably what I needed to be aware of. Thank you.