Open urdeveloper opened 1 year ago
These kinds of issues are highly dependent on hardware/OS and very hard to reproduce. Do you mind sharing your machine information from chrome://gpu?
Thanks for looking into this. I completely agree that we'll get different results with different hardware/OS. What's interesting is that I see the same performance issue on macOS when using Chrome/Edge but not Safari.
I wasn't sure if there was a better way to show the log here; let me know if you want me to trim it:
Graphics Feature Status
Canvas: Hardware accelerated
Canvas out-of-process rasterization: Disabled
Direct Rendering Display Compositor: Disabled
Compositing: Hardware accelerated
Multiple Raster Threads: Enabled
OpenGL: Enabled
Rasterization: Hardware accelerated
Raw Draw: Disabled
Video Decode: Hardware accelerated
Video Encode: Hardware accelerated
Vulkan: Disabled
WebGL: Hardware accelerated
WebGL2: Hardware accelerated
WebGPU: Hardware accelerated
Driver Bug Workarounds
check_ycbcr_studio_g22_left_p709_for_nv12_support
clear_uniforms_before_first_program_use
decode_encode_srgb_for_generatemipmap
disable_accelerated_av1_encode
disable_decode_swap_chain
disable_direct_composition_sw_video_overlays
disable_dynamic_video_encode_framerate_update
disable_vp_super_resolution
enable_bgra8_overlays_with_yuv_overlay_support
enable_webgl_timer_query_extensions
exit_on_context_lost
max_msaa_sample_count_4
msaa_is_slow
msaa_is_slow_2
no_downscaled_overlay_promotion
disabled_extension_GL_KHR_blend_equation_advanced
disabled_extension_GL_KHR_blend_equation_advanced_coherent
Problems Detected
Some drivers are unable to reset the D3D device in the GPU process sandbox
Applied Workarounds: exit_on_context_lost
Clear uniforms before first program use on all platforms: 124764, 349137
Applied Workarounds: clear_uniforms_before_first_program_use
On Intel GPUs MSAA performance is not acceptable for GPU rasterization: 527565, 1298585
Applied Workarounds: msaa_is_slow
Disable KHR_blend_equation_advanced until cc shaders are updated: 661715
Applied Workarounds: disable(GL_KHR_blend_equation_advanced), disable(GL_KHR_blend_equation_advanced_coherent)
Decode and Encode before generateMipmap for srgb format textures on Windows: 634519
Applied Workarounds: decode_encode_srgb_for_generatemipmap
Expose WebGL's disjoint_timer_query extensions on platforms with site isolation: 808744, 870491
Applied Workarounds: enable_webgl_timer_query_extensions
Disable DecodeSwapChain for Intel Gen9 and older
devices: 1107403 Applied Workarounds: disable_decode_swap_chain Intel GPUs fail to report BGRA8 overlay support: 1119491 Applied Workarounds: enable_bgra8_overlays_with_yuv_overlay_support 8x MSAA for WebGL contexts is slow on Win Intel: 1145793 Applied Workarounds: max_msaa_sample_count_4 Disable software overlays for Intel GPUs. All Skylake+ devices support hw overlays, older devices peform poorly.: 1192748 Applied Workarounds: disable_direct_composition_sw_video_overlays Check YCbCr_Studio_G22_Left_P709 color space for NV12 overlay support on Intel: 1103852 Applied Workarounds: check_ycbcr_studio_g22_left_p709_for_nv12_support Intel GPUs do not promote downscaled overlays: 1245835 Applied Workarounds: no_downscaled_overlay_promotion AVC/AV1 hardware encoder MFT output bitrate incorrect upon framerate update on Intel GPUs.: 1295815 Applied Workarounds: disable_dynamic_video_encode_framerate_update Don't use video processor super resolution on Intel Gen9 and older GPUs and non-Intel GPUs.: 1318380 Applied Workarounds: disable_vp_super_resolution On pre-Ice Lake Intel GPUs MSAA performance is not acceptable for GPU rasterization: 527565, 1298585, 1341830 Applied Workarounds: msaa_is_slow_2 Disable hardware MFT Av1 encoder on machines with multiple GPUs: 1367038 Applied Workarounds: disable_accelerated_av1_encode ANGLE Features allowCompressedFormats (Frontend workarounds): Enabled: true Allow compressed formats cacheCompiledShader (Frontend features) anglebug:7036: Disabled Enable to cache compiled shaders disableAnisotropicFiltering (Frontend workarounds): Disabled Disable support for anisotropic filtering disableProgramBinary (Frontend features) anglebug:5007: Disabled Disable support for GL_OES_get_program_binary disableProgramCachingForTransformFeedback (Frontend workarounds): Disabled On some GPUs, program binaries don't contain transform feedback varyings emulatePixelLocalStorage (Frontend features) anglebug:7279: Disabled: false Emulate 
ANGLE_shader_pixel_local_storage using shader images enableCaptureLimits (Frontend features) anglebug:5750: Disabled Set the context limits like frame capturing was enabled enableCompressingPipelineCacheInThreadPool (Frontend workarounds) anglebug:4722: Disabled: false Enable compressing pipeline cache in thread pool. enableProgramBinaryForCapture (Frontend features) anglebug:5658: Disabled Even if FrameCapture is enabled, enable GL_OES_get_program_binary forceDepthAttachmentInitOnClear (Frontend workarounds) anglebug:7246: Disabled: isAMD Force depth attachment initialization on clear ops forceGlErrorChecking (Frontend features) https://issuetracker.google.com/220069903: Disabled Force GL error checking (i.e. prevent applications from disabling error checking forceInitShaderVariables (Frontend features): Disabled Force-enable shader variable initialization forceRobustResourceInit (Frontend features) anglebug:6041: Disabled Force-enable robust resource init loseContextOnOutOfMemory (Frontend workarounds): Enabled: true Some users rely on a lost context notification if a GL_OUT_OF_MEMORY error occurs scalarizeVecAndMatConstructorArgs (Frontend workarounds) 1165751: Disabled: false Always rewrite vec/mat constructors to be consistent addMockTextureNoRenderTarget (D3D workarounds) anglebug:2152: Disabled: isIntel && capsVersion >= IntelDriverVersion(160000) && capsVersion < IntelDriverVersion(164815) On some drivers when rendering with no render target, two bugs lead to incorrect behavior allowClearForRobustResourceInit (D3D workarounds) 941620: Enabled: true Some drivers corrupt texture data when clearing for robust resource initialization. allowES3OnFL100 (D3D workarounds): Disabled: false Allow ES3 on 10.0 devices allowTranslateUniformBlockToStructuredBuffer (D3D workarounds) anglebug:3682: Enabled: IsWin10OrGreater() There is a slow fxc compile performance issue with dynamic uniform indexing if translating a uniform block with a large array member to cbuffer. 
callClearTwice (D3D workarounds) 655534: Disabled: isIntel && isSkylake && capsVersion >= IntelDriverVersion(160000) && capsVersion < IntelDriverVersion(164771) Using clear() may not take effect depthStencilBlitExtraCopy (D3D workarounds) anglebug:1452: Disabled Bug in some drivers triggers a TDR when using CopySubresourceRegion from a staging texture to a depth/stencil disableB5G6R5Support (D3D workarounds): Disabled: (isIntel && capsVersion >= IntelDriverVersion(150000) && capsVersion < IntelDriverVersion(154539)) || isAMD Textures with the format DXGI_FORMAT_B5G6R5_UNORM have incorrect data disableRasterizerOrderViews (D3D workarounds) anglebug:7279: Disabled Disable ROVs for testing emulateIsnanFloat (D3D workarounds) 650547: Disabled: isIntel && isSkylake && capsVersion >= IntelDriverVersion(160000) && capsVersion < IntelDriverVersion(164542) Using isnan() on highp float will get wrong answer emulateTinyStencilTextures (D3D workarounds): Disabled: isAMD && !(deviceCaps.featureLevel < D3D_FEATURE_LEVEL_10_1) 1x1 and 2x2 mips of depth/stencil textures aren't sampled correctly expandIntegerPowExpressions (D3D workarounds): Enabled: true The HLSL optimizer has a bug with optimizing 'pow' in certain integer-valued expressions flushAfterEndingTransformFeedback (D3D workarounds): Disabled: isNvidia Some drivers sometimes write out-of-order results to StreamOut buffers when transform feedback is used to repeatedly write to the same buffer positions forceAtomicValueResolution (D3D workarounds) anglebug:3246: Disabled: isNvidia On some drivers the return value from RWByteAddressBuffer.InterlockedAdd does not resolve when used in the .yzw components of a RWByteAddressBuffer.Store operation getDimensionsIgnoresBaseLevel (D3D workarounds): Disabled: isNvidia Some drivers do not take into account the base level of the texture in the results of the HLSL GetDimensions builtin mrtPerfWorkaround (D3D workarounds): Enabled: true Some drivers have a bug where they ignore null 
render targets preAddTexelFetchOffsets (D3D workarounds): Enabled: isIntel HLSL's function texture.Load returns 0 when the parameter Location is negative, even if the sum of Offset and Location is in range rewriteUnaryMinusOperator (D3D workarounds): Disabled: isIntel && (isBroadwell || isHaswell) && capsVersion >= IntelDriverVersion(150000) && capsVersion < IntelDriverVersion(154624) Evaluating unary minus operator on integer may get wrong answer in vertex shaders selectViewInGeometryShader (D3D workarounds): Disabled: !deviceCaps.supportsVpRtIndexWriteFromVertexShader The viewport or render target slice will be selected in the geometry shader stage for the ANGLE_multiview extension setDataFasterThanImageUpload (D3D workarounds): Enabled: !(isIvyBridge || isBroadwell || isHaswell) Set data faster than image upload skipVSConstantRegisterZero (D3D workarounds): Disabled: isNvidia In specific cases the driver doesn't handle constant register zero correctly useInstancedPointSpriteEmulation (D3D workarounds): Disabled: isFeatureLevel9_3 Some D3D11 renderers do not support geometry shaders for pointsprite emulation useSystemMemoryForConstantBuffers (D3D workarounds) 593024: Enabled: isIntel Copying from staging storage to constant buffer storage does not work zeroMaxLodWorkaround (D3D workarounds): Disabled: isFeatureLevel9_3 Missing an option to disable mipmaps on a mipmapped texture DAWN Info
Did more debugging today. The bottleneck is in luma.gl here: https://github.com/visgl/luma.gl/blob/337c12325d3aa1699245fbdc0fba7f9252a29996/modules/webgl/src/classes/program.js#L360
The code above queries the number of uniforms from the program.
I added some timing to the code in luma.gl and here's the result:
The first number is the time taken by that line of code in luma.gl, and the second number is the total time spent drawing a simple PathLayer for the first time.
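The timing described above can be approximated with a small wrapper around the WebGL query. This is only a sketch, not the instrumentation that was actually added to luma.gl: it assumes the slow line ends up calling `gl.getProgramParameter(program, gl.ACTIVE_UNIFORMS)`, and the helper name `timeActiveUniformQuery` and its return shape are made up for illustration.

```javascript
// Sketch: measure how long the ACTIVE_UNIFORMS query takes on a program.
// `gl` is expected to be a WebGLRenderingContext or WebGL2RenderingContext
// and `program` a linked WebGLProgram. The helper name is hypothetical.
function timeActiveUniformQuery(gl, program) {
  const start = performance.now();
  // Standard WebGL call. The browser may have to wait here until shader
  // compilation and program linking have actually finished.
  const uniformCount = gl.getProgramParameter(program, gl.ACTIVE_UNIFORMS);
  const elapsedMs = performance.now() - start;
  return {uniformCount, elapsedMs};
}
```

Comparing `elapsedMs` with the total time of the first draw shows how much of the delay is spent in this single query rather than in layer setup.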
Description
Adding a GeoJSON layer with one LineString feature (two vertices) takes about 500ms. I was only able to reproduce this in Chrome and Firefox on Windows and macOS. I don't see this behavior when using Safari on iOS or macOS.
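For reference, the input in question is about as small as GeoJSON gets. The snippet below is a hypothetical example of such data, not the exact payload from the CodePen; the coordinates are arbitrary placeholders and `countLineVertices` is an illustrative helper.

```javascript
// Hypothetical minimal input: a FeatureCollection holding one LineString
// with two vertices. Coordinates are placeholders ([longitude, latitude]).
const singleLineGeoJson = {
  type: 'FeatureCollection',
  features: [
    {
      type: 'Feature',
      properties: {},
      geometry: {
        type: 'LineString',
        coordinates: [
          [-122.45, 37.78],
          [-122.4, 37.8]
        ]
      }
    }
  ]
};

// Counts the vertices of all LineString features in a FeatureCollection,
// to confirm how little geometry is actually being uploaded.
function countLineVertices(collection) {
  return collection.features
    .filter((feature) => feature.geometry.type === 'LineString')
    .reduce((total, feature) => total + feature.geometry.coordinates.length, 0);
}
```

With only two vertices, the geometry itself is unlikely to account for a delay of several hundred milliseconds.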
Flavors
Expected Behavior
I expected that creating a layer in its simplest form would be almost instantaneous.
Steps to Reproduce
To reproduce the issue, please run the following CodePen: https://codepen.io/urdeveloper-the-flexboxer/pen/dyKzgeB
Click on Draw Path, then Draw Point, then Draw Path. The delay to draw a red line is very noticeable. Drawing the red point is roughly three times faster but still slow in my opinion.
Environment
Logs
The warnings from Chrome:
(After clicking on Draw Path) [Violation] 'requestAnimationFrame' handler took 593ms
(After clicking on Draw Point) [Violation] 'requestAnimationFrame' handler took 164ms
(After clicking on Draw Point again) [Violation] 'requestAnimationFrame' handler took 560ms
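Chrome only prints these warnings when a handler exceeds its threshold, so for comparing browsers it may help to measure the frame callback directly. A minimal sketch, assuming you control the callback passed to `requestAnimationFrame`; `timedFrameHandler` and `report` are hypothetical names and not part of deck.gl or luma.gl.

```javascript
// Sketch: wrap a frame callback so that its duration is reported on each call.
// `handler` is the original requestAnimationFrame callback; `report` receives
// the elapsed time in milliseconds. Both names are illustrative.
function timedFrameHandler(handler, report) {
  return function timedHandler(timestamp) {
    const start = performance.now();
    try {
      // Run the original callback with the timestamp the browser supplies.
      handler(timestamp);
    } finally {
      // Report even if the handler throws, so slow failing frames are visible.
      report(performance.now() - start);
    }
  };
}
```

In a browser this could be used as `requestAnimationFrame(timedFrameHandler(draw, (ms) => console.log(ms)))`, where `draw` stands in for whatever function triggers the first render of the layer.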