PixarAnimationStudios / OpenUSD

Universal Scene Description
http://www.openusd.org
Other
5.48k stars 1.14k forks source link

MESA glsl compiler does not like "centroid" on structure member produced by storm shader #1429

Closed spitzak closed 1 year ago

spitzak commented 3 years ago

Description of Issue

This may be a problem with my build or install, but I am getting a shader compiler error which appears to be fixable if it thinks my glsl is an earlier version.

The error is caused by centroid in struct OutputVertex. However there is a test for #if __VERSION__ < 420 that redefines centroid to be an empty string, apparently removing this error for earlier versions of glsl.

Either my system is reporting the incorrect version, or the shader compiler has a bug, or USD's choice of what version this ability was added is incorrect.

The shader contains #version 460 at the start, which is further confusing as looking at glslProgram.cpp it clearly puts #version 430 at the start of the shader string. All I can guess is that the build edited this c++ code before compiling, but I cannot find this file or any sign of how this is done. I am also wondering if the compilation may edit this string in-place. I have also been unable to locate the code that produces this shader, none of the keywords including the comments can be found by using "git grep" and I cannot locate calls to outside code that produces it.

Assuming this is a problem with my install, would appreciate any help or instruction on how to fix my install or to patch storm so that this version number is changed or the words "centroid" are removed from the shader.

Steps to Reproduce

Shader was dumped by export TF_DEBUG=HDST_DUMP_FAILING_SHADER_SOURCEFILE

System Information (OS, Hardware)

Linux Mint 20

Amdgpu-pro drivers (probably relevant, if AMD is not supported at all please tell me, but Hydra renders the kitchen and some other scenes just fine).

glxinfo excerpt:

Extended renderer info (GLX_MESA_query_renderer):
    Vendor: X.Org (0x1002)
    Device: Radeon Vega Frontier Edition (VEGA10, DRM 3.40.0, 5.4.0-60-generic, LLVM 10.0.1) (0x6863)
    Version: 20.1.6
    Accelerated: yes
    Video memory: 16384MB
    Unified memory: no
    Preferred profile: core (0x1)
    Max core profile version: 4.6
    Max compat profile version: 4.6
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.2
Memory info (GL_ATI_meminfo):
    VBO free memory - total: 15890 MB, largest block: 15890 MB
    VBO free aux. memory - total: 14871 MB, largest block: 14871 MB
    Texture free memory - total: 15890 MB, largest block: 15890 MB
    Texture free aux. memory - total: 14871 MB, largest block: 14871 MB
    Renderbuffer free memory - total: 15890 MB, largest block: 15890 MB
    Renderbuffer free aux. memory - total: 14871 MB, largest block: 14871 MB
Memory info (GL_NVX_gpu_memory_info):
    Dedicated video memory: 16384 MB
    Total available memory: 32752 MB
    Currently available dedicated video memory: 15890 MB
OpenGL vendor string: X.Org
OpenGL renderer string: Radeon Vega Frontier Edition (VEGA10, DRM 3.40.0, 5.4.0-60-generic, LLVM 10.0.1)
OpenGL core profile version string: 4.6 (Core Profile) Mesa 20.1.6
OpenGL core profile shading language version string: 4.60

Build Flags

python build_scripts/build_usd.py ~/install/USD

spitzak commented 3 years ago

program1_shader0_GL_GEOMETRY_SHADER.glsl.txt

jilliene commented 3 years ago

Filed as internal issue #USD-6540

poljere commented 3 years ago

Bill, thanks for your report.

Would you mind running with TF_DEBUG=GLF_DEBUG_CONTEXT_CAPS and sharing the output here? I wonder if there is any difference between your glx info and what Storm sees internally.

FYI, Storm requires GL 4.5+, the confusing string in glslProgram.cpp is for compute shaders, and we should update to also be generated using the version caps of the client.

Finally, I wanted to mention that this code specifically seems to be generated by OSD but let's investigate a bit more before moving it there.

spitzak commented 3 years ago
GlfContextCaps: 
  GL_VENDOR                          = X.Org
  GL_RENDERER                        = Radeon Vega Frontier Edition (VEGA10, DRM 3.40.0, 5.4.0-62-generic, LLVM 10.0.1)
  GL_VERSION                         = 4.6 (Compatibility Profile) Mesa 20.1.6
  GL version                         = 460
  GLSL version                       = 460
  GL_MAX_UNIFORM_BLOCK_SIZE          = 2147483392
  GL_MAX_SHADER_STORAGE_BLOCK_SIZE   = 2147483392
  GL_MAX_TEXTURE_BUFFER_SIZE         = 2147483392
  GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT = 4
  ARB_bindless_texture               = 1
  ARB_direct_state_access            = 1
  ARB_explicit_uniform_location      = 1
  ARB_multi_draw_indirect            = 1
  ARB_shader_draw_parameters         = 1
  ARB_shader_storage_buffer_object   = 1
  ARB_shading_language_420pack       = 1
  NV_shader_buffer_load              = 0
Warning: in CompileShader at line 262 of /home/spitzak/USD/pxr/imaging/hdSt/glslProgram.cpp -- Failed to compile shader (GL_GEOMETRY_SHADER): 0:532(2): error: only precision and image qualifiers may be applied to structure members
0:533(2): error: only precision and image qualifiers may be applied to structure members
 ... repeats many times ...
poljere commented 3 years ago

Looking at the error actually, it does sound like Mesa is more restrictive (or other drivers more permissive) about using centroids within structs.

We will need to review this internally and figure out if this is an OpenSubdiv issue and find an appropriate solution.

spitzak commented 3 years ago

I think it may be possible to convince Mesa developers to ignore this rather than produce an error.

spitzak commented 3 years ago

Error is in the Mesa parser: https://github.com/mesa3d/mesa/blob/master/src/compiler/glsl/glsl_parser.yy#L2434

spitzak commented 2 years ago

Any chance this could be looked into? I have confirmed this problem still exists with the newest dev branch and Linux Mint 20.3.

Failed to compile shader (FRAGMENT_SHADER): 0:1608(2): error: only precision qualifiers may be applied to structure members
0:1609(2): error: only precision qualifiers may be applied to structure members
davidgyu commented 2 years ago

Hi Bill, Thanks again for your patience with this!

These "centroid" interpolation qualifiers are coming from OpenSubdiv shader source and injected into Hydra Storm's shaders. I think these qualifiers should be removed from OpenSubdiv's struct member declarations, and until that happens we might be able to defensively guard against them where Storm includes these shaders.

Unfortunately, we don't have a timeline at the moment for when we might be able to do this work.

spitzak commented 2 years ago

I'm actually having a hard time locating where OpenSubdiv is located on my machine, is there any chance this stuff is in a file there that I can edit?

It seems like the fastest and best fix is to change Mesa to ignore (or only warn) about them. I am confused about why these attributes should actually be removed or ignored:

It does sound like they do something to interpolation of samples, but the effect is minor and you will get a very-close-to-correct render if the attributes were ignored. Thus unless openSubdiv is actually wrong about using them it seems like they should stay, and Mesa be made to ignore them.

I have no idea if or whether Pixar has enough clout to get the Mesa developers to fix this.

davidgyu commented 2 years ago

If you are building USD locally, e.g. by using build_usd.py, then you might be building OpenSubdiv locally as part of your USD build. If so, then you might be able to patch your build locally to edit the file which corresponds to opensubdiv/osd/glslPatchCommon.glsl and remove these "centroid" auxiliary storage qualifiers.

Additional background: I think Mesa is correct to flag this as an error, the GLSL spec is clear that "Function return types and structure members do not use storage qualifiers". You are correct that the effect of the storage and interpolation qualifiers can be subtle, but significant. Our plan is to remove these from OpenSubdiv core shader code since client shader code is a better place to deal with these concerns. These specific "centroid" storage qualifiers have no effect on the correctness of Hydra Storm (other than raising shader compilation errors for your use case!).

Apologies for the continued presence of this long-standing bug in the OpenSubdiv GLSL shader source. We plan to address this, we just can't tell you yet when we will be able to do so.

spitzak commented 2 years ago

It does look like the build script got the source for OpenSubdiv from somewhere and compiled it locally. However I have not figured out how to get it to be recompiled with this code changed.

I edited ~/install/src/OpenSubdiv-3_4_3/opensubdiv/osd/glslPatchCommon.glsl and removed the "centroid"/ Then tried running python build_scripts/build_usd.py ~/install again. This did not cause anything to happen.

I then removed all binary files found by grep -r "centroid vec4" <installdir> and tried again and I got:

No rule to make target '<installdir>/lib/libosdGPU.so', needed by 'pxr/imaging/pxOsd/libusd_pxOsd.so'. Stop.

sunyab commented 2 years ago

@spitzak build_usd.py likely sees that you've already downloaded and built OpenSubdiv so it skips trying to rebuild that directory. If you remove ~/install/include/opensubdiv and then re-run build_usd.py, it should get OSD to rebuild.

davidgyu commented 1 year ago

The non-standard "centroid" qualifiers were removed starting with OpenSubdiv 3.5.x and this problem has been fixed in USD since v23.02 w/ OpenSubdiv 3.5.x.

I don't have your exact system configuration available but I've just re-tested with v23.05-rc1 and confirmed that this is no longer a problem when running on Windows 10 or Ubuntu 22.04 w/ AMD, NVIDIA, or Intel GPUs (including Mesa 22.2.5 as well as vendor drivers on Ubuntu).

We're going to mark this issue closed but let us know if there are other new issues. Thanks!