KhronosGroup / OpenGL-API

OpenGL and OpenGL ES API Issue Tracker
34 stars 5 forks source link

OpenGL 4.6: Do atomic counter buffers require the use of `glMemoryBarrier` calls to be able to access the counter? #14

Closed NicolBolas closed 4 years ago

NicolBolas commented 7 years ago

The specification seems rather ambiguous about this.

On the one hand, section 7.12 says this:

As described in the OpenGL Shading Language Specification, shaders may perform random-access reads and writes to buffer object memory by reading from, assigning to, or performing atomic memory operation on shader buffer variables, or to texture or buffer object memory by using built-in image load, store, and atomic functions operating on shader image variables. The ability to perform such random-access reads and writes in systems that may be highly pipelined results in ordering and synchronization issues discussed in the sections below.

Atomic counters are conspicuous by their absence. Atomic counters are not "shader image variables" or "shader buffer variables." As such, the rest of 7.12 should not apply to them.

The other thing is that the ARB_shader_atomic_counter extension doesn't state any dependency on ARB_shader_image_load_store, which is where glMemoryBarrier comes from. That suggests that it's perfectly possible for an implementation to give us ARB_shader_atomic_counter without ARB_shader_image_load_store. That the two are essentially independent.

And yet... the GL_ATOMIC_COUNTER_BARRIER_BIT is described as:

Memory accesses using shader atomic counter built-in functions issued after the barrier will reflect data written by shaders prior to the barrier. Additionally, atomic counter function invocations after the barrier will not execute until all memory accesses (e.g., loads, stores, texture fetches, vertex fetches) initiated prior to the barrier complete.

What is curious about this is the "all memory accesses" part. This specific wording is only used in two other bits: shader storage and shader image. Other barriers for coherent operations like query buffers and transform feedbacks tend to use the term "all shader writes".

The wording appears to suggest that, without this bit, atomic counter operations could execute before, for example, vertex fetches from that buffer have completed.

One could assume that if you render using a particular buffer, binding the buffer for use as an atomic counter and rendering with a shader that does atomic counter operations would be sufficient to force the implementation to synchronize between those two states. But the language of the atomic counter barrier bit suggests that you need an explicit barrier.

If the intent is that atomic counters are implicitly synchronized like query buffers and most other GL writing operations, then the language of the atomic counter barrier should be changed. However, if the intent is that atomic counters require explicit barriers, then the first paragraph of section 7.12 should be updated to explicitly include atomic counters as a "shader write" operation.

And of course that ARB_shader_atomic_counter should probably have some requirement on ARB_shader_image_load_store.

pdaniell-nv commented 6 years ago

We have an internal analysis of this issue, which can be seen by members here: https://gitlab.khronos.org/opengl/WG-Docs/wikis/shader_atomic_counters_coherency.

We are discussing this issue internally and hope to come to a resolution soon.

pdaniell-nv commented 6 years ago

We discussed this again in the OpenGL|ES meeting. Based on feedback from IHVs and their implementation of atomic counters we're planning to treat them like we treat other resources like image atomic, image load/store, buffer variables, etc. in that they require explicit synchronization from the application. The spec will be changed to add "atomic counters" to the places where the other resources are enumerated.

We also observed that OpenGL ES has support for atomic counters and may also need to be fixed in the same way.

pdaniell-nv commented 6 years ago

I believe this fix is still outstanding. Assigning @oddhack to pick this up for the next API update. The changes should be made to both OpenGL and OpenGL ES:

In "GL:7.13/ES:7.11 Shader Memory Access" of both specs, add the following:

..., shaders may perform random-access reads and writes to buffer object memory by reading from, assigning to, or performing atomic memory operation on shader buffer variables, or to texture or buffer object memory by using built-in image load, store, and atomic functions operating on shader image variables, or performing atomic operations on atomic counter variables.

oddhack commented 5 years ago

Fixed internally - this will show up in the next public spec update.

oddhack commented 4 years ago

Fixed in the 2019-10-22 spec updates.