KhronosGroup / WebGL

The Official Khronos WebGL Repository
Other
2.63k stars 668 forks source link

change WebGL2 spec slightly to help devs find race conditions? #2692

Open greggman opened 6 years ago

greggman commented 6 years ago

Currently the WebGL2 spec for getQueryParameter says

In order to ensure consistent behavior across platforms, queries' results must only be made available when the user agent's event loop is not executing a task. In other words:

  • A query's result must not be made available until control has returned to the user agent's main loop.
  • Repeatedly fetching a query's QUERY_RESULT_AVAILABLE parameter in a loop, without returning control to the user agent, must always return the same value.

getSyncParameter has similar wording

I'd like to suggest the spec change here that the user must call getQueryParameter(QUERY_RESULT_AVAILABLE) and receive true and that if they don't then calling getQueryParameter(QUERY_RESULT) generates INVALID_OPERATION

As it is a dev can just call getQueryParameter(QUERY_RESULT) and assume the result is available. It might not be and they have a race in their code that is nearly impossible to find. It might just be some intermittent flicker. It might also only happen on some user's hardware which the dev has no access too. They might not have waited a frame (porting native code) and just assumed that 50% of the way through a frame reading the query result was OK.

To help find these issues I can see at least 2 possible solutions.

  1. Require the WebGL2 implementation to check QUERY_RESULT_AVAILABLE when the user queries QUERY_RESULT. You can imagine that implementation as
// pseudo code
bool WebGL2RenderingContext::getQueryParameter(WebGLQuery* query, GLenum pname) {
   switch (pname) {
      case GL_QUERY_RESULT: {
         GLuint available;
         glGetQueryObjectuiv(query->id(), GL_QUERY_RESULT_AVAILABLE, &available);
         if (!available) {
             WebGL2RenderingContext::generateError(GL_INVALID_OPERATION, "tried to query result when result not available");
             return null;
         }
         GLuint result;
         glGetQueryObjectuiv(query->id(), GL_QUERY_RESULT, &result);
         return result;
   ....
   }
}

The problem with this solution is there is no way to test it since it requires a query to not be ready in order to test. At best you can test that the user didn't get the result in the same frame but you can not test that having waited for the next frame a race was caught.

Vs the suggested solution of requiring the user to query QUERY_RESULT_AVAILABLE and get true could be something like

// pseudo code
bool WebGL2RenderingContext::getQueryParameter(WebGLQuery* query, GLenum pname) {
   switch (pname) {
      case GL_QUERY_RESULT_AVAILABLE: {
         GLuint available;
         glGetQueryObjectuiv(query->id(), GL_QUERY_RESULT_AVAILABLE, &available);
         // mark the user correctly checked the query is available. 
         // (was cleared when the query is submitted)
         query.setAvailable(available);  
         return available;
      case QUERY_RESULT:
         if (!query->getAvailable()) {
             WebGL2RenderingContext::generateError(GL_INVALID_OPERATION, "tried to query result without checking if result available and getting true");
             return null;
         }
         GLuint result;
         glGetQueryObjectuiv(query->id(), GL_QUERY_RESULT, &result);
         return result;
   ....
   }
}

Now it's easily testable.

This shouldn't affect well written programs. It's trivial for devs to fix their code if they aren't currently calling getQueryParameter(gl.QUERY_RESULT_AVAILABLE). And, even if they don't check the result of QUERY_RESULT_AVAILABLE WebGL at least has an easy way to generate an error telling them they have a race condition.

It seems like the same would apply to getSyncParameter and other future async stuff?

kdashg commented 6 years ago

I don't believe it's technically required to ask for each query one-by-one, since (I believe) you can determine that they're ready via fences. (or finish()) Modulo our next-frame rules, of course)

FWIW, this in an area where browsers should (and perhaps already do) issue warnings.

On Tue, Aug 7, 2018 at 6:40 PM, Greggman notifications@github.com wrote:

Currently the WebGL2 spec for getQueryParameter says

In order to ensure consistent behavior across platforms, queries' results must only be made available when the user agent's event loop is not executing a task. In other words:

  • A query's result must not be made available until control has returned to the user agent's main loop.
  • Repeatedly fetching a query's QUERY_RESULT_AVAILABLE parameter in a loop, without returning control to the user agent, must always return the same value.

getSyncParameter has similar wording

I'd like to suggest the spec change here that the user must call getQueryParameter(QUERY_RESULT_AVAILABLE) and receive true and that if they don't then calling getQueryParameter(QUERY_RESULT) generates INVALID_OPERATION

As it is a dev can just call getQueryParameter(QUERY_RESULT) and assume the result is available. It might not be and they have a race in their code that is nearly impossible to find. It might just be some intermittent flicker. It might also only happen on some user's hardware which the dev has no access too. They might not have waited a frame (porting native code) and just assumed that 50% of the way through a frame reading the query result was OK.

To help find these issues I can see at least 2 possible solutions.

  1. Require the WebGL2 implementation to check QUERY_RESULT_AVAILABLE when the user queries QUERY_RESULT. You can imagine that implementation as

// pseudo code bool WebGL2RenderingContext::getQueryParameter(WebGLQuery* query, GLenum pname) { switch (pname) { case GL_QUERY_RESULT: { GLuint available; glGetQueryObjectuiv(query->id(), GL_QUERY_RESULT_AVAILABLE, &available); if (!available) { WebGL2RenderingContext::generateError(GL_INVALID_OPERATION, "tried to query result when result not available"); return null; } GLuint result; glGetQueryObjectuiv(query->id(), GL_QUERY_RESULT, &result); return result; .... } }

The problem with this solution is there is no way to test it since it requires a query to not be ready in order to test. At best you can test that the user didn't get the result in the same frame but you can not test that having waited for the next frame a race was caught.

Vs the suggested solution of requiring the user to query QUERY_RESULT_AVAILABLE and get true could be something like

// pseudo code bool WebGL2RenderingContext::getQueryParameter(WebGLQuery* query, GLenum pname) { switch (pname) { case GL_QUERY_RESULT_AVAILABLE: { GLuint available; glGetQueryObjectuiv(query->id(), GL_QUERY_RESULT_AVAILABLE, &available); // mark the user correctly checked the query is available. // (was cleared when the query is submitted) query.setAvailable(available); return available; case QUERY_RESULT: if (!query->getAvailable()) { WebGL2RenderingContext::generateError(GL_INVALID_OPERATION, "tried to query result without checking if result available and getting true"); return null; } GLuint result; glGetQueryObjectuiv(query->id(), GL_QUERY_RESULT, &result); return result; .... } }

Now it's easily testable.

This shouldn't affect well written programs. It's trivial for devs to fix their code if they aren't currently calling getQueryParameter(gl.QUERY_ RESULT_AVAILABLE). And, even if they don't check the result of QUERY_RESULT_AVAILABLE WebGL at least has an easy way to generate an error telling them they have a race condition.

It seems like the same would apply to getSyncParameter and other future async stuff?

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/KhronosGroup/WebGL/issues/2692, or mute the thread https://github.com/notifications/unsubscribe-auth/ABZ0jU7cpUkWLm3HQAbn1soi_mTDKLcHks5uOkGagaJpZM4VzIzC .

kenrussell commented 6 years ago

@jdashg agreed. It's not a good compatibility decision to require users to check QUERY_RESULT_AVAILABLE before asking for QUERY_RESULT.

It would be fairly straightforward to generate INVALID_OPERATION from getQueryParameter(query, gl.QUERY_RESULT) if asking for QUERY_RESULT_AVAILABLE would return false. Open to making that change in the spec and Chromium. @jdashg what do you think?

Sync objects only have one queryable parameter which can change – SYNC_STATUS – so getSyncParameter shouldn't generate an error if that would return UNSIGNALED.