joelgwebber / quake2-web

Automatically exported from code.google.com/p/quake2-gwt-port
GNU General Public License v2.0
5 stars 9 forks source link

Game fails on latest Chromium nightly (As of Apr 3) without --no-sandbox #11

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
The latest Chromium builds can run WebGL without the --no-sandbox option. But 
the game fails in 
this configuration, possibly because WebGL is a strict OpenGL ES 
implementation, and we're passing 
an invalid GL enum somewhere.

See this Chromium issue:
http://code.google.com/p/chromium/issues/detail?id=40151

Original issue reported on code.google.com by joelgwebber on 6 Apr 2010 at 2:43

GoogleCodeExporter commented 8 years ago
Trying to track this down by compiling with -style pretty and cutting calls out 
until we find the source of the 
failure.

So far, we have R_Init() generating GL errors:
- R_InitParticleTexture(): 0x502 (GL_INVALID_OPERATION)
- [somewhere else] 0x500 (GL_INVALID_ENUM)

Original comment by joelgwebber on 6 Apr 2010 at 2:46

GoogleCodeExporter commented 8 years ago
You might want to try wrapping the WebGL context with a JavaScript object that 
checks
all calls.

See this code

http://www.khronos.org/webgl/wiki/Debugging

https://cvs.khronos.org/svn/repos/registry/trunk/public/webgl/sdk/debug/webgl-de
bug.js

Original comment by g...@google.com on 6 Apr 2010 at 3:46

GoogleCodeExporter commented 8 years ago
Thanks, gman. That really helped. Here are the cases that are generating 
problems:

Misc.GL_SetDefaultState():
 INVALID_ENUM:  Misc.java:208 glEnable(GL_TEXTURE_2D)
 INVALID_ENUM:  Misc.java:210 glEnable(GL_ALPHA_TEST)
 INVALID_VALUE: Misc.java:226 glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, gl_filter_min == 9985)
 INVALID_VALUE: Misc.java:227 glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, gl_filter_max == 9729)
 INVALID_VALUE: Misc.java:229 glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT)
 INVALID_VALUE: Misc.java:230 glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT)

Main.R_Init():
 Main.java:1288 glGetString(err == 1280)

Main.BeginFrame():
 INVALID_ENUM: Main.java:1377 glEnable(GL_ALPHA_TEST)

Main.R_Clear():
 INVALID_FRAMEBUFFER_OPERATION: Main.java:806 glClear(GL_DEPTH_BUFFER_BIT)
 INVALID_FRAMEBUFFER_OPERATION: Main.java:820 glDepthFunc(GL_LEQUAL)

GLAdapter.glEnd():
 INVALID_FRAMEBUFFER_OPERATION: GLAdapter.java:453 glDrawArrays(GL_TRIANGLE_FAN, i, 4);

Original comment by joelgwebber on 6 Apr 2010 at 5:41

GoogleCodeExporter commented 8 years ago
Whoops, the third line should read:
INVALID_VALUE: Misc.java:227 glTexParameterf(GL_TEXTURE_2D, 
GL_TEXTURE_MAG_FILTER, gl_filter_max == 
9729)

Original comment by jgw@google.com on 6 Apr 2010 at 6:19

GoogleCodeExporter commented 8 years ago
I disabled all the calls to apparently illegal-in-webgl stuff:
- glEnable/Disable(GL_TEXTURE_2D)
- glEnable/Disable(GL_ALPHA_TEST)
- glTexParameter(...)

as well as a call to R_InitParticleTexture(), which ultimately causes an error 
from glTexImage2D(), and it now 
starts up without errors.

With --no-sandbox, it (unsurprisingly) renders without textures, but otherwise 
works. With the sandbox 
enabled, it doesn't report problems, but only renders a black screen. I seem to 
recall something on another 
thread about using arbitrary ints for texture ids (as opposed to using 
createTexture()/glGenTextures()) being 
problematic. Anyone know if that's still the case? If so, it could be a major 
source of problems.

Original comment by joelgwebber on 6 Apr 2010 at 7:15

GoogleCodeExporter commented 8 years ago
Neither the sandboxed nor the in-process WebGL implementations in Chrome 
support passing an integer to 
bindTexture.

One OpenGL ES 2.0 semantic enforced by the sandboxed implementation is 
non-power-of-two texture 
handling. If the top level of a texture is NPOT, then the minification filter 
must be either NEAREST or LINEAR, and 
the S and T wrap modes must be CLAMP_TO_EDGE. Otherwise the texture will render 
as black.

Original comment by kbr@chromium.org on 6 Apr 2010 at 8:57

GoogleCodeExporter commented 8 years ago
Nevermind, I'm an idiot -- can't read my own code. Looking into those texture 
restrictions now...

Original comment by joelgwebber on 6 Apr 2010 at 9:08

GoogleCodeExporter commented 8 years ago
[deleted comment]
GoogleCodeExporter commented 8 years ago
The integer ids are mapped internally in jake2.gwt.client.WebGLAdapter, so 
there should not be a problem 
with texture ids.

To enable glError output, uncomment the content of WebGLAdapter.checkError().

We enforce 2^n sizes for all images except the text and the sky box by 
rendering the original image to a 
canvas with a 2^n size. We do not use any other filter than NEAREST and LINEAR. 

We do not set any wrap modes, this may be a problem if the default is 
different. But then, it should only 
matter for NPOT textures....

Original comment by stefan.haustein on 7 Apr 2010 at 10:15

GoogleCodeExporter commented 8 years ago
If you're resampling the textures to have power-of-two dimensions and use 
either NEAREST or LINEAR filtering 
then everything should work. The default wrap modes are the same as in desktop 
GL (REPEAT for both S and T).

The OpenGL ES 2.0 semantics for NPOT textures, which Gregg's command buffer 
code correctly implements, are 
that you must (a) use one of the non-mipmapped filters (which you're already 
doing) and (b) set the S and T 
texture repeat modes to CLAMP_TO_EDGE, or you will get black when sampling the 
texture.

Original comment by kbr@chromium.org on 7 Apr 2010 at 10:42

GoogleCodeExporter commented 8 years ago
I get INVALID_VALUE in all of the following cases (I changed the code locally 
to only use LINEAR and CLAMP_TO_EDGE):
- texParameterf(TEXTURE_2D, TEXTURE_MIN_FILTER, LINEAR)
- texParameterf(TEXTURE_2D, TEXTURE_MAG_FILTER, LINEAR)
- texParameterf(TEXTURE_2D, TEXTURE_WRAP_S, CLAMP_TO_EDGE)
- texParameterf(TEXTURE_2D, TEXTURE_WRAP_T, CLAMP_TO_EDGE)

So I just turned those off to see if that helps.

The errors go away, but then I get INVALID_OPERATION for many texImage2D() 
calls. This turns out to be a weird Quake-ism that uses strange values 
for the 'internalFormat' parameter. Changing them all to RGBA (to match the 
'format' parameter) fixes that problem.

Which leaves me with a large number of INVALID_FRAMEBUFFER_OPERATION errors in 
drawArrays(TRIANGLE_FAN, 0, 4), which seems odd, since the 
buffers appear to be bound and filled in correctly (we know the logic's fine 
for regular GL, so I doubt it's a trivial logic bug).

Thoughts?

Original comment by joelgwebber on 8 Apr 2010 at 2:36

GoogleCodeExporter commented 8 years ago
I also added the webgl-debug.js script (modified to echo *all* gl calls even if 
they don't error). To enable it, just 
add "<script src='webgl-debug.js'></script>" to GwtQuake.html above the 
.nocache.js script tag. It's a lot of log 
spew, but helps immensely in tracking this stuff down.

Original comment by joelgwebber on 8 Apr 2010 at 2:37

GoogleCodeExporter commented 8 years ago
I added a call to checkFramebufferStatus() just before the drawArrays() call, 
and got 
GL_FRAMEBUFFER_UNSUPPORTED. Unfortunately, it's not obvious to me what 
precisely it's complaining about. I 
changed all the texture formats to RGBA, so this is slightly surprising.

Original comment by joelgwebber on 8 Apr 2010 at 3:41

GoogleCodeExporter commented 8 years ago
In the GLES2 command buffer code, the texParameterf calls will fail with 
INVALID_VALUE if there is no texture 
bound to the target on the current texture unit. Double-check that in the 
quake2 code.

It's only possible for GL_FRAMEBUFFER_UNSUPPORTED to be raised if you are 
allocating FBOs yourself. Does the 
quake2 code do this? I would be surprised. If it does, then the combination of 
color+depth buffers should be well 
supported, but it's possible that allocating a stencil buffer won't work.

Original comment by kbr@chromium.org on 8 Apr 2010 at 6:46

GoogleCodeExporter commented 8 years ago
I'll look more carefully into the sequence of events that leads to those 
INVALID_VALUEs. It may indeed be an 
unbound texture. At a guess, it doesn't sound like this should causing the 
framebuffer problems, though.

I don't see any code that remotely touches FBOs or stencils (I don't even think 
they were widely supported when 
this code was originaly written). Any other ideas on what could trip a 
GL_FRAMEBUFFER_UNSUPPORTED? I've got a 
local Chrome build I could sync if there's a straightforward way of debugging 
the command buffers.

Original comment by joelgwebber on 8 Apr 2010 at 9:30

GoogleCodeExporter commented 8 years ago
I still have not fully fixed my setup.... Do you have a screenshot of the 
problem? Does the problem still occur 
when switching off light maps or texture loading?

Original comment by stefan.haustein on 8 Apr 2010 at 10:45

GoogleCodeExporter commented 8 years ago
If the quake2 code manages to somehow break the default back buffer (such as by 
issuing a texImage2D call 
with no bound texture) then that could cause this error to be reported. Come to 
think of it, that is probably 
what is going on, though I thought that the WebGL code should catch this. When 
WebGL switches to using the 
command buffer code's default back buffer this will be guarded against more 
securely.

It is fairly straightforward to debug the command buffer code. On Mac or Linux, 
run with --gpu-startup-
dialog and attach to the GPU process when it is launched. (Must run Chrome from 
the command line to see the 
PID.) On Windows I haven't had much luck with the debugging flags and usually 
added a 20 second delay to 
the startup of the GPU process, which is awful. At that point you can debug the 
service side code.

Original comment by kbr@chromium.org on 8 Apr 2010 at 10:46

GoogleCodeExporter commented 8 years ago
Still digging (haven't had time today to get Chrome debugging -- will try over 
the weekend).

So far, I've tried disabling just about every texture call, and I'm still 
getting INVALID_FRAMEBUFFER_OPERATION on 
the first call to glClear(DEPTH_BUFFER_BIT). I don't see anything wrong with 
that call, and if I remove glClear() it 
just pushes the error further out. More to come...

Original comment by joelgwebber on 9 Apr 2010 at 9:59

GoogleCodeExporter commented 8 years ago
It sounds like the WebGL internal frame buffer object is messed up even before 
getting into user level code. Are 
fixes for any other outstanding issues already checked into the quake2-gwt-port 
directory? I can take an update 
here and try to debug further.

Original comment by kbr@chromium.org on 9 Apr 2010 at 10:14

GoogleCodeExporter commented 8 years ago
I just pushed a series of changes that eliminate most of the webgl errors 
before all the INVALID_FRAMEBUFFER_OPERATION stuff shows 
up:
- Removed glEnable/Disable(GL_ALPHA_TEST) calls
- Changed internal/texel formats to be the same in calls to glTexImage2D()
- glEnable(GL_TEXTURE_2D) in the adapter no longer actually calls glEnable()
- Changed all min/mag filter tex parameters to be non-mipmap

This still leaves a couple of errors when running under the sandbox -- at 
times, it doesn't seem to like:
- texParameterf(TEXTURE_2D, TEXTURE_MIN_FILTER, LINEAR)
- texParameterf(TEXTURE_2D, TEXTURE_WRAP_S, REPEAT)

Disabling these makes no difference, however, and also seems to break 
background rendering, so I left them in for now.

Strangely, I'm also finding that I can't even run the following simple example 
under the sandbox (on mac):
  http://learningwebgl.com/lessons/lesson01/index.html
It renders nothing, and I get the following on the console:
  [webgraphicscontext3d_command_buffer_impl.cc(163)]
  Not implemented reached in virtual void WebGraphicsContext3DCommandBufferImpl::reshape(int, int)

Original comment by joelgwebber on 10 Apr 2010 at 2:31

GoogleCodeExporter commented 8 years ago
I can't reproduce the failure with 
http://learningwebgl.com/lessons/lesson01/index.html on Mac OS X with the 
top of tree Chromium. Tested on both Mac OS X 10.5 and 10.6 with download from 
http://build.chromium.org/buildbot/continuous/mac/LATEST/ as well as with a 
debug build on 10.6.

Original comment by kbr@chromium.org on 15 Apr 2010 at 1:17

GoogleCodeExporter commented 8 years ago
Hmm... I've just updated my machine (MBPro), rebooted, and updated Chromium 
from 44190 to 44643 (latest as 
of Apr 15), and I still can't get the WebGL sample to work (I just get a white 
rectangle). It works on my Linux 
desktop, though, so I'll continue debugging there for now.

Original comment by joelgwebber on 15 Apr 2010 at 2:57

GoogleCodeExporter commented 8 years ago
A bit closer. Don't know what's wrong with my Mac, but I'm able to debug it 
fine on my Linux box. It's fairly close to 
working properly now. If I run from the current repo with no changes, 
everything works except the menus (which render 
blank).

The remaining gl errors are for:
- Various GL point extension stuff, which I will remove.
- texParameter(MIN_FILTER, LINEAR) -- not clear why this would ever be an error 
(MIP is never specified)
- texParameter(WRAP, REPEAT) -- this should only be called on pow-2 textures, 
but changing it to CLAMP doesn't help.

I think the above problems might cropping up because the texture is not bound 
properly when they're called -- I'll have to 
investigate that further.

Original comment by joelgwebber on 16 Apr 2010 at 1:01

GoogleCodeExporter commented 8 years ago
Chrome did not like texture sizes that are not power of 2. Made sure all 
textures are resized to 2^n (w=h for 
mipmap textures), enabled mipmaps.

Original comment by stefan.haustein on 13 May 2010 at 3:23