ntoronto / pict3d

3-dimensional picts
GNU Lesser General Public License v3.0

win7 glGetString issue #9

Closed pafalium closed 9 years ago

pafalium commented 9 years ago

I'm trying to run pict3d's tests on Windows 7, but I'm getting either "could not obtain at least an OpenGL 30 context (pict3d-legacy-contexts? #f)" or "OpenGL procedure not available: glGetString" errors. I've tried running spaceship.rkt, depth-range.rkt, and triangle-surface.rkt.

Log messages:

pict3d: pict3d: exception querying canvas OpenGL context version (pict3d-legacy-contexts? #f): (exn:fail "OpenGL procedure not available: glGetString" #)

I'm using pict3d's version from "31/01/2015, 20:26:49".

Tried this code in pict3d/private/gl/untyped-context.rkt just to check that I could reproduce it:

;; from (get-master-gl-context/frame legacy?)
(define config (new gl-config%))
(send config set-legacy? #f)
(define frame (new frame%
                 [label "Master GL context frame"]
                 [width   master-context-max-width]
                 [height  master-context-max-height]
                 [min-width   master-context-max-width]
                 [min-height  master-context-max-height]
                 [stretchable-width #f]
                 [stretchable-height #f]))
(define canvas (new canvas% [parent frame] [style '(gl no-autoclear)] [gl-config config]))
(send frame show #t)
(send frame show #f)

(send canvas with-gl-context (thunk (gl-version)))
; Got an "OpenGL procedure not available: glGetString" error here

I tried using RacketGL recently, so I thought about running a similar code snippet with it. Everything goes fine. Here's the code:

(define config (new gl-config%))
(define frame (new frame% 
                   [label "racketGL minimal example"]
                   [width 4096] 
                   [height 4096]
                   [min-width 4096]
                   [min-height 4096]
                   [stretchable-width #f]
                   [stretchable-height #f]))
(define canvas (new canvas% [parent frame] [style '(gl no-autoclear)] [gl-config config]))
(send frame show #t)
(send frame show #f)

(send canvas with-gl-context (thunk (gl-version)))
;; The returned value was '(4 4 12874)

ntoronto commented 9 years ago

It looks like the salient difference is that the first example has (send config set-legacy? #f) but the second doesn't. Can you try the following program?

#lang racket
(require pict3d)
(pict3d-legacy-contexts? #t)
(sphere '(0 0 0) 1/2)

I'm not sure what the permanent fix should be if it works, though.

pafalium commented 9 years ago

I tried the code but I continue to get the same error.

Tried to change opengl/untyped.rkt to resemble current racketgl's rgl.rkt with no luck (swapped the files... noob tinkering :s ).

Tried to add (send config set-legacy? #f) to the racketGL code. It continued working. I also checked what glGetString was returning. It returned "4.4.12874 Compatibility Profile Context 14.100.0.0" with both legacy? values.

I'm going to try to call other gl* functions and see what happens... The result was "OpenGL procedure not available: gl*" (I tried glClearColor and glViewport).

ntoronto commented 9 years ago

Oh, I see what part of the problem is. Racket's OpenGL on Windows ignores the legacy? flag. Still, with an OpenGL 4.4 compatibility context, you should definitely have glViewport.

Do you use OpenGL for anything else on that computer?

There might be a problem with how RacketGL and typed/opengl get API bindings. Can you try the following program?

#lang racket

(require racket/gui
         sgl/gl)

(define frame (new frame% [label "Test"] [width 400] [height 400]))
(define canvas (new canvas% [parent frame] [style '(gl no-autoclear)]))
(send frame show #t)

(define ctxt (send (send canvas get-dc) get-gl-context))
(send ctxt call-as-current
      (λ () (with-handlers ([exn?  (λ (e) e)])
              (glGetString GL_VERSION))))

(send frame show #f)

It outputs "3.0 Mesa 10.1.3" on my system.

Feel free to submit a Racket bug report about the legacy? flag being ignored. Windows OpenGL implementations tend to be good about returning high-version compatibility contexts (unlike Mac, which returns only version 2.1). But we should be able to get core contexts when we want them.

pafalium commented 9 years ago

The output was "4.4.12874 Compatibility Profile Context 14.100.0.0".

> Do you use OpenGL for anything else on that computer?

Yes. Some of them were school assignments. They were written in C/C++ and used freeGLUT + GLEW. They were set up to create an OpenGL 3.3 Core Profile context.

> There might be a problem with how RacketGL and typed/opengl get API bindings.

I think I get what you are saying. All canvases come from racket/gui, so I'm assuming the contexts are all being created in the same way. The only thing that changes is the way the OpenGL bindings are obtained (racketGL, typed/opengl, sgl).

ntoronto commented 9 years ago

That's right. I'm trying to determine whether the sgl/gl module does something significantly differently than typed/opengl (which is supplied by the "pict3d" package).

What happens when you replace (glGetString GL_VERSION) with something else, such as (glViewport 0 0 400 400) or (glClearColor 0 0 0 0)? They should just return (void) (i.e. nothing is printed in the REPL).

pafalium commented 9 years ago

They both return (void) as expected.

ntoronto commented 9 years ago

Looks like typed/opengl may not be getting OpenGL procedures lazily enough.

Just to be doubly sure: if you replace sgl/gl with typed/opengl, then the program always returns an exception, right?

pafalium commented 9 years ago

Yep, after replacing sgl/gl with typed/opengl, the program always returns an exception. Tested with glGetString, glViewport and glClearColor. The exception is (exn:fail "OpenGL procedure not available: gl*" #<continuation-mark-set>).

ghost commented 9 years ago

Note that pafa is not the only one having this issue, it seems. I am, for the time being, running a Windows 8 machine. My return for the program sent to pafa before was "4.4.0", and it always gives the same exceptions with typed/opengl but not sgl/gl as well. I cannot be nearly as helpful as him, but am willing to give information if it would help identify the issue.

If you'd like, I could see if the same issue happens on Debian (on the same machine) or some other distro, being that I haven't had a chance to install Linux on this machine yet. Of course, I'd just do FreeBSD, but it isn't dual-boot friendly, and I must always have one Windows dual-boot machine...

ntoronto commented 9 years ago

I think I found out what the problem is. On Windows, OpenGL <= 1.1 functions such as glGetString, glViewport, etc., are supposed to be loaded using GetProcAddress, and everything else is supposed to be loaded using wglGetProcAddress. Further, wglGetProcAddress MUST return NULL if asked for an OpenGL <= 1.1 function.

But wglGetProcAddress just asks the graphics driver for the functions. The driver on my Windows test machine apparently returns OpenGL <= 1.1 functions if asked for them, which is why I haven't seen this error.
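The two-step lookup described above can be sketched in plain Racket so that it runs on any OS. This is only an illustration under stated assumptions: make-fake-lookup and lookup-gl-procedure are hypothetical names, the two lookup arguments merely stand in for wglGetProcAddress and GetProcAddress, and none of this is the code in typed/opengl.

```racket
#lang racket/base

;; Hypothetical stand-in for a Windows lookup: maps a function name to a
;; procedure, or returns #f when the name is unknown (like a NULL result).
(define (make-fake-lookup table)
  (λ (name) (hash-ref table name #f)))

;; Ask the driver-side lookup first (the wglGetProcAddress role). If it
;; returns #f, as it should for OpenGL <= 1.1 functions, fall back to the
;; DLL export lookup (the GetProcAddress role).
(define (lookup-gl-procedure name wgl-lookup dll-lookup)
  (or (wgl-lookup name)
      (dll-lookup name)
      (error 'lookup-gl-procedure
             "OpenGL procedure not available: ~a" name)))

;; A driver that follows the rule: it knows only post-1.1 functions.
(define wgl (make-fake-lookup
             (hash "glPrimitiveRestartIndex" (λ (i) (void)))))
;; The DLL exports only the OpenGL <= 1.1 functions.
(define dll (make-fake-lookup
             (hash "glGetString" (λ (what) "fake version string"))))

((lookup-gl-procedure "glGetString" wgl dll) 'version)
```

With a driver that also answers for OpenGL <= 1.1 names, the first lookup would succeed and the fallback would never be exercised, which is consistent with the error not showing up on every machine.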

ntoronto commented 9 years ago

@pafalium, @Pastaf: Can you try it now?

pafalium commented 9 years ago

I've already tried it.

Now it works. I tried both the program from earlier and some examples from pict3d/tests.

[screenshot: cull] cull.rkt: It seems that the frustum culling is only leaving the lower left quadrant "unculled". Is this the intended result?

[screenshot: depth-range]

[screenshot: fractal-trees]

[screenshot: spaceship] spaceship.rkt: As I moved the camera, some of the glowing objects disappeared and reappeared.

[screenshot: spheres-on-canvas] spheres-on-canvas.rkt: I tested it on another win7 machine with similar results.

[screenshot: triangle-surface]

ghost commented 9 years ago

Haven't had time to check much, but seems to work so far. If something fails, I'll let you know once I have a chance. Right now though, I have to go.

ntoronto commented 9 years ago

Thanks a ton, guys.

@pafalium: Thanks for the screenshots! What graphics card do you have?

The triangle surface and trees look right.

Obviously spheres-on-canvas is messed up badly. What happens when you resize the window?

Can you please evaluate (run-anim) in the REPL after running the spaceship test and report whether it has the same problems as spheres-on-canvas?

The depth-range test should look like this: [screenshot: depth-range]

The frustum culling test should look like this (though I've put the picts in a list to make them all visible at once): [screenshot: frustum-culling]

The spaceship test should look like this: [screenshot: spaceship]

It looks like everything that's messed up has to do with drawing impostors: shapes that are sent to the graphics card as viewer-facing rectangles, and then ray-traced by a fragment shader. Ellipsoids and point lights are drawn that way. For example, the spaceship is missing two ellipsoids on the fuselage and left wing, and also missing a bunch of tiny point lights along the struts that join the wings to the fuselage. (The point lights' effect on the fuselage is slight, so it's hard to see.)
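For readers unfamiliar with impostors, the per-fragment work amounts to a ray-shape intersection test. The sketch below shows the sphere case in plain Racket rather than GLSL. It is an assumption about the general technique, not pict3d's shader code (which does not appear in this thread), and ray-sphere-hit, dot and sub are hypothetical helper names.

```racket
#lang racket/base

;; Vectors are lists of three numbers.
(define (dot a b) (apply + (map * a b)))
(define (sub a b) (map - a b))

;; Intersect the ray origin + t*dir (dir must be unit length) with a sphere.
;; Returns the nearest non-negative hit distance t, or #f for a miss.
(define (ray-sphere-hit origin dir center radius)
  (define oc (sub origin center))
  (define b (dot oc dir))
  (define c (- (dot oc oc) (* radius radius)))
  (define disc (- (* b b) c))          ; discriminant of t^2 + 2bt + c = 0
  (cond
    [(< disc 0) #f]                    ; the ray misses the sphere
    [else
     (define s (sqrt disc))
     (define t0 (- (- b) s))           ; nearer root
     (define t1 (+ (- b) s))           ; farther root
     (cond [(>= t0 0) t0]
           [(>= t1 0) t1]              ; the origin is inside the sphere
           [else #f])]))               ; the sphere is behind the ray

;; A viewer at the origin looking down -z at a unit sphere 5 units away.
(ray-sphere-hit '(0 0 0) '(0 0 -1) '(0 0 -5) 1)   ; => 4
;; The same ray against a sphere that is off to the side.
(ray-sphere-hit '(0 0 0) '(0 0 -1) '(3 0 -5) 1)   ; => #f
```

A fragment whose ray misses would be discarded, so the rectangle has to cover the shape's full projected bounds; if those bounds are computed wrongly, parts of the shape are never drawn at all.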

What happens in the frustum culling tests when you click on the picts and use mouse+WASD to look around?

Between these clues and knowing your graphics card, I might be able to figure out what's wrong with my shaders.

pafalium commented 9 years ago

> Thanks for the screenshots! What graphics card do you have?

I have an AMD hd7970m on the system I use the most. I also have an AMD hd4770 on another system. I'll try to use the integrated graphics to see what happens.

> Obviously spheres-on-canvas is messed up badly. What happens when you resize the window?

I started the example with different frame sizes. The area that is correct is resized along with the window.

Here are the results: [screenshot: spheres-grid]

> Can you please evaluate (run-anim) in the REPL after running the spaceship test and report whether it has the same problems as spheres-on-canvas?

I get this: [screenshot: run-anim]

Only an area is being successfully raytraced?

> The frustum culling test should look like this.

I guessed so. :)

> For example, the spaceship is missing two ellipsoids on the fuselage and left wing, and also missing a bunch of tiny point lights along the struts that join the wings to the fuselage.

The ellipsoids only disappeared when I changed the camera. The initial view was the same as yours.

> What happens in the frustum culling tests when you click on the picts and use mouse+WASD to look around?

It looks like only the spheres that touch the lower left are drawn. [screenshot: cull_2]

I'm trying the integrated graphics next.

pafalium commented 9 years ago

Quick update: All the tests give the correct result when using the integrated graphics (Intel Core i7).

ntoronto commented 9 years ago

I found it. It looks like AMD's on-chip floating-point isn't standards-compliant. (That's not terribly surprising.) The bounds for impostors are computed starting this way:

  // view space min and max
  vec3 vmin = vec3(+1.0 / 0.0); // 32-bit +Inf
  vec3 vmax = vec3(-1.0 / 0.0); // 32-bit -Inf

  // clip space min and max
  vec3 cmin = vec3(+1.0 / 0.0);
  vec3 cmax = vec3(-1.0 / 0.0);

If I change -1.0 in that shader to +1.0, I can duplicate your results on the frustum culling test exactly. Apparently, on your AMD card, -1.0/0.0 is evaluated as +inf.

As soon as I find a portable way to get positive and negative infinity values, I'll fix this.
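To see why a wrongly signed infinity produces this symptom, here is a small Racket sketch of the usual min/max accumulation, reduced to one dimension. The function name bounds and the sample points are made up for illustration; the actual shader works on vec3 values and is only partially quoted above.

```racket
#lang racket/base

;; Grow a 1-D bounding interval over xs, starting from the given seeds.
;; Returns (list lo hi).
(define (bounds xs lo-seed hi-seed)
  (define-values (lo hi)
    (for/fold ([lo lo-seed] [hi hi-seed]) ([x (in-list xs)])
      (values (min lo x) (max hi x))))
  (list lo hi))

(define xs '(-0.5 0.25 0.75))

;; Correct seeds: +inf for the minimum, -inf for the maximum.
(bounds xs +inf.0 -inf.0)   ; => '(-0.5 0.75)

;; Seeds as on the affected card, where -1.0/0.0 came out as +inf:
;; the maximum starts at +inf and can never shrink to the real extent.
(bounds xs +inf.0 +inf.0)   ; => '(-0.5 +inf.0)
```

The minimum side still comes out right, so only the maximum edges of each impostor's bounds are wrong.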

ntoronto commented 9 years ago

FWIW, I have no idea whether this will fix the corruption in the spheres-on-canvas test. I couldn't duplicate the pretty colors you're getting on that. I have gotten corruption on window resizing that I've been meaning to track down, though.

ntoronto commented 9 years ago

OK, I think that's fixed it. On to the corruption issue...

pafalium commented 9 years ago

Fix confirmed. :) I don't understand how the corruption was happening but now it's gone.

Strangely enough, it seems to be running slower (though still at interactive speed) when using the hd7970m.

I tried to re-size the window to fill the whole screen to see if there was a performance drop. I didn't notice a difference.

ntoronto commented 9 years ago

Awesome!

The corruption you got was probably weird floating-point artifacts that I couldn't reproduce just by using +1.0/0.0 instead of -1.0/0.0. The corruption I still get has to do with reading uninitialized parts of a bloom buffer.

It looks like your card is good at running complicated fragment shaders. I'll bet your i7 struggles a bit when you maximize the window. Mine does.

OpenGL calls on Windows machines currently use wglGetProcAddress to get a function pointer on every call. They do that because the actual function pointer can be different depending on the GL context: its version, its pixel format, etc. I think I can speed up API calls by caching function pointers in a table keyed on the GL context and function name. I'll get to it later today.
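A minimal sketch of that kind of cache, assuming a table keyed on a (context . name) pair. The names proc-cache, cached-lookup and counting-lookup are hypothetical, and slow-lookup is a stand-in for the real per-call query; the change that actually went into the package may be organized differently.

```racket
#lang racket/base

;; Resolved procedures, keyed on (cons context function-name).
(define proc-cache (make-hash))

;; Return the cached procedure for this context and name, calling
;; slow-lookup (a stand-in for the expensive query) only on a miss.
(define (cached-lookup ctx name slow-lookup)
  (hash-ref! proc-cache
             (cons ctx name)
             (λ () (slow-lookup ctx name))))

;; Count how often the expensive path is actually taken.
(define lookups 0)
(define (counting-lookup ctx name)
  (set! lookups (add1 lookups))
  (λ args (void)))

(void (cached-lookup 'ctx-a "glViewport" counting-lookup))
(void (cached-lookup 'ctx-a "glViewport" counting-lookup)) ; cache hit
(void (cached-lookup 'ctx-b "glViewport" counting-lookup)) ; other context
lookups   ; => 2
```

Keying on the context as well as the name keeps the property mentioned above: two contexts may legitimately resolve the same name to different pointers. A real implementation would probably also want entries to go away when a context is destroyed, for example by using a weak table.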

pafalium commented 9 years ago

Yes, the i7 struggles when the window is maximized.

For comparison, the minimum (when resizing the frame to smaller sizes) real time: profile output is around 30 when running on the i7 and around 60 when on the hd7970m.

ntoronto commented 9 years ago

I've just pushed a change that might speed things up significantly on Windows.

I simulated the Windows loading protocol on my Linux machine and found that each call to glPrimitiveRestartIndex took 0.01ms. A delay like that allows at most 1667 API calls per frame at 60 FPS. Caching function pointer lookups by context brought the time down to 0.00065ms, or 25600 calls per frame at 60 FPS. My Linux machine's slimmer loading protocol allows 66 thousand calls per frame, but if I need more than 25 thousand API calls, I'm using OpenGL wrongly anyway.
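The per-frame budgets quoted above follow from dividing the frame time at 60 FPS by the cost of one call. A quick check in Racket, using exact rationals to avoid rounding noise (calls-per-frame is a hypothetical helper name):

```racket
#lang racket/base

;; Frame budget at 60 FPS in milliseconds, as an exact rational (50/3).
(define frame-ms (/ 1000 60))

;; How many calls of the given per-call cost (in ms) fit into one frame?
(define (calls-per-frame ms-per-call)
  (round (/ frame-ms ms-per-call)))

(calls-per-frame 1/100)       ; 0.01 ms per call    => 1667
(calls-per-frame 65/100000)   ; 0.00065 ms per call => 25641
```

The second figure is the 25600 mentioned in the comment, before rounding.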

The latest changes also require all OpenGL API calls to be made while an OpenGL context is active, just like on Windows, regardless of OS, to help keep my code portable in the future. It hasn't caused any problems yet.

pafalium commented 9 years ago

I don't know if there was a speedup in my case. The performance continues to be higher when running on the i7. But again, I don't think the graphics card itself is the problem, as I continue to observe constant speed when maximizing the window with the hd7970m. Maybe the issue is related to the driver.

ntoronto commented 9 years ago

That test was intended to saturate modern AGP buses. The graphics chip in your i7 is integrated, so it's less of an issue on that system.

So my best guess now that API latency is taken care of is that it's just too much to pump through the pipe. Try reducing the number of spheres. If you get a dramatic speedup by reducing the number a small amount, it's probably a memory bandwidth issue.

I do intend to reduce the amount of memory bandwidth needed. About half of it is transformation matrices, which I think I can reduce to 1/4 the size using instancing.

pafalium commented 9 years ago

I reduced the number of spheres to 7000, and the "real time:" value dropped to between 16 and 30 with the window maximized. All right, it's a memory bandwidth issue.

ntoronto commented 9 years ago

FWIW, memory bandwidth is something I can improve, and I'll be working on it.

For now, I'm going to close this issue. Thanks for all your help!

pafalium commented 9 years ago

Glad I could help. :)

ntoronto commented 9 years ago

I've reduced memory bandwidth for drawing spheres by more than 75% using geometry shaders, when OpenGL version 3.2 or greater is detected. I've gotten my i7 up to 280,000 spheres at 60 FPS. (Though in a tiny window to reduce fragment shader cost. :D) Can you try "spheres-on-canvas.rkt" again on your hd7970m?

ghost commented 9 years ago

I also got up to 280,000 spheres, fully maximized, before it started to fall off.

ntoronto commented 9 years ago

Awesome, thanks!

pafalium commented 9 years ago

I tried "spheres-on-canvas.rkt" and got these results:

So, same here.