nigels-com / glew

The OpenGL Extension Wrangler Library

Win7/Nvidia/Intel: glewInit() extremely slow #140

Open JPGygax68 opened 7 years ago

JPGygax68 commented 7 years ago

Our code was working fine for years, now suddenly glewInit() takes between 40 and 50+ seconds.

Under the circumstances, we do not suspect that the fault lies with glew, but we are a bit desperate for a way to solve the problem.

We're running a 32-bit application under Windows 7 64-bit. The problem occurs only on one specific machine (though there are identical ones that will inherit the same problem sooner or later), which has dual graphics, Intel integrated and NVidia 970M; our application is configured to use the NVidia hardware via the NVidia control panel.

I'd be grateful for any pointers, otherwise I'd have to invest a lot of time into getting glew to compile under Windows in order to pinpoint the time waster more accurately.

EDIT: this may be totally unrelated, but we also experience apparently random failures of glGenBuffers(), which returns 0s instead of valid buffer ids.

nigels-com commented 7 years ago

That sounds quite unusual.

My initial suggestion is to ensure that the drivers are fully updated, and perhaps try beta drivers if they are available. I would expect this to be quite specific to this particular CPU/GPU/Win7 combo.

JPGygax68 commented 7 years ago

Thanks Nigel! I've tried updating the drivers (both the Intel and NVidia ones), but got no improvement. I could add two more details:

Silly thought: would you consider doing paid consulting work on this? A build that profiles what is going on inside glewInit() might be very helpful.

nigels-com commented 7 years ago

Easter is a four-day weekend in Australia, and I've worked two of the past three weekends. So I'm a bit reluctant to take on projects at the moment.

I do have a 64-bit Win7 laptop, but not with Nvidia graphics.

Does glewinfo exhibit the same inexplicable delay as your application?

JPGygax68 commented 7 years ago

I haven't tried yet - will do so tomorrow, thanks! (Bedtime here in Switzerland :-) )

nigels-com commented 7 years ago

As a quick easy check, try running glewinfo for some of the recent releases 1.11, 1.12, 1.13, 2.0.

JPGygax68 commented 7 years ago

I just tried visualinfo versions 2.0, 1.13, and 1.11 (in that order). They all have the problem.

nigels-com commented 7 years ago

I'd suggest trying some of the WHQL drivers from http://www.geforce.com/drivers

If you can identify working and/or broken versions that would be helpful information for Nvidia to resolve the problem.

JPGygax68 commented 7 years ago

Unfortunately, even the oldest available WHQL driver (375.70) has the problem.

I wonder if the driver really is to blame, though. I do not recall Windows auto-updating the display driver (though it's entirely possible I missed it); the change could have come with another, smaller update.

Could 32-bit vs 64-bit have something to do with it? The OS is 64-bit, glew32.dll is 32-bit (I tried switching to the 64-bit version, but it won't load).

nigels-com commented 7 years ago

Is the application 32-bit only? That does sound a bit unusual nowadays.

JPGygax68 commented 7 years ago

Yes, it is. It's starting to create problems, but so far, they're manageable.

JPGygax68 commented 7 years ago

The problem has gone away on its own.

It happened after I moved the notebook onto my desk and connected it to my KVM switch. Best guess: some level of OpenGL got confused because of bad information regarding the monitors. I don't think I could pinpoint it further.

The problem is no longer reproducible even when I disconnect the external (KVM-switched) monitor.

There is still a delay, though, of about 6 seconds, both with my software and when executing glewinfo. Can you confirm that this remaining delay is normal?

nigels-com commented 7 years ago

Six seconds still sounds exceedingly long. I'll leave this issue open so I can try Win7 with 32-bit glewinfo, although it's Intel graphics, not Nvidia, just as a data point.

JPGygax68 commented 7 years ago

Thank you!

nigels-com commented 7 years ago

Win32

$ time ./glewinfo.exe

real    0m0.179s
user    0m0.000s
sys     0m0.000s

x64

$ time ./glewinfo.exe

real    0m0.192s
user    0m0.000s
sys     0m0.015s
$ cat glewinfo.txt
---------------------------
    GLEW Extension Info
---------------------------

GLEW version 2.0.0
Reporting capabilities of pixelformat 3
Running on a Intel(R) HD Graphics 3000 from Intel
OpenGL version 3.1.0 - Build 9.17.10.4229 is supported

GL_VERSION_1_1:                                                OK
---------------

GL_VERSION_1_2:                                                OK
---------------
  glCopyTexSubImage3D:                                         OK
  glDrawRangeElements:                                         OK
  glTexImage3D:                                                OK
  glTexSubImage3D:                                             OK

GL_VERSION_1_2_1:                                              OK
-----------------

GL_VERSION_1_3:                                                OK
---------------
  glActiveTexture:                                             OK
  glClientActiveTexture:                                       OK
  glCompressedTexImage1D:                                      OK
  glCompressedTexImage2D:                                      OK
  glCompressedTexImage3D:                                      OK
  glCompressedTexSubImage1D:                                   OK
  glCompressedTexSubImage2D:                                   OK
  glCompressedTexSubImage3D:                                   OK
  glGetCompressedTexImage:                                     OK
  glLoadTransposeMatrixd:                                      OK
  glLoadTransposeMatrixf:                                      OK
  glMultTransposeMatrixd:                                      OK

JPGygax68 commented 7 years ago

Thank you! So that means we're still a factor of roughly 20-30 times too slow.

zwcloud commented 7 years ago

In my application on Win10, glewInit() takes about 1200 ms. I think that's still quite slow.

x64

E:\lib\glew-2.0.0\bin\Release\x64>powershell Measure-Command {start-process .\glewinfo.exe -Wait}

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 5
Milliseconds      : 120
Ticks             : 51202725
TotalDays         : 5.92624131944444E-05
TotalHours        : 0.00142229791666667
TotalMinutes      : 0.085337875
TotalSeconds      : 5.1202725
TotalMilliseconds : 5120.2725

glewinfo.txt

---------------------------
    GLEW Extension Info
---------------------------

GLEW version 2.0.0
Reporting capabilities of pixelformat 1
Running on a GeForce GT 550M/PCIe/SSE2 from NVIDIA Corporation
OpenGL version 4.5.0 NVIDIA 384.76 is supported

GL_VERSION_1_1:                                                OK 
---------------

GL_VERSION_1_2:                                                OK 
---------------
  glCopyTexSubImage3D:                                         OK
  glDrawRangeElements:                                         OK
  glTexImage3D:                                                OK
  glTexSubImage3D:                                             OK

GL_VERSION_1_2_1:                                              OK 
-----------------

GL_VERSION_1_3:                                                OK 
---------------
  glActiveTexture:                                             OK
  glClientActiveTexture:                                       OK

nigels-com commented 7 years ago

Is it a laptop?

zwcloud commented 7 years ago

Yes.

983 commented 6 years ago

I also have the problem of a slow glewInit(), but on Windows 8.1 and with a 64-bit application. My laptop has both an Intel and an NVIDIA GPU. Initialization is slow for NVIDIA only:

glewInit() took 2.271872 seconds

GL_VERSION: 4.5.0 NVIDIA 385.54
GL_VENDOR: NVIDIA Corporation
GL_RENDERER: GeForce 840M/PCIe/SSE2
GL_SHADING_LANGUAGE_VERSION: 4.50 NVIDIA
glewInit() took 0.000972 seconds

GL_VERSION: 4.3.0 - Build 10.18.14.4264
GL_VENDOR: Intel
GL_RENDERER: Intel(R) HD Graphics 4400
GL_SHADING_LANGUAGE_VERSION: 4.30 - Build 10.18.14.4264

When using the NVIDIA GPU, wglGetProcAddress takes about 1 millisecond on average per call (up to 3 ms) and it is called 2000 times.
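
As a rough sanity check on those figures (assuming the 1 ms average holds across all lookups), the per-call cost alone would account for most of the 2.27 seconds measured for glewInit() above:

```latex
t_{\text{init}} \approx N_{\text{calls}} \cdot \bar{t}_{\text{call}} = 2000 \times 1\,\text{ms} = 2\,\text{s}
```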

Here is some code to reproduce the issue. Runs in less than a second on Intel GPU or over 10 seconds with NVIDIA GPU.

#include <GL/glut.h>
#include <windows.h>
#include <assert.h>
#include <stdlib.h>
#include <stdio.h>

// Current time in seconds, read from the high-resolution performance counter.
double sec(){
    LARGE_INTEGER t, freq;
    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&t);
    return t.QuadPart / (double)freq.QuadPart;
}

// Enable NVIDIA GPU instead of Intel GPU.
__declspec(dllexport) DWORD NvOptimusEnablement = 0x00000001;

int main(int argc, char **argv){
    // wglGetProcAddress needs a current OpenGL context, so create a window first.
    glutInit(&argc, argv);
    glutInitDisplayMode(GLUT_RGBA | GLUT_DEPTH);
    glutInitWindowSize(640, 480);
    glutCreateWindow("window");

    // Time 10 batches of 1000 lookups of the same entry point.
    for (int k = 0; k < 10; k++){

        double t = sec();

        for (int i = 0; i < 1000; i++){
            // PROC is the function-pointer type wglGetProcAddress returns.
            PROC result = wglGetProcAddress("glGenBuffers");
            assert(NULL != result);
        }

        double dt = sec() - t;
        printf("1000 wglGetProcAddress(\"glGenBuffers\"): %f seconds\n", dt);
    }

    return 0;
}

JPGygax68 commented 5 years ago

I'm sorry to come back with this annoying problem, but it seems it came back in full force to our customer who originally reported the problem.

@nigels-com: would your timetable allow you to do some (paid) support work on this now? EDIT: I'm thinking along the lines of a special build that logs timing info etc. to a predetermined file, so that the customer can just replace the DLL, run the app, and send the log back to us. This would likely take several iterations, of course.
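
One way such a timing build could be approached, sketched under assumptions: wrap each entry-point lookup in a helper that records how long the loader took and appends one line per lookup to a log file. Everything named here (`timed_lookup`, `loader_fn`, `fake_loader`, the tab-separated log format) is hypothetical and not part of GLEW. On Windows the loader would be a thin adapter around `wglGetProcAddress`; a stand-in is used below so the sketch stays portable.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Generic function-pointer type, standing in for what a GL loader returns. */
typedef void (*generic_fn)(void);

/* Simplified loader signature (assumption): name in, pointer or NULL out. */
typedef generic_fn (*loader_fn)(const char *name);

/* Wall-clock time in seconds, via C11 timespec_get. */
static double now_seconds(void) {
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Calls load(name) and appends "name<TAB>milliseconds<TAB>status" to log.
 * Returns whatever the loader returned. */
static generic_fn timed_lookup(loader_fn load, const char *name, FILE *log) {
    assert(load != NULL && name != NULL);
    double start = now_seconds();
    generic_fn fn = load(name);
    double elapsed_ms = (now_seconds() - start) * 1000.0;
    if (log != NULL) {
        fprintf(log, "%s\t%.3f\t%s\n", name, elapsed_ms, fn ? "ok" : "missing");
        fflush(log); /* keep the log intact if the process is killed */
    }
    return fn;
}

/* Stand-in loader for illustration only: knows a single entry point. */
static void fake_gl_function(void) {}
static generic_fn fake_loader(const char *name) {
    return strcmp(name, "glGenBuffers") == 0 ? fake_gl_function : NULL;
}
```

Flushing after every line keeps the log useful even if the customer kills the application during a slow start-up, at the cost of some extra I/O.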

nigels-com commented 5 years ago

Ideally we could identify known "good" and known "bad" Nvidia driver versions, preferably on a discrete GPU, along with details of the laptop model and hardware revision numbers. I do have some contacts that might help bring it to the attention of driver devs (can't promise anything), but collecting a lot of information would help with that, especially if there are known workarounds or known good driver branches.

We have a Windows 10 with GeForce GT1030 setup I can test on, if that's of help.

If you have any official or informal Nvidia channels, that's probably more promising, but I'd still recommend collecting a lot of detailed diagnostic and hardware information. Have you got a sense of how widespread the problem is?

I guess I could do some paid hours, but I'm not confident I can resolve an issue even having the known problematic laptop and/or GPU. I have a full time job, so my time is limited.

One thing that comes to mind is that there is a way (registry key? third-party tool?) to disable the multi-threading optimisations in the Nvidia OpenGL driver. (Or I might be misremembering the Quadro driver that is/was single-threaded by default.) That's a toggle worth trying. I don't recall whether there is internal logic or heuristics there, such as an auto-mode based on call patterns, etc.

Hope it helps.

JPGygax68 commented 5 years ago

Thank you @nigels-com . The problem seems very specific to these dual-GPU notebooks. I'm trying to get the exact model + driver info.

JPGygax68 commented 5 years ago

@nigels-com: I just noticed that glewInit() on my desktop PC has also become slow, I clocked it at around 10s. Unfortunately, I had no luck building glew from source myself (1.5 tag), make in auto fails with:

head: cannot open 'registry/ATI/texture_env_combine3.txt' for reading: No such file or directory

nigels-com commented 5 years ago

Yeah, the code generation hasn't always been so solid on Windows. Here is a link to the downloads for 1.5: https://sourceforge.net/projects/glew/files/glew/1.5.0/

MomoDeve commented 3 years ago

Also stuck with this problem. glewInit() takes around 5 seconds. I also have a laptop with dual GPUs (NVIDIA/Intel).

seijiwjsmith commented 3 years ago

Yeah, this is a problem for me too. I have an NVIDIA/Intel laptop, and glewInit() takes 10 seconds.

sandercox commented 6 months ago

We're running integration tests where our small test app needs to set up an OpenGL context, perform some actions, and quit. When run one at a time, glewInit() is fast, but when many parallel processes call glewInit(), there are moments when glewInit() alone takes 10 seconds. This takes a big chunk out of our total test time. The machines have 32 cores/64 threads, so we test with up to 64 parallel test-case runners.

Is there any way to optimize this by somehow storing the results from glewInit() and, instead of checking again and again, just loading function pointers at fixed addresses or something similar?
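
Persisting raw function-pointer addresses across processes is unlikely to be safe: Microsoft's documentation for `wglGetProcAddress` ties the returned addresses to the current rendering context, and driver DLLs need not load at the same address in every process. A different way to reduce start-up cost, sketched here under assumptions, is to resolve only the entry points a test actually uses, on first use, rather than resolving everything up front. This is not how GLEW works, and all names below (`lazy_entry`, `lazy_get`, `fake_loader`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Generic function-pointer type, standing in for what a GL loader returns. */
typedef void (*generic_fn)(void);

/* Simplified loader signature (assumption): name in, pointer or NULL out. */
typedef generic_fn (*loader_fn)(const char *name);

/* One lazily resolved entry point. */
typedef struct {
    const char *name;  /* e.g. "glGenBuffers" */
    generic_fn  fn;    /* cached pointer; NULL until resolved or if missing */
    int         tried; /* nonzero once the loader has been asked */
} lazy_entry;

/* Resolves the entry on first use and caches the result, including a NULL
 * result, so the loader is asked at most once per entry.
 * Not thread-safe: callers must serialize access to the same entry. */
static generic_fn lazy_get(lazy_entry *entry, loader_fn load) {
    assert(entry != NULL && load != NULL);
    if (!entry->tried) {
        entry->fn = load(entry->name);
        entry->tried = 1;
    }
    return entry->fn;
}

/* Stand-in loader for illustration only: counts how often it is called. */
static int fake_loader_calls = 0;
static void fake_gl_function(void) {}
static generic_fn fake_loader(const char *name) {
    fake_loader_calls++;
    return strcmp(name, "glGenBuffers") == 0 ? fake_gl_function : NULL;
}
```

With this pattern, a test that touches ten entry points pays for ten lookups instead of a couple of thousand. Whether that helps with the contention seen across 64 parallel processes is untested here.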