NVlabs / nvdiffrast

Nvdiffrast - Modular Primitives for High-Performance Differentiable Rendering

RasterizeGLContext return killed #40

Closed haoxurt closed 2 years ago

haoxurt commented 3 years ago

Hi, my environment is Ubuntu 18.04, CUDA 10.1, NVIDIA driver 430.46, and torch 1.6. The process gets killed ("Killed") when I create a RasterizeGLContext. I have installed all dependencies such as libglvnd0. Could this be caused by the old NVIDIA driver version?

s-laine commented 3 years ago

Hi @haoxurt, from this information I cannot say much about what might be wrong. Can you try adding dr.set_log_level(0) to the beginning of your program to get more debug output, and paste the output here?

Have you tried running the samples in the provided Docker environment? If they don't run there, that would certainly point to a problem with the graphics driver.

TimmmYang commented 3 years ago

Hi, I have a similar problem with the RasterizeGLContext class. I get a segmentation fault when I use this class. I tried adding dr.set_log_level(0) at the beginning of the program and got:

[I glutil.cpp:322] Creating GL context for Cuda device 0
[I glutil.cpp:370] EGL 5.1 OpenGL context created (disp: 0x000055747c630be0, ctx: 0x000055747c645781)
[I rasterize.cpp:103] OpenGL version reported as 4.6

My environment is Ubuntu 18.04, CUDA 11.4, Driver Version 470.57.02, torch 1.6.0.

s-laine commented 3 years ago

Hi @TimmmYang, and sorry for the delay. Is this inside the Docker environment or in some other configuration? I cannot reproduce this so I cannot debug it, but it looks like the context is successfully created which makes this really puzzling. It would be nice to know where the crash occurs, so if you're adventurous you could try pinpointing the failing function call by adding logging outputs in rasterize.cpp after the point where the OpenGL version is reported. Depending on which call exactly causes the segmentation fault, it may be possible to find a workaround.
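As a first step before editing rasterize.cpp, Python's standard faulthandler module can at least show which Python line was executing when the segmentation fault occurred. The following is a minimal sketch, not part of nvdiffrast: the log file name and the create_context() helper are hypothetical, and the traceback only covers Python frames, so it complements rather than replaces logging inside the C++ code.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location; any writable path works.
TRACE_PATH = os.path.join(tempfile.gettempdir(), "nvdiffrast_crash_trace.log")

# The file must stay open for as long as faulthandler is enabled.
trace_file = open(TRACE_PATH, "w")

# On SIGSEGV (and similar fatal signals), dump the Python traceback
# of all threads into trace_file before the process dies.
faulthandler.enable(file=trace_file, all_threads=True)


def create_context():
    """Hypothetical helper: create the GL context with verbose logging.

    nvdiffrast is imported lazily so that the fault handler is already
    installed before any native code from the extension is loaded.
    """
    import nvdiffrast.torch as dr

    dr.set_log_level(0)
    return dr.RasterizeGLContext()
```

Calling create_context() at the point where the crash occurs, and then inspecting the log file after the process dies, should show the Python call stack leading to the failing native call.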

duyguceylan commented 2 years ago

I have a similar problem, which happens when I try to create a GL context as a member of a class (things work fine if the context is created at the global level). The crash seems to happen inside the RasterizeGLStateWrapper() function.

s-laine commented 2 years ago

This sounds really strange. Would you be able to give a minimal repro for this so I can root cause what fails and where?

duyguceylan commented 2 years ago

If I do this:

class MeshRenderer():
    """
    Parameters:
    """

    def __init__(self, device):
        super().__init__(device)
        dr.set_log_level(0)
        self.glcontex = dr.RasterizeGLContext()

I get a crash. If I move the line glcontex = dr.RasterizeGLContext() outside the class and make glcontex a global variable, things work fine.

s-laine commented 2 years ago

This is not a complete program and does not illustrate the bug. For reference, the following works without errors for me:

import nvdiffrast.torch as dr
class MeshRenderer():
    """
    Parameters:
    """

    def __init__(self, device):
        #super().__init__(device)
        dr.set_log_level(0)
        self.glcontex = dr.RasterizeGLContext()

a = MeshRenderer('cuda:0')

Note that I had to comment out the super() initialization, as there is no parent class whose constructor would accept a parameter.
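The super() issue can be reproduced in plain Python, without nvdiffrast. The sketch below uses hypothetical class names (BrokenRenderer, BaseRenderer, FixedRenderer) and only illustrates why the original snippet cannot run as posted: with no explicit base class, super() resolves to object, whose __init__ accepts no extra arguments.

```python
class BrokenRenderer:
    def __init__(self, device):
        # No explicit base class, so this calls object.__init__(device),
        # which raises TypeError because object takes no extra arguments.
        super().__init__(device)


class BaseRenderer:
    """Hypothetical parent class that does accept a device argument."""

    def __init__(self, device):
        self.device = device


class FixedRenderer(BaseRenderer):
    def __init__(self, device):
        # Valid here: BaseRenderer.__init__ accepts the device argument.
        super().__init__(device)


def try_construct(cls, device):
    """Return the constructed object, or the exception that was raised."""
    try:
        return cls(device)
    except TypeError as exc:
        return exc


broken_result = try_construct(BrokenRenderer, "cuda:0")
fixed_result = try_construct(FixedRenderer, "cuda:0")
print(type(broken_result).__name__)
print(type(fixed_result).__name__, fixed_result.device)
```

A Python-level TypeError like this is a different failure from the segmentation fault being reported, which is why a complete program is needed to reproduce the actual crash.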