shunsukesaito / PIFu

This repository contains the code for the paper "PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization"
https://shunsukesaito.github.io/PIFu/

Headless rendering issue #49

Closed by RohanChacko 3 years ago

RohanChacko commented 4 years ago

Hi, I am running the data generation code (render_data.py) on a Slurm-managed cluster and get the following error:

File "/.../PIFu/lib/renderer/gl/glcontext.py", line 110, in create_opengl_context
    egl.eglInitialize(egl_display, pointer(major), pointer(minor))
  File "/.../lib/python3.8/site-packages/OpenGL/platform/baseplatform.py", line 415, in __call__
    return self( *args, **named )
  File "/.../lib/python3.8/site-packages/OpenGL/error.py", line 230, in glCheckError
    raise self._errorClass(
OpenGL.raw.EGL._errors.EGLError: EGLError(
        err = EGL_BAD_ACCESS,
        baseOperation = eglInitialize,
        cArguments = (
                <OpenGL._opaque.EGLDisplay_pointer object at 0x14b10fbd64c0>,
                <OpenGL.arrays.arraydatatype.LP_c_int object at 0x14b10cdf6f40>,
                <OpenGL.arrays.arraydatatype.LP_c_int object at 0x14b10cc44140>,
        ),
        result = 0
)

The code sometimes runs, but most of the time it fails with the error above. Is there a known reason for this? Digging a bit into PyOpenGL, I found that it tries to access /dev/dri/renderD*. Is there a way to specify which egl_display to use? Could it be caused by the GPUs being shared across jobs?
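One way to probe the render-node hypothesis is to list the nodes and check which ones the current job is allowed to open. The following is a minimal diagnostic sketch using only the standard library. The helper name `list_render_nodes` is hypothetical and not part of PIFu or PyOpenGL, and it only reports file permissions; it does not show whether EGL would actually initialize on a given device.

```python
import glob
import os


def list_render_nodes(pattern="/dev/dri/renderD*"):
    """Return (path, accessible) for each DRM render node matching `pattern`.

    Hypothetical helper for diagnosis only: an accessible node does not
    guarantee that eglInitialize will succeed on it.
    """
    nodes = []
    for path in sorted(glob.glob(pattern)):
        # The job needs both read and write access to open a render node.
        accessible = os.access(path, os.R_OK | os.W_OK)
        nodes.append((path, accessible))
    return nodes


if __name__ == "__main__":
    for path, accessible in list_render_nodes():
        status = "accessible" if accessible else "permission denied"
        print(f"{path}: {status}")
```

Running this inside a job allocation and comparing it with the output on a login node may show whether some devices are visible but not openable, which would be consistent with the intermittent EGL_BAD_ACCESS.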

Other info

PyOpenGL ver - 3.1.5
Ubuntu 16.04.3
NVIDIA GTX 1080 Ti
NVIDIA Driver Version 440.95.01
CUDA Version: 10.2

[UPDATE]

The following code seems to resolve the issue:

import OpenGL.EGL as egl  # same `egl` alias that glcontext.py uses
from OpenGL import error
from OpenGL.EGL.EXT.device_base import egl_get_devices
from OpenGL.raw.EGL.EXT.platform_device import EGL_PLATFORM_DEVICE_EXT

def create_initialized_headless_egl_display():
  """Creates an initialized EGL display directly on a device."""
  # Try every EGL device in turn and return the first one that initializes.
  for device in egl_get_devices():
    display = egl.eglGetPlatformDisplayEXT(EGL_PLATFORM_DEVICE_EXT, device, None)

    if display != egl.EGL_NO_DISPLAY and egl.eglGetError() == egl.EGL_SUCCESS:
      # `eglInitialize` may or may not raise an exception on failure depending
      # on how PyOpenGL is configured. We therefore catch a `GLError` and also
      # manually check the output of `eglGetError()` here.
      try:
        initialized = egl.eglInitialize(display, None, None)
      except error.GLError:
        # This device is unavailable (e.g. EGL_BAD_ACCESS); try the next one.
        pass
      else:
        if initialized == egl.EGL_TRUE and egl.eglGetError() == egl.EGL_SUCCESS:
          return display
  return egl.EGL_NO_DISPLAY

egl_display = create_initialized_headless_egl_display()
if egl_display == egl.EGL_NO_DISPLAY:
  raise ImportError('Cannot initialize a headless EGL display.')

Taken from here, which was referenced in this issue.

Please let me know if this can work as a permanent fix.
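On the question of choosing a specific display: since the function iterates over the devices returned by `egl_get_devices()`, one possible approach is to filter that list before the loop. The sketch below is an assumption, not something from the PR. The helper `select_devices` and the environment variable name `EGL_DEVICE_ID` are hypothetical names chosen for illustration, and EGL device indices are not guaranteed to match CUDA device indices.

```python
import os


def select_devices(devices, env_var="EGL_DEVICE_ID", environ=None):
    """Return the devices to try, optionally restricted to a single index.

    `env_var` is a hypothetical variable name used only in this sketch.
    If it is unset, all devices are returned in their original order.
    """
    environ = os.environ if environ is None else environ
    devices = list(devices)
    raw = environ.get(env_var)
    if raw is None:
        return devices
    index = int(raw)
    if not 0 <= index < len(devices):
        raise RuntimeError(
            f"{env_var}={index} is out of range for {len(devices)} EGL device(s)."
        )
    return [devices[index]]
```

With this helper, the loop header `for device in egl_get_devices():` would become `for device in select_devices(egl_get_devices()):`, and the fallback over all devices still applies when the variable is unset.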

shunsukesaito commented 4 years ago

Can you send a pull request so that I can check if it can be merged?

RohanChacko commented 4 years ago

Submitted PR #50. It works with PyOpenGL 3.1.5; the conda version of the package is older.

RohanChacko commented 3 years ago

Let me know if there is any issue with merging the PR into master.

shunsukesaito commented 3 years ago

Just merged the PR. Thanks a lot!!