DLR-RM / AugmentedAutoencoder

Official Code: Implicit 3D Orientation Learning for 6D Object Detection from RGB Images
MIT License

[question] would it be possible to run in a docker container? #19

Closed flugenheimer closed 5 years ago

flugenheimer commented 5 years ago

I was wondering whether this could be run in a Docker container by enabling the GUI in the container: http://wiki.ros.org/docker/Tutorials/GUI

I have played a bit with it, but I am running into some issues - not sure if they are related to being in a docker container or not.

This is the output from running ae_train.py:

128 128 3
[[8, 8], [16, 16], [32, 32], [64, 64]]
(?, 128, 128, 3)
(?, 128, 128, 3)
libGL error: No matching fbConfigs or visuals found
| 0 / 20000 ETA: --:--:--
libGL error: failed to load driver: swrast
Traceback (most recent call last):
  File "ae_train.py", line 160, in <module>
    main()
  File "ae_train.py", line 90, in main
    dataset.get_training_images(dataset_path, args)
  File "/app/AugmentedAutoencoder-master/auto_pose/ae/dataset.py", line 93, in get_training_images
    self.render_training_images()
  File "/app/AugmentedAutoencoder-master/auto_pose/ae/dataset.py", line 245, in render_training_images
    bgr_x, depth_x = self.renderer.render(
  File "/app/AugmentedAutoencoder-master/auto_pose/ae/utils.py", line 15, in decorator
    setattr(self, attribute, function(self))
  File "/app/AugmentedAutoencoder-master/auto_pose/ae/dataset.py", line 75, in renderer
    float(self._kw['vertex_scale'])
  File "/app/AugmentedAutoencoder-master/auto_pose/ae/meshrenderer/meshrenderer_phong.py", line 22, in __init__
    self._fbo = gu.Framebuffer( { GL_COLOR_ATTACHMENT0: gu.Texture(GL_TEXTURE_2D, 1, GL_RGB8, W, H),
  File "/app/AugmentedAutoencoder-master/auto_pose/ae/meshrenderer/gl_utils/texture.py", line 11, in __init__
    glCreateTextures(tex_type, len(self.id), self.id)
  File "src/latebind.pyx", line 32, in OpenGL_accelerate.latebind.LateBind.__call__
  File "src/wrapper.pyx", line 311, in OpenGL_accelerate.wrapper.Wrapper.__call__
  File "/root/.local/lib/python2.7/site-packages/OpenGL/platform/baseplatform.py", line 414, in __call__
    self.__name__, self.__name__,
OpenGL.error.NullFunctionError: Attempt to call an undefined function glCreateTextures, check for bool(glCreateTextures) before calling

MartinSmeyer commented 5 years ago

glCreateTextures has only been available since OpenGL 4.5.

Have you installed a recent PyOpenGL version? See #2 for more info.
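
Since the traceback fails on glCreateTextures, one quick check is whether the active context reports at least OpenGL 4.5. The following is a minimal sketch that only parses a version string; the helper names `parse_gl_version` and `supports_gl_create_textures` are hypothetical and not part of this repository, and the example strings are illustrative rather than taken from this thread.

```python
import re


def parse_gl_version(version_string):
    """Extract (major, minor) from a GL_VERSION string such as '4.6.0 NVIDIA 390.77'."""
    match = re.match(r"\s*(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("unrecognized GL_VERSION string: %r" % version_string)
    return int(match.group(1)), int(match.group(2))


def supports_gl_create_textures(version_string):
    """glCreateTextures is core since OpenGL 4.5, so compare against (4, 5)."""
    return parse_gl_version(version_string) >= (4, 5)


# Illustrative strings (assumptions, not output from this thread):
print(supports_gl_create_textures("4.6.0 NVIDIA 390.77"))  # True
print(supports_gl_create_textures("3.0 Mesa 18.0.5"))      # False
```

With a current context, the string would come from `glGetString(GL_VERSION)` in PyOpenGL, which returns bytes under Python 3 and needs decoding before being passed in.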

flugenheimer commented 5 years ago

Yes, I tried that, but it did not work.

I am not sure whether the problem is related to these lines:

libGL error: No matching fbConfigs or visuals found
| 0 / 20000 ETA: --:--:--
libGL error: failed to load driver: swrast

I am going to try reinstalling the nvidia-390 driver, since someone suggested this as a solution for the swrast issue. I will let you know if it works.

If you have any other suggestions, please let me know.

flugenheimer commented 5 years ago

Reinstalling the driver and rebooting did not help either.

MartinSmeyer commented 5 years ago

Have you tried to use nvidia-docker?

flugenheimer commented 5 years ago

Yes, I used nvidia-docker 2, and I am able to see the GPU using nvidia-smi. Everything else is installed the same way as when I am not running in Docker.
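
Seeing the GPU in `nvidia-smi` only confirms that the driver's utility libraries are mounted. With the NVIDIA container runtime, the OpenGL/EGL libraries are mounted only when the `NVIDIA_DRIVER_CAPABILITIES` environment variable includes `graphics` (or `all`). Whether that is the cause of the errors in this thread is not confirmed; the sketch below, with a hypothetical helper name, only checks the variable from inside the container.

```python
import os


def has_graphics_capability(value):
    """Return True if an NVIDIA_DRIVER_CAPABILITIES value enables graphics libraries."""
    if not value:
        # Unset or empty: the runtime uses its default set, which does not include graphics.
        return False
    capabilities = {item.strip() for item in value.split(",")}
    return bool(capabilities & {"graphics", "all"})


if __name__ == "__main__":
    current = os.environ.get("NVIDIA_DRIVER_CAPABILITIES")
    print("NVIDIA_DRIVER_CAPABILITIES = %s" % current)
    print("graphics enabled: %s" % has_graphics_capability(current))
```

If the check reports that graphics is not enabled, the variable can be set when the container is started or in the image itself; the exact value to use depends on which other capabilities the container needs.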

MartinSmeyer commented 5 years ago

Unfortunately, I don't have time to make the code compatible for everyone. But I would be happy to point to a docker container from the repo if anybody succeeds on that.

sinnis1991 commented 5 years ago

@MartinSmeyer I also have this problem, but I was not running it in Docker. I think it is caused by an old OpenGL version, so I suggest posting your OpenGL version, for example: 3.0.1b1, 3.0.1b2, 3.0.1, 3.0.2a1, 3.0.2b2, 3.0.2, 3.1.0a1, 3.1.0a3, 3.1.0b1, 3.1.0b2, 3.1.0b3, 3.1.0, 3.1.1a1, 3.1.3b1.

sinnis1991 commented 5 years ago

@flugenheimer I got exactly the same problem, but I was running the code on Ubuntu in a Python virtual environment, without Docker. Have you solved this problem?

flugenheimer commented 5 years ago

@sinnis1991 I have not had the time to look more into making this work in docker. At the moment I have just accepted that it has to run in Ubuntu.

I have looked a bit at different renderers for Python, and pyrender seems promising because it supports headless GPU or CPU rendering. It could potentially work in Docker: https://github.com/mmatl/pyrender

You would have to rewrite some of the augmented autoencoder scripts to make the whole pipeline work with a new renderer.

MartinSmeyer commented 5 years ago

Update: Whether headless rendering works depends on the OpenGL context. GLFW, which was used previously, still does not support headless rendering. EGL does, but it does not work out of the box. However, @wangg12 pointed out that a small change to PyOpenGL makes EGL contexts work. The code is now updated, and you can train without a display connected by using the EGL context. This might also make it possible to run in a Docker image.
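
PyOpenGL chooses its platform backend from the `PYOPENGL_PLATFORM` environment variable when its platform module is first loaded (which happens on importing `OpenGL.GL`), so the variable has to be set before that import. A minimal sketch of doing this from Python rather than from the shell; `select_egl_platform` is a hypothetical helper, not part of auto_pose.

```python
import os
import sys


def select_egl_platform():
    """Request PyOpenGL's EGL backend; must run before OpenGL.GL is imported."""
    if "OpenGL.platform" in sys.modules:
        # PyOpenGL has already chosen a platform; changing the variable now has no effect.
        raise RuntimeError("set PYOPENGL_PLATFORM before importing OpenGL.GL")
    os.environ["PYOPENGL_PLATFORM"] = "egl"
    return os.environ["PYOPENGL_PLATFORM"]


select_egl_platform()
# import OpenGL.GL  # would now be loaded with the EGL platform, if EGL is available
```

Setting the variable does not guarantee that a context can be created; it only tells PyOpenGL which platform library to bind against.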

MartinSmeyer commented 5 years ago

Before running ae_train, do:

export PYOPENGL_PLATFORM='egl'

saqib1707 commented 3 years ago

@MartinSmeyer @flugenheimer I am trying to run this inside a Docker container (with Ubuntu 18.04) on a CentOS machine. I am able to access the NVIDIA GPU features inside the container. However, the EGL context does not work inside the Docker container. The traceback follows:

Traceback (most recent call last):
  File "/pyenvs/pypkgs/bin/ae_train", line 8, in <module>
    sys.exit(main())
  File "/pyenvs/pypkgs/lib/python3.6/site-packages/auto_pose/ae/ae_train.py", line 96, in main
    dataset.get_training_images(dataset_path, args)
  File "/pyenvs/pypkgs/lib/python3.6/site-packages/auto_pose/ae/dataset.py", line 96, in get_training_images
    self.render_training_images()
  File "/pyenvs/pypkgs/lib/python3.6/site-packages/auto_pose/ae/dataset.py", line 248, in render_training_images
    bgr_x, depth_x = self.renderer.render(
  File "/pyenvs/pypkgs/lib/python3.6/site-packages/auto_pose/ae/utils.py", line 15, in decorator
    setattr(self, attribute, function(self))
  File "/pyenvs/pypkgs/lib/python3.6/site-packages/auto_pose/ae/dataset.py", line 70, in renderer
    float(self._kw['vertex_scale'])
  File "/pyenvs/pypkgs/lib/python3.6/site-packages/auto_pose/meshrenderer/meshrenderer.py", line 20, in __init__
    self._context = gu.OffscreenContext()
  File "/pyenvs/pypkgs/lib/python3.6/site-packages/auto_pose/meshrenderer/gl_utils/egl_offscreen_context.py", line 76, in __init__
    EGL_NO_CONTEXT, context_attributes
  File "/pyenvs/pypkgs/lib/python3.6/site-packages/OpenGL/platform/baseplatform.py", line 415, in __call__
    return self( *args, **named )
  File "src/errorchecker.pyx", line 58, in OpenGL_accelerate.errorchecker._ErrorChecker.glCheckError
OpenGL.raw.EGL._errors.EGLError: EGLError(
    err = EGL_BAD_MATCH,
    baseOperation = eglCreateContext,
    cArguments = (
        <OpenGL._opaque.EGLDisplay_pointer object at 0x7fbc25688b70>,
        <OpenGL._opaque.EGLConfig_pointer object at 0x7fbc25688ae8>,
        <OpenGL._opaque.EGLContext_pointer object at 0x7fbc2682ebf8>,
        <OpenGL.arrays.lists.c_int_Array_7 object at 0x7fbcf5d030d0>,
    ),
    result = <OpenGL._opaque.EGLContext_pointer object at 0x7fbc2568b488>
)

Has anyone found a solution to this? TIA.