Docker image broken - Githubissues

dreamflasher commented 2 years ago

This Dockerfile https://github.com/microsoft/scene_graph_benchmark/blob/main/docker/Dockerfile has the lines that install the repo commented out.

Uncommenting them and building the image fails with:

/miniconda/envs/py37/lib/python3.7/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at  /opt/conda/conda-bld/pytorch_1607370141920/work/c10/cuda/CUDAFunctions.cpp:100.)
  return torch._C._cuda_getDeviceCount() > 0
Traceback (most recent call last):
  File "setup.py", line 68, in <module>
    cmdclass={"build_ext": torch.utils.cpp_extension.BuildExtension},
  File "/miniconda/envs/py37/lib/python3.7/site-packages/setuptools/__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/miniconda/envs/py37/lib/python3.7/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/miniconda/envs/py37/lib/python3.7/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/miniconda/envs/py37/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/miniconda/envs/py37/lib/python3.7/distutils/command/build.py", line 135, in run
    self.run_command(cmd_name)
  File "/miniconda/envs/py37/lib/python3.7/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/miniconda/envs/py37/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/miniconda/envs/py37/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 79, in run
    _build_ext.run(self)
  File "/miniconda/envs/py37/lib/python3.7/site-packages/Cython/Distutils/old_build_ext.py", line 186, in run
    _build_ext.build_ext.run(self)
  File "/miniconda/envs/py37/lib/python3.7/distutils/command/build_ext.py", line 340, in run
    self.build_extensions()
  File "/miniconda/envs/py37/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 670, in build_extensions
    build_ext.build_extensions(self)
  File "/miniconda/envs/py37/lib/python3.7/site-packages/Cython/Distutils/old_build_ext.py", line 195, in build_extensions
    _build_ext.build_ext.build_extensions(self)
  File "/miniconda/envs/py37/lib/python3.7/distutils/command/build_ext.py", line 449, in build_extensions
    self._build_extensions_serial()
  File "/miniconda/envs/py37/lib/python3.7/distutils/command/build_ext.py", line 474, in _build_extensions_serial
    self.build_extension(ext)
  File "/miniconda/envs/py37/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 202, in build_extension
    _build_ext.build_extension(self, ext)
  File "/miniconda/envs/py37/lib/python3.7/distutils/command/build_ext.py", line 534, in build_extension
    depends=ext.depends)
  File "/miniconda/envs/py37/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 486, in unix_wrap_ninja_compile
    cuda_post_cflags = unix_cuda_flags(cuda_post_cflags)
  File "/miniconda/envs/py37/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 395, in unix_cuda_flags
    cflags + _get_cuda_arch_flags(cflags) +
  File "/miniconda/envs/py37/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1424, in _get_cuda_arch_flags
    capability = torch.cuda.get_device_capability()
  File "/miniconda/envs/py37/lib/python3.7/site-packages/torch/cuda/__init__.py", line 291, in get_device_capability
    prop = get_device_properties(device)
  File "/miniconda/envs/py37/lib/python3.7/site-packages/torch/cuda/__init__.py", line 296, in get_device_properties
    _lazy_init()  # will define _get_device_properties
  File "/miniconda/envs/py37/lib/python3.7/site-packages/torch/cuda/__init__.py", line 172, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
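
A note on the traceback above: the failing call is `torch.cuda.get_device_capability()` inside `_get_cuda_arch_flags`. PyTorch's `cpp_extension` only makes that call when the `TORCH_CUDA_ARCH_LIST` environment variable is unset, so setting it before the build step may let the extension compile without a GPU visible. The sketch below is one way to do that. The helper name `arch_list_or_default` and the default list `6.0;6.1;7.0;7.5` are assumptions for illustration and are not taken from the repo; use the compute capabilities of the GPUs you actually run on.

```shell
# Sketch: choose the CUDA architectures explicitly so the build does not
# have to query a GPU driver.

# arch_list_or_default [DEFAULT]: keep a caller-supplied TORCH_CUDA_ARCH_LIST,
# otherwise fall back to DEFAULT (hypothetical helper, illustrative default).
arch_list_or_default() {
    if [ -n "${TORCH_CUDA_ARCH_LIST:-}" ]; then
        printf '%s\n' "$TORCH_CUDA_ARCH_LIST"
    else
        printf '%s\n' "${1:-6.0;6.1;7.0;7.5}"
    fi
}

TORCH_CUDA_ARCH_LIST="$(arch_list_or_default)"
export TORCH_CUDA_ARCH_LIST
echo "Building for CUDA architectures: $TORCH_CUDA_ARCH_LIST"

# Then run the build step that the Dockerfile has commented out, e.g.:
#   python setup.py build develop
```

In a Dockerfile the same effect would come from an `ENV TORCH_CUDA_ARCH_LIST=...` line placed before the build step. Whether the rest of the build then succeeds in this image has not been verified here.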

dreamflasher commented 2 years ago

As a workaround, the library can be installed after building the image by starting a container with GPU access and running the install steps inside it: sudo docker run --rm --gpus all -it --entrypoint bash scene_graph_benchmark
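
Before running the install steps inside the container, it can help to confirm that the GPU is actually visible, since the extension build queries it. This is a minimal sketch, assuming `nvidia-smi` is available in the container (that depends on the NVIDIA container runtime and is not confirmed in this thread). The function name `gpu_status` is made up for illustration.

```shell
# gpu_status: print "gpu" if nvidia-smi can list at least one device,
# otherwise print "no-gpu" (hypothetical helper).
gpu_status() {
    if command -v nvidia-smi >/dev/null 2>&1 \
        && nvidia-smi -L 2>/dev/null | grep -q '^GPU '; then
        echo "gpu"
    else
        echo "no-gpu"
    fi
}

echo "GPU check: $(gpu_status)"
```

If this reports `no-gpu`, the `python setup.py build develop` step is likely to hit the same "Found no NVIDIA driver" error as the image build.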

dreamflasher commented 2 years ago

But then running the code fails with:

root@1758423ba0b9:/scene_graph_benchmark# python tools/demo/demo_image.py --config_file sgg_configs/vgattr/vinvl_x152c4.yaml --img_file demo/woman_fish.jpg --save_file output/woman_fish_x152c4.obj.jpg MODEL.WEIGHT pretrained_model/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 MODEL.ROI_HEADS.SCORE_THRESH 0.2 TEST.IGNORE_BOX_REGRESSION False
Traceback (most recent call last):
  File "tools/demo/demo_image.py", line 3, in <module>
    import cv2
  File "/miniconda/envs/py37/lib/python3.7/site-packages/cv2/__init__.py", line 8, in <module>
    from .cv2 import *
ImportError: libGL.so.1: cannot open shared object file: No such file or directory

dreamflasher commented 2 years ago

The Dockerfile needs: RUN apt-get install ffmpeg libsm6 libxext6 -y

albertmundu commented 2 years ago

The libGL.so.1 import error means the libgl-dev package needs to be installed.
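
To see which shared libraries are still missing before retrying the demo, the dynamic linker cache can be queried directly. The sketch below assumes a Debian/Ubuntu-style image where `ldconfig -p` is on the PATH. The helper names and the example library list are illustrative assumptions, not something stated in this thread.

```shell
# has_shared_lib NAME: succeed if the dynamic linker cache lists NAME.
has_shared_lib() {
    ldconfig -p 2>/dev/null | grep -q -F -- "$1"
}

# missing_libs NAME...: print each library that is not in the linker cache.
missing_libs() {
    for _lib in "$@"; do
        has_shared_lib "$_lib" || printf '%s\n' "$_lib"
    done
}

# Example call (the library list is an assumption about what cv2 loads):
#   missing_libs libGL.so.1 libSM.so.6 libXext.so.6
```

An empty result from `missing_libs` suggests the `import cv2` failure above should no longer occur. Any name it prints still needs a package that provides it.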

dreamflasher commented 2 years ago

Yeah, it needs:

```
apt-get install ffmpeg libsm6 libxext6 -y

git clone https://github.com/microsoft/scene_graph_benchmark.git && cd scene_graph_benchmark && python setup.py build develop

pip install torchvision==0.8.2+cu101 --find-links=https://download.pytorch.org/whl/torch_stable.html

mkdir pretrained_model && cd pretrained_model
wget https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/vinvl_vg_x152c4.pth
cd ..

mkdir visualgenome && cd visualgenome
wget https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/VG-SGG-dicts-vgoi6-clipped.json
cd ..
```

I'm working on a PR, but so far I haven't been able to build the library during docker build, because no GPU is available at that stage.

iamzifei commented 2 years ago

I'm able to build the Docker image with the bottom of the Dockerfile left commented out, but I cannot clone the repo and build it manually inside the container afterwards. It fails with "Error compiling objects for extension".

If I include the git clone and build steps in the Dockerfile, I get the "no NVIDIA driver" error when building the image.

Have you solved the issue and successfully run the container? If so, could you share some details?

microsoft / scene_graph_benchmark

Docker image broken #69