carbonsilicon-ai / CarsiDock

Official repo of "CarsiDock: a deep learning paradigm for accurate protein–ligand docking and screening based on large-scale pre-training" proposed by CarbonSilicon AI.
http://dx.doi.org/10.1039/D3SC05552C
Apache License 2.0

Docker Container does not support RTX4090 #1

Closed. JasonIsaac-Lofty closed this 11 months ago.

JasonIsaac-Lofty commented 11 months ago

When I run the Docker container, the following error appears:

```
NVIDIA Release 22.04 (build 36527063)
PyTorch Version 1.12.0a0+bd13bc6

Container image Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Copyright (c) 2014-2022 Facebook Inc.
Copyright (c) 2011-2014 Idiap Research Institute (Ronan Collobert)
Copyright (c) 2012-2014 Deepmind Technologies (Koray Kavukcuoglu)
Copyright (c) 2011-2012 NEC Laboratories America (Koray Kavukcuoglu)
Copyright (c) 2011-2013 NYU (Clement Farabet)
Copyright (c) 2006-2010 NEC Laboratories America (Ronan Collobert, Leon Bottou, Iain Melvin, Jason Weston)
Copyright (c) 2006 Idiap Research Institute (Samy Bengio)
Copyright (c) 2001-2004 Idiap Research Institute (Ronan Collobert, Samy Bengio, Johnny Mariethoz)
Copyright (c) 2015 Google Inc.
Copyright (c) 2015 Yangqing Jia
Copyright (c) 2013-2016 The Caffe contributors
All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

WARNING: Detected NVIDIA NVIDIA GeForce RTX 4090 GPU, which is not yet supported in this version of the container
ERROR: No supported GPU(s) detected to run this container

NOTE: The SHMEM allocation limit is set to the default of 64MB. This may be insufficient for PyTorch.
NVIDIA recommends the use of the following flags:
docker run --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 ...
```

I want to know how to fix it. Can someone give me some suggestions?

gitabtion commented 11 months ago

NGC images only began supporting 4090-series GPUs with version 22.09, but on that version and later Open Babel can no longer be installed directly with conda, and pydock, which CarsiDock depends on, must also be recompiled and installed. Please modify the Dockerfile and rebuild. The Dockerfile should be as follows:

  1. FROM nvcr.io/nvidia/pytorch:22.09-py3
  2. Change dgl-cuda to pip installation: RUN pip install dgl -f https://data.dgl.ai/wheels/cu118/repo.html
  3. Install the package in requirements.txt: RUN pip install -r requirements.txt
  4. Compile and install openbabel
  5. Install pydock: clone https://github.com/gitabtion/dockcpp, then compile and install.
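
Step 4 is the least prescriptive. As a sketch only (the Open Babel clone URL and build options are assumptions, not prescribed by this thread, though `-DPYTHON_BINDINGS=ON` also appears in the Dockerfile shared later), a source build could look like:

```dockerfile
# Sketch: compile and install Open Babel from source with Python bindings.
# Clone location and options are assumptions.
RUN git clone https://github.com/openbabel/openbabel /app/openbabel && \
    cd /app/openbabel && mkdir build && cd build && \
    cmake .. -DPYTHON_BINDINGS=ON && \
    make -j"$(nproc)" && make install
```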
JasonIsaac-Lofty commented 11 months ago

Thank you for your suggestion! I tried the steps you outlined and they ran smoothly, but when I build dockcpp from source, some errors appear:

```
209.1 [ 92%] Built target cudock
209.2 [ 96%] Building CXX object test/CMakeFiles/dock_test.dir/dock_test.cpp.o
209.3 In file included from /app/dockcpp/src/dock.h:6,
209.3                  from /app/dockcpp/test/dock_test.cpp:3:
209.3 /app/dockcpp/src/dtype.h:5:2: error: #error must define USE_DOUBLE
209.3     5 | #error must define USE_DOUBLE
209.3       |  ^~~~~
209.6 /app/dockcpp/test/dock_test.cpp: In function ‘int main()’:
209.6 /app/dockcpp/test/dock_test.cpp:14:24: error: ‘createCudaDockRequest’ is not a member of ‘dock’
209.6    14 |     auto req   = dock::createCudaDockRequest(init_coord,
209.6       |                        ^~~~~~~~~~~~~~~~~~~~~
209.6 /app/dockcpp/test/dock_test.cpp: In function ‘int main2()’:
209.6 /app/dockcpp/test/dock_test.cpp:35:23: error: ‘createCudaDockGradPerfRequest’ is not a member of ‘dock’; did you mean ‘createCudaDockGradSubmitRequest’?
209.6    35 |     auto req  = dock::createCudaDockGradPerfRequest(init_coord,
209.6       |                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
209.6       |                       createCudaDockGradSubmitRequest
209.6 make[2]: *** [test/CMakeFiles/dock_test.dir/build.make:76: test/CMakeFiles/dock_test.dir/dock_test.cpp.o] Error 1
209.6 make[1]: *** [CMakeFiles/Makefile2:295: test/CMakeFiles/dock_test.dir/all] Error 2
209.6 make: *** [Makefile:136: all] Error 2
------
DockerFile:25
--------------------
  24 |     COPY ./dockcpp-main /app/dockcpp
  25 | >>> RUN cd /app/dockcpp && mkdir build && cd build && \
  26 | >>>     cmake .. && make && make install && \
  27 | >>>     cd ../../
  28 |
--------------------
ERROR: failed to solve: process "/bin/sh -c cd /app/dockcpp && mkdir build && cd build &&     cmake .. && make && make install &&     cd ../../" did not complete successfully: exit code: 2
```

I'm not good at C++ and don't know how to deal with this. Could you please help me with it?

gitabtion commented 11 months ago

dockcpp is a Python package; you should install it as follows:

```dockerfile
COPY ./dockcpp-main /app/dockcpp
RUN cd /app/dockcpp && mkdir build && cd build && \
        cmake .. && make install-python-package && \
        cd ../../
```
JasonIsaac-Lofty commented 11 months ago

Thank you very much! The program now runs smoothly on an RTX 4090!

I'll share my new Dockerfile as a reference:

```dockerfile
# Use the NGC PyTorch image with version 22.09, which supports 4090-series GPUs
FROM nvcr.io/nvidia/pytorch:22.09-py3

# Install dgl via pip (instead of dgl-cuda via conda) from the specified wheel index
RUN pip install dgl -f https://data.dgl.ai/wheels/cu118/repo.html

# Install build-essential
RUN apt-get update && apt-get install -y build-essential

# Copy the local Open Babel source and compile/install it (may not be necessary)
COPY ./openbabel-master /app/openbabel
RUN cd /app/openbabel && mkdir build && cd build && \
    cmake .. -DPYTHON_BINDINGS=ON && make && make install && \
    cd ../../

# Set the global pip index to the Tsinghua University mirror for faster installs
RUN pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

# Copy the requirements file and install the dependencies
COPY ./requirements.txt /app/requirements.txt
RUN pip install -r /app/requirements.txt

# Copy the local dockcpp source and compile/install pydock
COPY ./dockcpp-main /app/dockcpp
RUN cd /app/dockcpp && mkdir build && cd build && \
        cmake .. && make install-python-package && \
        cd ../../

# Install SWIG and the openbabel Python bindings
RUN apt-get update && apt-get install -y swig
RUN pip install openbabel
```
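
With this file in the build context, the image is built and run in the usual way. The tag `carsidock:4090` below is a placeholder, and the run flags are the ones NVIDIA's startup note recommends for PyTorch containers:

```shell
docker build -t carsidock:4090 .
docker run --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
    -it carsidock:4090 bash
```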

The compilation of the local Open Babel source may be unnecessary. Since I'm in mainland China, I can't use the git clone command directly, so I had to download the sources and compile them locally.

gitabtion commented 11 months ago

Thank you for sharing the Dockerfile; it has been added to the repository.

Le-Phung-Hien commented 8 months ago

Hi,

Just a quick update that the old Dockerfile works for me (Ubuntu 22.04 LTS, Intel i9, A4000), but the new one does not: I have trouble installing Open Babel and pydock with it.