Fails to build using official PyTorch Docker images (2.4.0-cuda12.4-cudnn9, 2.3.1-cuda12.1-cudnn8-runtime)

JeongJuhyeon commented 1 month ago

Images tried:

pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime - CUDA 11.7, CUDNN8, PyTorch 2.0.1. This is the one I use with SAM1, and it works fine, but it is possibly too old for SAM2.

pytorch/pytorch:2.4.0-cuda12.4-cudnn9-runtime - CUDA 12.4, CUDNN9, PyTorch 2.4.0

pytorch/pytorch:2.3.1-cuda12.1-cudnn8-runtime - CUDA 12.1, CUDNN8, PyTorch 2.3.1

Error:

140.4   × Getting requirements to build wheel did not run successfully.
140.4   │ exit code: 1
140.4   ╰─> [29 lines of output]
140.4       /tmp/pip-build-env-rji1we1b/overlay/lib/python3.11/site-packages/torch/_subclasses/functional_tensor.py:258: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
140.4         cpu = _conversion_method_template(device=torch.device("cpu"))
140.4       Traceback (most recent call last):
140.4         File "/opt/conda/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
140.4           main()
140.4         File "/opt/conda/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
140.4           json_out['return_val'] = hook(**hook_input['kwargs'])
140.4                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
140.4         File "/opt/conda/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
140.4           return hook(config_settings)
140.4                  ^^^^^^^^^^^^^^^^^^^^^
140.4         File "/tmp/pip-build-env-rji1we1b/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 327, in get_requires_for_build_wheel
140.4           return self._get_build_requires(config_settings, requirements=[])
140.4                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
140.4         File "/tmp/pip-build-env-rji1we1b/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 297, in _get_build_requires
140.4           self.run_setup()
140.4         File "/tmp/pip-build-env-rji1we1b/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 313, in run_setup
140.4           exec(code, locals())
140.4         File "<string>", line 70, in <module>
140.4         File "<string>", line 51, in get_extensions
140.4         File "/tmp/pip-build-env-rji1we1b/overlay/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1076, in CUDAExtension
140.4           library_dirs += library_paths(cuda=True)
140.4                           ^^^^^^^^^^^^^^^^^^^^^^^^
140.4         File "/tmp/pip-build-env-rji1we1b/overlay/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1207, in library_paths
140.4           if (not os.path.exists(_join_cuda_home(lib_dir)) and
140.4                                  ^^^^^^^^^^^^^^^^^^^^^^^^
140.4         File "/tmp/pip-build-env-rji1we1b/overlay/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 2416, in _join_cuda_home
140.4           raise OSError('CUDA_HOME environment variable is not set. '
140.4       OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
140.4       [end of output]

Sample Dockerfile:

FROM pytorch/pytorch:2.3.1-cuda12.1-cudnn8-runtime

WORKDIR /app

RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y \
    libgl1-mesa-glx \
    libglib2.0-0 \
    git \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY handler.py .
COPY utils.py .
COPY weights/sam2_hiera_large.pt /app/weights/

CMD ["python", "handler.py"]

requirements.txt:

git+https://github.com/facebookresearch/segment-anything-2.git

Is there a different combination of versions that should be used?

Or is this because the host does not have a GPU? We should be able to at least build the wheel/image on a GPU-less machine so that we can then run it on external (e.g. cloud) GPUs. segment-anything (v1) worked fine that way. Unless there's a pre-built wheel or image that we can use so that we don't have to build it ourselves.

aryansaurav commented 1 month ago

You have to update gcc to version 9.3 or higher. It worked for me, so it's worth a try.

JeongJuhyeon commented 1 month ago

@aryansaurav Thanks for the suggestion :) I've added build-essential to the apt-get install step to ensure gcc is up to date, but unfortunately the result is the same. Are you also using a Docker image on a GPU-less machine?

aryansaurav commented 1 month ago

No, I didn't use Docker, but check gcc --version. If it's 9.3.0 or higher, then it might be another issue.

Oh, and for the sake of completeness, I had torch 2.4.0 with CUDA 12.1.

But again there can be other issues

jeanchristopheruel commented 1 month ago

Segment Anything 2.0 requires compiling a .cu file with nvcc at build time, so a CUDA devel base image is needed to build the library. Try pytorch/pytorch:2.0.1-cuda11.7-cudnn8-devel.
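
For reference, here is a minimal sketch of what that change could look like when applied to the sample Dockerfile from the original post. It is untested and rests on several assumptions: that a -devel tag exists for the PyTorch/CUDA combination you want, that the CUDA toolkit sits at /usr/local/cuda in that image, and that the architecture list below matches the GPUs you deploy on. The CUDA_HOME and TORCH_CUDA_ARCH_LIST lines were not part of the suggestion above; they are added because no GPU is visible while building on a GPU-less host.

```dockerfile
# Sketch only: same layout as the sample Dockerfile, but on a -devel base image
# so that nvcc is available when pip builds the SAM2 CUDA extension.
# Assumption: this -devel tag exists for the version you need.
FROM pytorch/pytorch:2.3.1-cuda12.1-cudnn8-devel

WORKDIR /app

RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y \
    libgl1-mesa-glx \
    libglib2.0-0 \
    git \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

# Assumption: the CUDA toolkit is installed at /usr/local/cuda in the devel image.
ENV CUDA_HOME=/usr/local/cuda

# Assumption: no GPU is visible during `docker build`, so the target
# architectures are listed explicitly. Example values; adjust to your GPUs.
ENV TORCH_CUDA_ARCH_LIST="8.0;8.6;9.0"

# Fail early with a readable message if either compiler is missing.
RUN nvcc --version && gcc --version

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
```

The trade-off is image size: -devel images carry the full CUDA toolkit and are considerably larger than the -runtime ones.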

peasant98 commented 1 month ago

Hi, I will be setting up SAM2 + Docker as well tonight:

Check out the repo here https://github.com/peasant98/SAM2-Docker

jeanchristopheruel commented 1 month ago

I spent a significant amount of time containerizing SAM2 for CVAT. I suggest you benefit from my work, no offence! Check out my containerization here: Sam2-Container.

You can fetch a pre-built image using:

docker run -it --rm --gpus all jeanchristopheruel/sam2-container:latest bash

Fails to build using official PyTorch Docker images (2.4.0-cuda12.4-cudnn9, 2.3.1-cuda12.1-cudnn8-runtime) #37