Issue

When I run ./dist_train.sh, I get the following error:

Traceback (most recent call last):                                                                                                                                                                                 
  File "./tools/train.py", line 263, in <module>    
    main()
  File "./tools/train.py", line 126, in main                                                                                                                                                                       
    plg_lib = importlib.import_module(_module_path)                                                                                                                                                                
  File "/root/miniconda3/envs/lib/python3.8/importlib/__init__.py", line 127, in import_module                                                                                                                     
    return _bootstrap._gcd_import(name[level:], package, level)                        
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load         
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked                                                                                                                                                
  File "<frozen importlib._bootstrap_external>", line 843, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed                                                                                                                                     
  File "/workspace/ViDAR/projects/mmdet3d_plugin/__init__.py", line 11, in <module>
    from .bevformer import *                                                                                                                                                                                       
  File "/workspace/ViDAR/projects/mmdet3d_plugin/bevformer/__init__.py", line 2, in <module>
    from .dense_heads import *                                                                           
  File "/workspace/ViDAR/projects/mmdet3d_plugin/bevformer/dense_heads/__init__.py", line 2, in <module>
    from .bev_head import BEVHead                                                                                                                                                                                  
  File "/workspace/ViDAR/projects/mmdet3d_plugin/bevformer/dense_heads/bev_head.py", line 22, in <module> 
    from projects.mmdet3d_plugin.bevformer.modules import PerceptionTransformerBEVEncoder                                                                                                                          
  File "/workspace/ViDAR/projects/mmdet3d_plugin/bevformer/modules/__init__.py", line 10, in <module>
    from .vidar_decoder import (PredictionDecoder,                                                       
  File "/workspace/ViDAR/projects/mmdet3d_plugin/bevformer/modules/vidar_decoder.py", line 22, in <module>
    from .ray_operations import LatentRendering
  File "/workspace/ViDAR/projects/mmdet3d_plugin/bevformer/modules/ray_operations/__init__.py", line 1, in <module>
    from .latent_rendering import LatentRendering
  File "/workspace/ViDAR/projects/mmdet3d_plugin/bevformer/modules/ray_operations/latent_rendering.py", line 12, in <module>
    from ...utils import e2e_predictor_utils                                                             
  File "/workspace/ViDAR/projects/mmdet3d_plugin/bevformer/utils/e2e_predictor_utils.py", line 163, in <module>
    from chamferdist import ChamferDistance
  File "/root/miniconda3/envs/lib/python3.8/site-packages/chamferdist-1.0.0-py3.8-linux-x86_64.egg/chamferdist/__init__.py", line 1, in <module>
    from .chamfer import ChamferDistance
  File "/root/miniconda3/envs/lib/python3.8/site-packages/chamferdist-1.0.0-py3.8-linux-x86_64.egg/chamferdist/chamfer.py", line 12, in <module>
    from chamferdist import _C
ImportError: /root/miniconda3/envs/lib/python3.8/site-packages/chamferdist-1.0.0-py3.8-linux-x86_64.egg/chamferdist/_C.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZNK2at6Tensor7optionsEv

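The missing symbol demangles to `at::Tensor::options() const`, which apparently corresponds to this declaration in PyTorch: https://github.com/pytorch/pytorch/blob/302ee7bfb604ebef384602c56e3853efed262030/aten/src/ATen/core/TensorBase.h#L472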
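
For anyone hitting a similar error, the shared object and the mangled symbol can be pulled out of the ImportError text and, where binutils is installed, passed to c++filt. This is only a minimal sketch: `parse_undefined_symbol` and `demangle` are hypothetical helper names, not part of chamferdist or PyTorch, and the message is copied from the traceback above.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple

# Message copied from the last line of the traceback above.
ERROR_MESSAGE = (
    "/root/miniconda3/envs/lib/python3.8/site-packages/"
    "chamferdist-1.0.0-py3.8-linux-x86_64.egg/chamferdist/"
    "_C.cpython-38-x86_64-linux-gnu.so: "
    "undefined symbol: _ZNK2at6Tensor7optionsEv"
)


def parse_undefined_symbol(message: str) -> Optional[Tuple[str, str]]:
    """Return (shared_object_path, mangled_symbol), or None if the message has another shape."""
    match = re.search(r"(\S+\.so[\w.]*): undefined symbol: (\S+)", message)
    if match is None:
        return None
    return match.group(1), match.group(2)


def demangle(symbol: str) -> Optional[str]:
    """Demangle with c++filt if it is on PATH; return None when the tool is missing."""
    tool = shutil.which("c++filt")
    if tool is None:
        return None
    result = subprocess.run([tool, symbol], capture_output=True, text=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    parsed = parse_undefined_symbol(ERROR_MESSAGE)
    if parsed is not None:
        so_path, symbol = parsed
        print("extension:", so_path)
        print("symbol:   ", symbol)
        print("demangled:", demangle(symbol))
```

The demangled name is usually a better search term than the mangled one when looking for related reports.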

How to reproduce

I am trying to run your code in a Docker container built from the following Dockerfile:

ARG CUDA_VERSION=11.3.1
ARG OS_VERSION=20.04
# pull a prebuilt image
FROM nvidia/cuda:${CUDA_VERSION}-cudnn8-devel-ubuntu${OS_VERSION}

SHELL ["/bin/bash", "-c"]

# Required to build Ubuntu 20.04 without user prompts with DLFW container
ENV DEBIAN_FRONTEND=noninteractive

# Install required libraries
RUN apt-get update && apt-get install -y software-properties-common
RUN add-apt-repository ppa:ubuntu-toolchain-r/test
RUN apt-get update && apt-get install -y --no-install-recommends \
    libcurl4-openssl-dev \
    wget \
    zlib1g-dev \
    git \
    sudo \
    ssh \
    libssl-dev \
    pbzip2 \
    pv \
    bzip2 \
    unzip \
    devscripts \
    lintian \
    fakeroot \
    dh-make \
    build-essential \
    curl \
    ca-certificates \
    libx11-6 \
    nano \
    graphviz \
    libgl1-mesa-glx \
    openssh-server \
    apt-transport-https

# Install other dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    libgtk2.0-0 \
    libcanberra-gtk-module \
    libsm6 libxext6 libxrender-dev \
    libgtk2.0-dev pkg-config \
    libopenmpi-dev \
 && sudo rm -rf /var/lib/apt/lists/*

# Install Miniconda
RUN wget \
    https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \
    && mkdir /root/.conda \
    && bash Miniconda3-latest-Linux-x86_64.sh -b \
    && rm -f Miniconda3-latest-Linux-x86_64.sh 

ENV CONDA_DEFAULT_ENV=${project}
ENV CONDA_PREFIX=/root/miniconda3/envs/$CONDA_DEFAULT_ENV
ENV PATH=/root/miniconda3/bin:$CONDA_PREFIX/bin:$PATH

# install python 3.8
RUN conda install python=3.8
RUN alias python='/root/miniconda3/envs/bin/python3.8'

# Set environment and working directory
ENV CUDA_HOME=/usr/local/cuda
ENV LD_LIBRARY_PATH=$CUDA_HOME/lib64:$CUDA_HOME/extras/CUPTI/lib64/:$LD_LIBRARY_PATH
ENV PATH=$CUDA_HOME/bin:$PATH
ENV CFLAGS="-I$CUDA_HOME/include $CFLAGS"
ENV FORCE_CUDA="1"
ENV PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/root/miniconda3/envs/bin:$PATH

# install pytorch
RUN pip install torch==1.10.1+cu111 torchvision==0.11.2+cu111 torchaudio==0.10.1 -f https://download.pytorch.org/whl/cu111/torch_stable.html

# install opencv
RUN python -m pip install opencv-python==4.5.5.62

# install gcc
RUN conda install -c omgarcia gcc-6 -y

# install torchpack
RUN git clone https://github.com/zhijian-liu/torchpack.git
RUN cd torchpack && python -m pip install -e .

# install other dependencies
RUN python -m pip install mmcv-full==1.4.0 -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.10.0/index.html
RUN python -m pip install pillow==8.4.0 \
                          tqdm \
                          mmdet==2.14.0 \
                          mmsegmentation==0.14.1 \
                          numba \
                          mpi4py \
                          nuscenes-devkit \
                          setuptools==59.5.0

# install mmdetection3d from source
ENV TORCH_CUDA_ARCH_LIST="6.0 6.1 7.0+PTX"
ENV TORCH_NVCC_FLAGS="-Xfatbin -compress-all"
ENV CMAKE_PREFIX_PATH="$(dirname $(which conda))/../"

RUN apt-get update && apt-get install -y ffmpeg libsm6 libxext6 git ninja-build libglib2.0-0 libsm6 libxrender-dev libxext6 \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
RUN git clone https://github.com/open-mmlab/mmdetection3d.git && \
    cd mmdetection3d && \
    git checkout v0.17.1 && \
    python -m pip install -r requirements/build.txt && \
    python -m pip install --no-cache-dir -e .

# install timm
RUN python -m pip install timm

# libraries path
RUN ln -s /usr/local/cuda/lib64/libcusolver.so.11 /usr/local/cuda/lib64/libcusolver.so.10

RUN pip install einops fvcore seaborn \
    iopath==0.1.9 \
    timm==0.6.13 \
    typing-extensions==4.5.0 \
    pylint \
    ipython==8.12 \
    numpy==1.19.5 \
    matplotlib==3.5.2 \
    numba==0.48.0 \
    pandas==1.4.4 \
    scikit-image==0.19.3 \
    setuptools==59.5.0
RUN python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'

RUN mkdir /workspace && \
    chmod -R a+w /workspace && \
    cd /workspace

USER root
RUN ["/bin/bash"]

Inside the Docker container, I set up the chamferdist package as described in the README. The installed PyTorch version is:

# python -c "import torch; print(torch.__version__)"
1.10.1+cu111

tomztyang commented 4 months ago

Thanks for your attention.

How about trying the chamfer distance implementation from 4D-Occ?

That is:

pip uninstall chamferdist
git clone https://github.com/tarashakhurana/4d-occ-forecasting
cd 4d-occ-forecasting/utils/chamferdist
pip install .
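
After reinstalling, it may be worth confirming that the compiled extension loads before launching a full training run. The following is a minimal sketch under that assumption: `check_import` is a hypothetical helper, and the module name `chamferdist` is taken from the traceback above.

```python
import importlib
from typing import Optional, Tuple


def check_import(module_name: str) -> Tuple[bool, Optional[str]]:
    """Try to import a module and return (ok, error_text)."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        # ImportError also covers ModuleNotFoundError and undefined-symbol failures.
        return False, f"{type(exc).__name__}: {exc}"
    return True, None


if __name__ == "__main__":
    # torch is checked first because the extension links against its shared libraries.
    for name in ("torch", "chamferdist"):
        ok, error = check_import(name)
        print(f"{name}: {'ok' if ok else error}")
```

If `chamferdist` still reports an undefined symbol here, the extension was probably not rebuilt against the PyTorch version that is currently installed.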

Tell me if it works.

Thanks, Zetong

tomztyang commented 4 months ago

I will close this for now. Feel free to reopen it if you have further questions!

kminoda commented 4 months ago

Thank you for your quick response. It somehow solved my issue.

OpenDriveLab / ViDAR

`undefined symbol` ImportError with `from chamferdist import _C` #5
