`ValueError: /workplace/spconv/src/spconv/spconv_ops.cc 87 unknown device type` error

Hi, thanks for sharing the code. I am opening this issue because I have trouble running it. I run training without DDP: `python ./tools/train.py ./configs/nusc/nuscenes_centerformer_separate_detection_head.py`. `sh setup.sh` works fine, but I get the following error when running `train.py`.

Traceback (most recent call last):
  File "./tools/train.py", line 137, in <module>
    main()
  File "./tools/train.py", line 132, in main
    logger=logger,
  File "/workspace/det3d/torchie/apis/train.py", line 335, in train_detector
    trainer.run(data_loaders, cfg.workflow, cfg.total_epochs, local_rank=cfg.local_rank)
  File "/workspace/det3d/torchie/trainer/trainer.py", line 546, in run
    epoch_runner(data_loaders[i], self.epoch, **kwargs)
  File "/workspace/det3d/torchie/trainer/trainer.py", line 413, in train
    self.model, data_batch, train_mode=True, **kwargs
  File "/workspace/det3d/torchie/trainer/trainer.py", line 371, in batch_processor_inline
    losses = model(example, return_loss=True)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/workspace/det3d/models/detectors/voxelnet_dynamic.py", line 52, in forward
    x, _ = self.extract_feat(example)
  File "/workspace/det3d/models/detectors/voxelnet_dynamic.py", line 38, in extract_feat
    data['voxels'], data["coors"], data["batch_size"], data["input_shape"]
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/workspace/det3d/models/backbones/scn.py", line 156, in forward
    x = self.conv_input(ret)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/spconv/modules.py", line 134, in forward
    input = module(input)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/spconv/conv.py", line 181, in forward
    use_hash=self.use_hash)
  File "/opt/conda/lib/python3.7/site-packages/spconv/ops.py", line 95, in get_indice_pairs
    int(use_hash))
ValueError: /workplace/spconv/src/spconv/spconv_ops.cc 87
unknown device type
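
To narrow this down, it may help to check which device each entry of the batch lives on just before it reaches the backbone. The snippet below is only a sketch: `describe_devices` is a hypothetical helper, not part of det3d or spconv, and the key names in the usage note are taken from the traceback above. It relies on duck typing (anything with a `.device` attribute), so it makes no assumption about the tensor library.

```python
def describe_devices(batch):
    """Map each key of a batch dict to the device of its value.

    Values without a `.device` attribute (ints, lists, ...) are reported
    as "n/a". This helper is hypothetical and only meant for debugging.
    """
    devices = {}
    for key, value in batch.items():
        device = getattr(value, "device", None)
        devices[key] = str(device) if device is not None else "n/a"
    return devices
```

For example, calling `print(describe_devices(data))` right before the `self.extract_feat` step would show whether `data['voxels']` and `data["coors"]` are on `cuda:0` or on `cpu`. If they are already on a CUDA device and the error persists, one possible explanation (an assumption, not confirmed here) is that the installed spconv binary was compiled without CUDA support.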

I have tried hard to run your code on the nuScenes dataset. We also have a machine with 8 A100 GPUs, as you do. One difference is that I use a Docker image. Here is the Dockerfile.

FROM pytorch/pytorch:1.9.1-cuda11.1-cudnn8-devel
LABEL maintainer="Junho Cho <junh0.cho@samsung.com>"

RUN rm /etc/apt/sources.list.d/cuda.list
RUN rm /etc/apt/sources.list.d/nvidia-ml.list
RUN apt-get update

RUN apt-get install git -y
RUN git clone https://github.com/TuSimple/centerformer.git

RUN cd centerformer && pip install -r requirements.txt

RUN apt-get install wget libboost-all-dev libgl1 -y

# Install cmake v3.13.2
RUN apt-get purge -y cmake && \
    mkdir /root/temp && \
    cd /root/temp && \
    wget https://github.com/Kitware/CMake/releases/download/v3.13.2/cmake-3.13.2.tar.gz && \
    tar -xzvf cmake-3.13.2.tar.gz && \
    cd cmake-3.13.2 && \
    bash ./bootstrap && \
    make && \
    make install && \
    cmake --version && \
    rm -rf /root/temp

RUN git clone --branch v1.2.1  https://github.com/traveller59/spconv.git --recursive
RUN cd spconv && python setup.py bdist_wheel && cd ./dist && pip install *whl

WORKDIR /workspace
ENV PYTHONPATH="${PYTHONPATH}:/workspace"

With this Dockerfile, we build spconv v1.2.1 in a CUDA 11.1 and PyTorch 1.9.1 environment, which matches the PyTorch and CUDA versions of your setup exactly. The only difference is the Python version, which I do not think matters much. (I also tried Python 3.9.12, with no luck.)
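
To compare environments precisely, a small report like the one below could be run inside the container. This is a minimal sketch: `installed_version` and `environment_report` are hypothetical names, and only `torch.version.cuda` and `torch.cuda.is_available()` are real PyTorch attributes. Note that GPUs are usually not visible during `docker build` unless the NVIDIA runtime is the default, so the result can differ between build time and `docker run --gpus all`.

```python
import importlib
import platform


def installed_version(name):
    """Return the package's __version__, or a short note if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except Exception as exc:  # missing package or a broken native extension
        return f"not importable ({type(exc).__name__})"
    return getattr(module, "__version__", "unknown")


def environment_report():
    """Collect the versions that matter for matching a reference setup."""
    report = {"python": platform.python_version()}
    for name in ("torch", "spconv"):
        report[name] = installed_version(name)
    try:
        import torch
        report["torch_cuda"] = torch.version.cuda  # None on CPU-only builds
        report["cuda_available"] = torch.cuda.is_available()
    except Exception:
        report["torch_cuda"] = None
        report["cuda_available"] = False
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running it once in a `RUN` step of the Dockerfile and once in the started container would show whether CUDA was visible at the moment spconv was compiled.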

`sh setup.sh` always runs without errors.

It seems the following error

ValueError: /root/spconv/src/spconv/spconv_ops.cc 87
unknown device type

might be solved by using a different spconv version (according to https://github.com/traveller59/spconv/issues/58), but I have not tried that because you specified that only spconv 1.2.1 works.

Do you have any idea how to resolve this issue?

According to this, spconv 1.2.1 probably does not work in Docker, but I confirmed that spconv 2.2 works in Docker.

If so, is there any chance this repo could support spconv 2.2? (I already tried spconv 2.2 with CenterFormer and failed repeatedly.)
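
For reference, one common first step when porting code between the two spconv generations is an import shim, since spconv 2.x exposes its PyTorch API under `spconv.pytorch` while 1.x uses the top-level `spconv` package. The sketch below shows that idea only; `load_spconv` is a hypothetical helper, and it is not what this repo does today.

```python
import importlib


def load_spconv():
    """Return (module, major_version) for whichever spconv API is importable.

    Tries the spconv 2.x layout first, then the 1.x layout.
    Returns (None, None) if neither can be imported.
    """
    try:
        return importlib.import_module("spconv.pytorch"), 2
    except Exception:
        pass
    try:
        return importlib.import_module("spconv"), 1
    except Exception:
        return None, None
```

The shim alone is unlikely to be enough: spconv 2.x also differs from 1.x in other respects, such as the layout of convolution weights, so checkpoints and some layer arguments would probably need separate handling.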

TuSimple / centerformer

`ValueError: /workplace/spconv/src/spconv/spconv_ops.cc 87 unknown device type` error #18