Open Kitsunetic opened 2 weeks ago
I'm experiencing the exact same issue.
I've found that it works fine with kernel_size=1, but consistently crashes with kernel_size=3 or any other size.
@Kitsunetic Have you fixed this issue?
No, I'm still figuring out the solution.
I found that downgrading PyTorch to version 2.2.2 resolves the issue.
Which CUDA version did you use?
I use CUDA 12.1, and I installed spconv-cu120.
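When comparing setups like this, one thing worth ruling out is a mismatch between the CUDA version PyTorch was built with and the CUDA tag in the spconv package name. Below is a minimal sketch, assuming the tag follows the usual `cuXYZ` convention in which the last digit is the minor version. The helper names `parse_cuda_tag`, `parse_cuda_version`, and `same_cuda_major` are made up for illustration and are not part of spconv or PyTorch.

```python
import re


def parse_cuda_tag(package_name):
    """Extract (major, minor) from a name such as 'spconv-cu120'.

    Assumption: the name ends in 'cu' followed by 2-3 digits, and the
    last digit is the minor version.
    """
    match = re.search(r"cu(\d{2,3})$", package_name)
    if match is None:
        raise ValueError(f"no CUDA tag found in {package_name!r}")
    digits = match.group(1)
    return int(digits[:-1]), int(digits[-1])


def parse_cuda_version(version):
    """Turn a version string such as '12.1' into (12, 1)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def same_cuda_major(package_name, torch_cuda_version):
    """True if the package tag and the PyTorch CUDA build share a major version."""
    tag_major = parse_cuda_tag(package_name)[0]
    torch_major = parse_cuda_version(torch_cuda_version)[0]
    return tag_major == torch_major
```

In a live environment the second argument would come from `torch.version.cuda`, for example `same_cuda_major("spconv-cu120", torch.version.cuda)`. Whether a matching major version is enough to avoid the crash is an assumption here, not something confirmed in this thread.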
Unfortunately, I still get the same issue when retrying on the nvidia/cuda:12.1.0-cudnn8-devel-ubuntu22.04 Docker image with PyTorch 2.2.2 and CUDA 12.1. I have tested on both Ubuntu 22.04 and 20.04. Could you give me more details about your environment?
I set up the environment from the following .yaml file with conda env create -f ***.yaml. It differs from the .yaml file referenced in Issue #317, particularly in the torch and torchvision versions.
name: pointcept
channels:
  - pyg
  - pytorch
  - nvidia/label/cuda-12.1.1
  - nvidia
  - bioconda
  - conda-forge
  - defaults
dependencies:
  - python=3.9
  - pip
  - cuda
  - conda-forge::cudnn
  - gcc=12.1
  - gxx=12.1
  - pytorch=2.2.2
  - torchvision=0.17.2
  - pytorch-cuda=12.1
  - ninja
  - google-sparsehash
  - h5py
  - pyyaml
  - tensorboard
  - tensorboardx
  - yapf
  - addict
  - einops
  - scipy
  - plyfile
  - termcolor
  - timm
  - ftfy
  - regex
  - tqdm
  - matplotlib
  - black
  - open3d
  - pytorch-cluster
  - pytorch-scatter
  - pytorch-sparse
  - pip:
    - torch_geometric
    # - spconv-cu120
    - git+https://github.com/octree-nn/ocnn-pytorch.git
    - git+https://github.com/openai/CLIP.git
    - git+https://github.com/Dao-AILab/flash-attention.git
    - ./libs/pointops
    - ./libs/pointgroup_ops
After this setup, I installed the following additional components:
cd libs/pointops
python setup.py install
cd ../..
pip install spconv-cu120
Thank you for sharing. However, I still get the same error even with an environment built from the provided yaml file. I suspect the problem is not only the dependencies; the wider environment, such as the OS, may also be involved. So I'm still looking for the cause. Anyway, thank you again for sharing! If you find another clue, please let me know!
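Since the OS and the wider environment may matter, it can help if everyone posts the same set of facts. Below is a minimal sketch of a report script that uses only the Python standard library. The function names and the default package list are assumptions chosen for illustration; adjust the list to whatever is installed.

```python
import platform
import sys
from importlib import metadata

# Package names taken from this thread; edit as needed.
PACKAGES = ("torch", "torchvision", "spconv-cu120")


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=PACKAGES):
    """Collect OS, Python, and package versions into a dict."""
    report = {
        "os": platform.platform(),
        "kernel": platform.release(),
        "machine": platform.machine(),
        "python": sys.version.split()[0],
        "libc": " ".join(platform.libc_ver()),
    }
    for name in packages:
        report[name] = installed_version(name)
    return report


def format_report(report):
    """Render the report as 'key: value' lines for pasting into a comment."""
    lines = []
    for key, value in report.items():
        shown = value if value is not None else "not installed"
        lines.append(f"{key}: {shown}")
    return "\n".join(lines)
```

Calling `print(format_report(environment_report()))` inside the failing environment gives a short block that can be pasted into a reply. It does not cover the GPU driver version, which would need `nvidia-smi` or a similar tool.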
I always get a floating point exception when using SubMConv3d.
Here is my test code:
I'm using PyTorch 2.3.0 with CUDA 11.8 and spconv-cu118==2.3.6. Is there something wrong with my code, or does anyone have a clue?
I have tested on A5000 and RTX 2080 Ti GPUs, but the result was always the same.
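A floating point exception kills the whole Python process, so one way to narrow it down is to run each configuration in its own subprocess and look at the exit status. The sketch below is hypothetical: the child script is not the test code from this issue (which is not shown above), and the tensor sizes, channel counts, and helper names are arbitrary choices for illustration. It assumes a POSIX system, where a process killed by a signal reports a negative return code.

```python
import signal
import subprocess
import sys

# Hypothetical child script, NOT the original test code from this issue.
# It builds 64 unique voxels in a 16x16x16 grid and runs one SubMConv3d.
CHILD_TEMPLATE = """
import torch
import spconv.pytorch as spconv

torch.manual_seed(0)
coords = torch.randperm(16 * 16 * 16)[:64]
# Index columns are (batch, z, y, x); spconv expects int32 indices.
indices = torch.stack(
    [torch.zeros_like(coords), coords // 256, (coords // 16) % 16, coords % 16],
    dim=1,
).int().cuda()
features = torch.randn(64, 4).cuda()
x = spconv.SparseConvTensor(features, indices, [16, 16, 16], 1)
conv = spconv.SubMConv3d(4, 8, kernel_size={kernel_size}, bias=False).cuda()
print(conv(x).features.shape)
"""


def classify(returncode):
    """Map a subprocess return code to 'ok', a signal name, or 'exit N'."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        try:
            return signal.Signals(-returncode).name
        except ValueError:
            return f"signal {-returncode}"
    return f"exit {returncode}"


def run_child(source):
    """Run Python source in a fresh interpreter and classify how it ended."""
    proc = subprocess.run(
        [sys.executable, "-c", source], capture_output=True, text=True
    )
    return classify(proc.returncode)


def sweep(kernel_sizes=(1, 3, 5)):
    """Run the child once per kernel size; needs spconv and a CUDA GPU."""
    return {k: run_child(CHILD_TEMPLATE.format(kernel_size=k)) for k in kernel_sizes}
```

Running `print(sweep())` on an affected machine should show which kernel sizes end in `SIGFPE` and which finish normally, without the crash taking down the parent process.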