Closed ErlerPhilipp closed 2 years ago
After half a day reverse-engineering the requirements, I got this poco.yaml:
name: poco
channels:
  - pytorch
  - pyg
  - open3d-admin
  - anaconda
  - conda-forge
  - defaults
dependencies:
  - python=3.7.10
  - pytorch::pytorch=1.8.1
  - pytorch::torchvision=0.9.1
  - pytorch::torchaudio=0.8.1
  - cudatoolkit=11.1
  - cython
  - tqdm
  - scikit-image
  - open3d
  - scikit-learn
  - pyyaml
  - addict
  - pandas
  - plyfile
  - pyg=2.0.1
  - pip
  - pip:
    - open3d
    - trimesh
Seems to work with setup.py and generate.py. Hope this helps someone.
Thanks a lot. I have added it in the repo for a conda installation.
@aboulch In my case, the generate script is pretty slow, at about 20 min per object. Is this normal? Should it be multi-threaded by default?
On scenes, yes, it is quite slow. However, on ShapeNet objects it should take around 10 to 20 s per object, depending on your hardware. In my case it was 6 CPU threads on an Intel(R) Xeon(R) CPU E5-2630 and a 2080 Ti GPU.
I could reproduce the issue with the proposed yml file. I will look into that.
Thanks!
I'm trying to reproduce the ABC dataset first. My GPU is almost idle and only one CPU core is occupied. Could this be related to OpenMP?
By the way, why do you compile pykdtree? It's also available as a simple conda package.
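For anyone chasing the same "one busy core, idle GPU" symptom, here is a minimal sketch for dumping the threading-related settings of the active environment. It is only a diagnostic under assumptions: thread_report is a hypothetical helper name, and the environment variables it reads (OMP_NUM_THREADS, MKL_NUM_THREADS) are common knobs, not something the POCO code is known to set.

```python
import os


def thread_report():
    """Collect settings that commonly limit CPU parallelism (hypothetical helper)."""
    report = {
        "cpu_count": os.cpu_count(),
        # Unset variables show up as None.
        "OMP_NUM_THREADS": os.environ.get("OMP_NUM_THREADS"),
        "MKL_NUM_THREADS": os.environ.get("MKL_NUM_THREADS"),
    }
    try:
        import torch
    except ImportError:
        # torch is optional here so the script also runs outside the poco env.
        report["torch"] = None
    else:
        report["torch"] = torch.__version__
        report["torch_num_threads"] = torch.get_num_threads()
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in thread_report().items():
        print(f"{key}: {value}")
```

If cuda_available comes back False, or torch_num_threads is 1, that would at least narrow down where the time goes before looking at OpenMP itself.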
First install the packages:
apt-get install libgl1-mesa-glx libopenblas-dev
--> the problem may come from openblas being missing
Create a minimal conda environment (if needed, it only installs python, cudatoolkit and pip):
conda env create -f environment.yml
conda activate poco
Install the dependencies with pip:
pip install -r requirements.txt
Build the compiled library (needed only for evaluation):
python setup.py build_ext --inplace
Note: I will remove the dependency on the compiled pykdtree.
Hi,
I tried to reproduce your results, but I ran into a possible version mismatch between Pytorch and Pytorch_geometric.
I created my environment with the following commands:
conda create --name poco python=3.7.10
conda install pytorch==1.8.1 torchvision==0.9.1 torchaudio==0.8.1 cudatoolkit=11.1 -c pytorch -c conda-forge
conda install -c conda-forge cython
conda install -c conda-forge tqdm
conda install -c conda-forge scikit-image
conda install -c open3d-admin open3d
conda install -c conda-forge scikit-learn
conda install -c conda-forge pyyaml
conda install -c conda-forge addict
conda install -c conda-forge pandas
conda install -c conda-forge plyfile
conda install -c conda-forge pytorch_geometric
Compilation with
python setup.py build_ext --inplace
seems to work, but
python generate.py --config results/ABC_10k_FKAConv_InterpAttentionKHeadsNet_None/config.yaml --dataset_name DATASET_NAME --dataset_root data/3d_shapes_abc/abc/ --gen_resolution_global 256
results in
OSError: /home/perler/miniconda3/envs/poco/lib/python3.7/site-packages/torch_sparse/_version.so: undefined symbol: _ZN5torch3jit17parseSchemaOrNameERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
Installed versions are:
(poco) perler@BOTTLE:~/repos/poco$ conda list pytorch
# packages in environment at /home/perler/miniconda3/envs/poco:
#
# Name Version Build Channel
pytorch 1.8.1 py3.7_cuda11.1_cudnn8.0.5_0 pytorch
pytorch-cpu 1.1.0 py3.7_cpu_0 pytorch
pytorch_geometric 2.0.3 pyh6c4a22f_0 conda-forge
pytorch_sparse 0.6.4 py37hcae2be3_0 conda-forge
Again, the CPU-version... but that's a different issue.
AFAIK, they added sparse tensors only recently to Pytorch, so the installed Pytorch-geometric version might be too new. Which version of Pytorch-geometric do I need?
Can you please create a requirements.txt and/or environment.yaml?
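Regarding the stray pytorch-cpu entry in the conda list output above: a rough sketch of how such entries could be flagged automatically. The helper names (parse_conda_list, cpu_only_builds) are made up for illustration, and the "cpu" check on the name and build string is a heuristic assumption, not an official conda convention.

```python
def parse_conda_list(text):
    """Parse `conda list` output into (name, version, build, channel) tuples."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comment/header lines and the shell prompt line.
        if not line or line.startswith("#") or line.startswith("("):
            continue
        parts = line.split()
        if len(parts) >= 3:
            channel = parts[3] if len(parts) > 3 else ""
            rows.append((parts[0], parts[1], parts[2], channel))
    return rows


def cpu_only_builds(rows):
    """Heuristic: names ending in '-cpu' or builds with a 'cpu' token."""
    return [name for name, _version, build, _channel in rows
            if name.endswith("-cpu") or "cpu" in build.split("_")]


SAMPLE = """\
pytorch 1.8.1 py3.7_cuda11.1_cudnn8.0.5_0 pytorch
pytorch-cpu 1.1.0 py3.7_cpu_0 pytorch
pytorch_geometric 2.0.3 pyh6c4a22f_0 conda-forge
pytorch_sparse 0.6.4 py37hcae2be3_0 conda-forge
"""

if __name__ == "__main__":
    print(cpu_only_builds(parse_conda_list(SAMPLE)))  # ['pytorch-cpu']
```

In practice the text would come from running conda list in the activated environment rather than from a pasted sample.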
I copied your steps but get a ninja build issue at python setup.py build_ext --inplace, which is the same as when following the steps in the readme.
Update: the error above was fixed by modifying the ninja build part in lib/python3.6/site-packages/torch/utils/cpp_extension.py.
Hello,
here are the versions installed in my conda environment:
# Name Version Build Channel
python 3.7.10 hf930737_104_cpython conda-forge
cudatoolkit 11.1.1 h6406543_10 conda-forge
openssl 3.0.2 h166bdaf_1 conda-forge
pip 20.2.4 py37_0 anaconda
wheel 0.35.1 py_0 anaconda
And the versions installed using the pip requirements file:
cython 0.29.28 pypi_0 pypi
numpy 1.21.5 pypi_0 pypi
open3d 0.13.0 pypi_0 pypi
pandas 1.3.5 pypi_0 pypi
plyfile 0.7.4 pypi_0 pypi
pykdtree 1.3.4 pypi_0 pypi
scikit-image 0.19.2 pypi_0 pypi
scikit-learn 1.0.2 pypi_0 pypi
scipy 1.7.3 pypi_0 pypi
tensorboard 2.8.0 pypi_0 pypi
torch 1.8.1+cu111 pypi_0 pypi
torch-cluster 1.5.9 pypi_0 pypi
torch-geometric 2.0.4 pypi_0 pypi
torch-scatter 2.0.8 pypi_0 pypi
torch-sparse 0.6.12 pypi_0 pypi
torch-spline-conv 1.2.1 pypi_0 pypi
torchaudio 0.8.1 pypi_0 pypi
torchvision 0.9.1+cu111 pypi_0 pypi
tqdm 4.64.0 pypi_0 pypi
trimesh 3.10.7 pypi_0 pypi
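One quick consistency check on a list like this is to compare the CUDA tag embedded in the torch wheel version (the part after "+", e.g. cu111) with the cudatoolkit version. The sketch below does only that string comparison; the function names are hypothetical, and matching tags are a necessary sanity check under this assumption, not a guarantee that the binaries are compatible.

```python
def wheel_cuda_tag(torch_version):
    """Return the local-version tag of a wheel, e.g. 'cu111', or None if absent."""
    _base, sep, tag = torch_version.partition("+")
    return tag if sep else None


def toolkit_cuda_tag(cudatoolkit_version):
    """Turn a cudatoolkit version such as '11.1.1' into a tag such as 'cu111'."""
    major, minor = cudatoolkit_version.split(".")[:2]
    return "cu" + major + minor


def tags_match(torch_version, cudatoolkit_version):
    """True when the wheel tag and the toolkit version name the same CUDA release."""
    return wheel_cuda_tag(torch_version) == toolkit_cuda_tag(cudatoolkit_version)


if __name__ == "__main__":
    # Values taken from the version lists in this thread.
    print(tags_match("1.8.1+cu111", "11.1.1"))   # True
    print(tags_match("0.9.1+cu111", "11.1.1"))   # True
    print(wheel_cuda_tag("0.8.1"))               # None (no tag on this wheel)
```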
@aboulch Thanks for the update. However, with pip install -r requirements.txt, I get this:
ERROR: After October 2020 you may experience errors when installing or updating packages. This is because pip will change the way that it resolves dependency conflicts.
We recommend you use --use-feature=2020-resolver to test your packages with the new resolver before it becomes the default.
jupyter-packaging 0.12.0 requires setuptools>=60.2.0, but you'll have setuptools 50.3.0.post20201006 which is incompatible.
open3d 0.13.0 requires wheel>=0.36.0, but you'll have wheel 0.35.1 which is incompatible.
Pip still installed the packages.
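To keep track of conflicts like the two above, the "requires ..., but you'll have ..." lines can be pulled out of the pip output with a small parser. This is only a sketch: the regular expression is written against the wording shown in this log and is an assumption about pip's message format, which may differ between pip versions. Running pip check afterwards reports the same kind of conflicts for the installed environment.

```python
import re

# Pattern matched against the message wording seen in the log above (assumption).
CONFLICT = re.compile(
    r"^(?P<package>\S+) (?P<version>\S+) requires (?P<requirement>.+?), "
    r"but you'll have (?P<found>\S+ \S+) which is incompatible\.$"
)


def parse_conflicts(log_text):
    """Return one dict per dependency-conflict line found in pip's output."""
    conflicts = []
    for line in log_text.splitlines():
        match = CONFLICT.match(line.strip())
        if match:
            conflicts.append(match.groupdict())
    return conflicts


if __name__ == "__main__":
    log = (
        "jupyter-packaging 0.12.0 requires setuptools>=60.2.0, but you'll have "
        "setuptools 50.3.0.post20201006 which is incompatible.\n"
        "open3d 0.13.0 requires wheel>=0.36.0, but you'll have wheel 0.35.1 "
        "which is incompatible.\n"
    )
    for conflict in parse_conflicts(log):
        print(conflict["package"], "needs", conflict["requirement"],
              "but found", conflict["found"])
```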
When I run the generate script, I get a more serious error:
(poco) perler@BOTTLE:~/repos/poco$ python generate.py --config results/ABC_10k_FKAConv_InterpAttentionKHeadsNet_None/config.yaml --dataset_name ABCTest --dataset_root data/3d_shapes_abc/ --gen_resolution_global 128
Traceback (most recent call last):
File "generate.py", line 11, in <module>
import torch_geometric.transforms as T
File "/home/perler/miniconda3/envs/poco/lib/python3.7/site-packages/torch_geometric/__init__.py", line 4, in <module>
import torch_geometric.data
File "/home/perler/miniconda3/envs/poco/lib/python3.7/site-packages/torch_geometric/data/__init__.py", line 1, in <module>
from .data import Data
File "/home/perler/miniconda3/envs/poco/lib/python3.7/site-packages/torch_geometric/data/data.py", line 9, in <module>
from torch_sparse import SparseTensor
File "/home/perler/miniconda3/envs/poco/lib/python3.7/site-packages/torch_sparse/__init__.py", line 16, in <module>
f'{library}_{suffix}', [osp.dirname(__file__)]).origin)
File "/home/perler/miniconda3/envs/poco/lib/python3.7/site-packages/torch/_ops.py", line 104, in load_library
ctypes.CDLL(path)
File "/home/perler/miniconda3/envs/poco/lib/python3.7/ctypes/__init__.py", line 364, in __init__
self._handle = _dlopen(self._name, mode)
OSError: libcusparse.so.11: cannot open shared object file: No such file or directory
Looks like this issue: https://github.com/pyg-team/pytorch_geometric/issues/2040
Pip installed torch-geometric 2.0.4 pypi_0 pypi
export PATH="~/miniconda3/envs/poco/lib/:$PATH"
doesn't help, although libcusparse.so.11 is there.
Any ideas?
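A note on the export above: on Linux, the dynamic loader does not use PATH to find shared libraries; LD_LIBRARY_PATH is one of the places it does look, and a "~" inside double quotes is not expanded by the shell. The following sketch checks whether a library can be loaded and which search-path directories contain it. The helper names are hypothetical, and whether fixing the search path resolves this particular error is an assumption, not something confirmed in this thread.

```python
import ctypes
import os


def can_load(library_name):
    """Try to load a shared library; return (ok, message)."""
    try:
        ctypes.CDLL(library_name)
    except OSError as exc:
        return False, str(exc)
    return True, "loaded"


def dirs_containing(library_name, search_path=None):
    """List directories of a search path (default: LD_LIBRARY_PATH) holding the file."""
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for directory in search_path.split(os.pathsep):
        candidate = os.path.join(os.path.expanduser(directory), library_name)
        if directory and os.path.isfile(candidate):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    name = "libcusparse.so.11"
    print(name, can_load(name))
    print("found in LD_LIBRARY_PATH dirs:", dirs_containing(name))
```

Note that LD_LIBRARY_PATH has to be exported before Python starts; changing it from inside the running process does not affect how the loader resolves libraries.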
I do not really have an answer on this one.
Before creating the conda environment, is CUDA 11.1 installed on the machine?
On my side, I start with the docker image:
nvidia/cuda:11.1-devel-ubuntu18.04
If you are using a different initial CUDA version, you may want to change the CUDA versions in the environment.yaml and requirements.txt.
@aboulch Thanks again! For some reason, neither my WSL 2 nor native Ubuntu worked but the docker image does. I guess it's some strange CUDA / Pytorch version mismatch. Looks like CUDA 11.1 is not recommended for Pytorch 1.8.1 although this combination exists for Pip (but not conda).
Anyway, the generate script runs now at 10-15s per ABC object. If there are no further problems, I'll close the issue soon.
Seems to work (didn't try training yet). Thanks!