tscohen / GrouPy

Group Equivariant Convolutional Neural Networks
http://ta.co.nl

Can't install GrouPy successfully #13

Open shuida opened 5 years ago

shuida commented 5 years ago

Hey Dr. Cohen, thank you for your creative work on group convolutions. I ran into some trouble when installing GrouPy according to README.md.

After installing chainer with cupy and tensorflow-gpu, I ran "chainer.backends.cuda.available" and "chainer.backends.cuda.cudnn_enabled", and both return True. However, when I run "nosetests -v", it fails as shown below.

Failure: CompileException (/tmp/tmpezKDyD/kern.cu(14): error: a value of type "const ptrdiff_t *" cannot be used to initialize an entity of type "const int *"
/tmp/tmpezKDyD/kern.cu(15): error: a value of type "const ptrdiff_t *" cannot be used to initialize an entity of type "const int *"

I skipped that and ran the examples in "Getting Started". The tensorflow example runs without any error, but the chainer example fails as shown below.

File "/data/mbqiu/anaconda3/envs/gconv-python2.7/lib/python2.7/site-packages/groupy/gconv/chainer_gconv/transform_filter.py", line 8, in <module>
from groupy.gconv.chainer_gconv.kernels.integer_indexing_cuda_kernel import grad_index_group_func_kernel
ImportError: No module named kernels.integer_indexing_cuda_kernel

I checked carefully and found that two subdirectories, kernels and pooling, are missing from the installed gconv/chainer_gconv/.

My versions are ubuntu==16.04, cuda==8.0, chainer==4.5.0, cupy-cuda80==4.5.0, tensorflow-gpu==1.4.0. What's wrong?

shuida commented 5 years ago

@tscohen Thank you for your suggestion. It helps me a lot. I suspect there may be a problem in GrouPy/setup.py. After reinstalling chainer at version 1.24.0, reinstalling GrouPy, and running the examples, I get:

File "/data/mbqiu/anaconda3/envs/gconv-python2.7/lib/python2.7/site-packages/groupy/gconv/chainer_gconv/transform_filter.py", line 8, in <module>
from groupy.gconv.chainer_gconv.kernels.integer_indexing_cuda_kernel import grad_index_group_func_kernel
ImportError: No module named kernels.integer_indexing_cuda_kernel

So I modified setup.py in GrouPy, adding 'groupy.gconv.chainer_gconv.kernels' and 'groupy.gconv.chainer_gconv.pooling' to the 'packages' parameter of the setup() function. After that, the import works.
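For anyone hitting the same ImportError, here is a minimal sketch of how to check which subpackages exist in the source tree but are not listed in `packages`. It uses only the standard library. The helper names `discover_packages` and `missing_packages` are hypothetical, and the sketch assumes every package directory contains an `__init__.py`; it is not taken from GrouPy's setup.py.

```python
import os


def discover_packages(root, top_level):
    """Return dotted names of all directories under root/top_level
    that contain an __init__.py (assumed to mark a package)."""
    found = []
    base = os.path.join(root, top_level)
    for dirpath, dirnames, filenames in os.walk(base):
        if "__init__.py" not in filenames:
            # Not a package: do not descend any further.
            dirnames[:] = []
            continue
        rel = os.path.relpath(dirpath, root)
        found.append(rel.replace(os.sep, "."))
    return sorted(found)


def missing_packages(root, top_level, declared):
    """Return packages present on disk but absent from the declared list."""
    return sorted(set(discover_packages(root, top_level)) - set(declared))
```

Calling `missing_packages(".", "groupy", declared)` from the repository root, with `declared` set to the list currently passed to `setup(packages=...)`, should report any subpackage that would be left out of the install, such as the two named above.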

Another problem is that when I run the chainer example in "Getting Started", it reports 'cuDNN is not enabled'. Yet when I run "chainer.backends.cuda.available" and "chainer.backends.cuda.cudnn_enabled", both return "True", which suggests that my CUDA and cuDNN have been installed successfully. Why is cuDNN not enabled with chainer 1.24.0? My cuDNN version is cudnn==5.1.10, and the chainer 1.24.0 installation instructions at https://docs.chainer.org/en/v1.24.0/install.html say that cuDNN 5.1 is supported. What's wrong with it? Which chainer version did you use when you developed GrouPy?

Thanks!
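When several Chainer versions have been installed in turn, it can help to print the flags from the exact interpreter that runs the example. The sketch below is only a diagnostic under assumptions: the function name `chainer_cuda_status` is hypothetical, and the fallback to the `chainer.cuda` module path for older releases is an assumption that has not been verified against 1.24.0 here.

```python
import importlib


def chainer_cuda_status():
    """Return Chainer's version and CUDA/cuDNN flags,
    or None if Chainer cannot be imported in this interpreter."""
    try:
        chainer = importlib.import_module("chainer")
    except ImportError:
        return None
    try:
        # Module path quoted in the comment above.
        cuda = importlib.import_module("chainer.backends.cuda")
    except ImportError:
        # Assumed module path for older releases.
        cuda = importlib.import_module("chainer.cuda")
    return {
        "chainer_version": chainer.__version__,
        "cuda_available": bool(cuda.available),
        "cudnn_enabled": bool(cuda.cudnn_enabled),
    }


if __name__ == "__main__":
    print(chainer_cuda_status())
```

If the reported version differs from the one you believe is installed, the example is probably running against a different environment than the one where the flags were checked.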

tscohen commented 5 years ago

Right now I don't remember the chainer version, but it was the latest one available around the time I released this repo. Can you try an older version of the repo, like March 2017 or July 2016? Make sure you're also using a compatible version of CUDA/cuDNN. (Sorry, I don't have a lot of time to dig deeper right now.)

shuida commented 5 years ago

ok, I will try it again. Thank you very much!

sometimescasey commented 5 years ago

FWIW, I ran into this error as well and it can be resolved with a pretty simple change. I assume at some point between 2016 and now the cupy CArray.shape() and .stride() return values changed from int to ptrdiff_t.

Failure: CompileException (/tmp/tmpezKDyD/kern.cu(14): error: a value of type "const ptrdiff_t *" cannot be used to initialize an entity of type "const int *"
/tmp/tmpezKDyD/kern.cu(15): error: a value of type "const ptrdiff_t *" cannot be used to initialize an entity of type "const int *"

The PR below contains the changes I had to make to run gconv_experiments using Python 3.6.7 and chainer 5.1.0. Hope it helps!
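To illustrate the kind of change described above, here is a hedged sketch that rewrites `const int*` declarations initialised from `shape()` or `strides()` into `const ptrdiff_t*` inside a kernel source string. The kernel fragment and the helper `use_ptrdiff_t` are hypothetical and only mirror the pattern in the compiler error; the PR's actual diff is not reproduced here.

```python
import re

# Hypothetical kernel fragment, NOT copied from GrouPy. It only mirrors the
# pattern the compiler complains about: an int pointer initialised from
# a CArray shape()/strides() call.
OLD_KERNEL_SNIPPET = """
const int* oshape = output.shape();
const int* ostrides = output.strides();
"""

# Matches `const int* name = array.shape()` or `... = array.strides()`.
_POINTER_DECL = re.compile(
    r"const\s+int\s*\*(\s*\w+\s*=\s*\w+\.(?:shape|strides)\(\))"
)


def use_ptrdiff_t(source):
    """Change the pointer type of matching declarations to ptrdiff_t,
    leaving every other `const int*` in the source untouched."""
    return _POINTER_DECL.sub(r"const ptrdiff_t*\1", source)


if __name__ == "__main__":
    print(use_ptrdiff_t(OLD_KERNEL_SNIPPET))
```

In practice the edit is small enough to make by hand in the kernel strings; the regex form is just a compact way to show which declarations are affected and which are not.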

Ning0Luo commented 5 years ago

Hi, I came across the same issue when I run $ nosetests -v. The error thrown is:

Failure: CompileException (/tmp/tmpwWc9HY/kern.cu(14): error: a value of type "const ptrdiff_t *" cannot be used to initialize an entity of type "const int *"
/tmp/tmpwWc9HY/kern.cu(15): error: a value of type "const ptrdiff_t *" cannot be used to initialize an entity of type "const int *"

I use chainer 5.1. Could you tell me how to address this failure, or does it not matter?

sometimescasey commented 5 years ago

@Ning0Luo I'm pretty sure I came across the same issue. I resolved it as follows: https://github.com/tscohen/GrouPy/pull/18/commits/8bf53a9a393ce1a01de9abfe2d53ef322762944b