NervanaSystems / neon

Intel® Nervana™ reference deep learning framework committed to best performance on all hardware
http://neon.nervanasys.com/docs/latest
Apache License 2.0

[With NVIDIA K40c] K dim must be multiple of 4 #437

Closed: moderato closed this issue 5 years ago

moderato commented 6 years ago

Hello! I get this error when I try to initialize a model with the input data:

Traceback (most recent call last):
  File "Neon.py", line 204, in <module>
    mlp.initialize(neon_train_set, neon_cost)
  File "/home/m092926/miniconda3/envs/neon/lib/python3.5/site-packages/nervananeon-2.6.0-py3.5.egg/neon/models/model.py", line 122, in initialize
    prev_input = self.layers.configure(prev_input)
  File "/home/m092926/miniconda3/envs/neon/lib/python3.5/site-packages/nervananeon-2.6.0-py3.5.egg/neon/layers/container.py", line 328, in configure
    in_obj = l.configure(in_obj)
  File "/home/m092926/miniconda3/envs/neon/lib/python3.5/site-packages/nervananeon-2.6.0-py3.5.egg/neon/layers/layer.py", line 965, in configure
    super(Convolution_bias, self).configure(in_obj)
  File "/home/m092926/miniconda3/envs/neon/lib/python3.5/site-packages/nervananeon-2.6.0-py3.5.egg/neon/layers/layer.py", line 863, in configure
    self.nglayer = self.be.conv_layer(self.be.default_dtype, **self.convparams)
  File "/home/m092926/miniconda3/envs/neon/lib/python3.5/site-packages/nervananeon-2.6.0-py3.5.egg/neon/backends/nervanagpu.py", line 1949, in conv_layer
    dil_d, dil_h, dil_w)
  File "/home/m092926/miniconda3/envs/neon/lib/python3.5/site-packages/nervananeon-2.6.0-py3.5.egg/neon/backends/layer_gpu.py", line 422, in __init__
    self.fprop_kernels = convolution.FpropCuda(*args)
  File "/home/m092926/miniconda3/envs/neon/lib/python3.5/site-packages/nervananeon-2.6.0-py3.5.egg/neon/backends/convolution.py", line 140, in __init__
    assert K % self.vec_size == 0, "K dim must be multiple of %d" % self.vec_size
AssertionError: K dim must be multiple of 4

I searched a bit and found the answer to a similar bug here: https://github.com/NervanaSystems/neon/issues/318, which says this is caused by some kernels that don't support pre-Maxwell GPUs.

My ResNet-32 code runs fine on the K40, while the error above comes from another program with a custom model. My guess is that the sizes of some of its intermediate layers are not multiples of 4, since I start with an input size of (3, 48, 48) and a batch size of 64 (correct me if I'm wrong). Can anybody tell me whether there is still any hope of fixing it? Thanks!
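For anyone hitting the same assertion, a minimal sketch of how one might look for the offending layer before building the model. It assumes that `K` in the traceback is the number of output feature maps (filters) of a convolution layer rather than a spatial size or the batch size; that reading is an assumption, not something confirmed in this thread. The helper names and the example filter counts are hypothetical, and the value 4 is taken from the assertion message. The code is plain Python and does not call neon.

```python
VEC_SIZE = 4  # taken from the assertion message: "K dim must be multiple of 4"


def round_up_to_multiple(k, multiple=VEC_SIZE):
    """Return the smallest multiple of `multiple` that is >= k."""
    return ((k + multiple - 1) // multiple) * multiple


def check_filter_counts(filter_counts, multiple=VEC_SIZE):
    """Return (layer_index, K, suggested_K) for every K that fails K % multiple == 0."""
    return [
        (i, k, round_up_to_multiple(k, multiple))
        for i, k in enumerate(filter_counts)
        if k % multiple != 0
    ]


# Hypothetical per-layer filter counts for a custom model (not from the issue).
filter_counts = [16, 30, 64, 10]

for idx, k, suggested in check_filter_counts(filter_counts):
    print("layer %d: K=%d is not a multiple of %d; next multiple is %d"
          % (idx, k, VEC_SIZE, suggested))
```

If the assumption holds, padding the flagged filter counts up to the suggested values would be one way to satisfy the check, at the cost of slightly changing the model.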

moderato commented 5 years ago

It looks like Neon is no longer active, so sadly I am closing this issue.