zhxfl / CUDA-CNN

CNN accelerated by CUDA. Tested on MNIST, finally reaching 99.76% accuracy.

cuMatrix host memory allocation failed #6

Open FelixZhang00 opened 8 years ago

FelixZhang00 commented 8 years ago

When I run ./CUDA-CNN 1, I get the error 'cuMatrix host memory allocation failed'. Can you fix this? I am running this on Mac OS X 10.11.

zhxfl commented 8 years ago

I think your computer's memory is limited. You can edit the config file to resize the network.
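For illustration, shrinking the network might look like the sketch below. The parameter names (batch size, KERNEL_AMOUNT) are taken from the CONFIG dump the program prints at startup, but the exact file syntax here is an assumption; check the MNIST config file shipped with the repository for the real format.

```
BATCH_SIZE = 16;
KERNEL_AMOUNT = 4;
```

Halving the batch size roughly halves the activation memory, and fewer kernels in conv1 shrinks every downstream layer.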

FelixZhang00 commented 8 years ago

I run this code on a MacBook Pro with 16 GB of memory; shouldn't that be enough? After editing the config file to resize the network, I get the same error. The full message is:

  1. MNIST
  2. CIFAR-10
  3. CHINESE
  4. CIFAR-100
  5. VOTE MNIST

Choose the dataSet to run: 1

**CONFIG******
Is Grandient Checking : 0
batch Size : 32
channels : 1
crop : 0
scale : 12.000000
rotation : 12.000000
distortion : 3.400000
imageShow : 0
HORIZONTAL : 0
Test_Epoch : 10000
White Noise : 0.000000

_data Layer_ NAME : data

_convcfm layer_ NAME : conv1 INPUT : data SUBINPUT : NULL KERNEL_SIZE : 5 KERNEL_AMOUNT : 8 PADDING : 0 WEIGHT_DECAY : 0.000001 initW : 0.010000 non_linearity : NL_LRELU

_pooling layer_ NAME : pooling1 INPUT : conv1 POOLINGTYPE : max SUBINPUT : NULL size : 2 skip : 2 non_linearity : NULL

_convcfm layer_ NAME : conv2 INPUT : pooling1 SUBINPUT : NULL KERNEL_SIZE : 5 KERNEL_AMOUNT : 16 PADDING : 0 WEIGHT_DECAY : 0.000001 initW : 0.100000 non_linearity : NL_RELU

_pooling layer_ NAME : pooling2 INPUT : conv2 POOLINGTYPE : max SUBINPUT : NULL size : 2 skip : 2 non_linearity : NULL

_Full Connect Layer_ NAME : fc1 INPUT : pooling2 SUBINPUT : NULL NUM_FULLCONNECT_NEURONS : 512 WEIGHT_DECAY : 0.000000 DROPOUT_RATE : 0.500000 initW : 0.010000 non_linearity : NL_RELU

_SoftMax Layer_ NAME : softmax1 INPUT : fc1 SUBINPUT : NULL NUM_CLASSES : 10 WEIGHT_DECAY : 0.000001 initW : 0.100000 non_linearity: NULL

cuMatrix:cuMatrix host memory allocation failed
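For context, the error text suggests a failed host-side allocation inside the cuMatrix constructor. Below is a minimal sketch of that kind of check; `alloc_host` is a hypothetical helper, and the real cuMatrix may well use `cudaMallocHost` (pinned memory) rather than plain `calloc`.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper mirroring the check that prints the error above.
// The real cuMatrix may use cudaMallocHost (pinned memory) instead.
static float* alloc_host(size_t rows, size_t cols) {
    float* p = static_cast<float*>(calloc(rows * cols, sizeof(float)));
    if (p == nullptr) {
        // This is the message reported in the issue.
        fprintf(stderr, "cuMatrix:cuMatrix host memory allocation failed\n");
    }
    return p;
}
```

If `calloc` (or `cudaMallocHost`) returns null for a modestly sized matrix on a 16 GB machine, the request size itself is suspect, e.g. a corrupted rows/cols value rather than real memory exhaustion.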


zhxfl commented 8 years ago

What is your GPU model? You should check that your CUDA driver is installed successfully.

FelixZhang00 commented 8 years ago

I think the environment is correct, and I can run the CUDA device-query code to get the GPU info:

--- General Information for device 0 ---
Name: GeForce GT 750M
Compute capability: 3.0
Clock rate: 925500
Device copy overlap: Enabled
Kernel execution timeout : Enabled
--- Memory Information for device 0 ---
Total global mem: 2147024896
Total constant Mem: 65536
Max mem pitch: 2147483647
Texture Alignment: 512
--- MP Information for device 0 ---
Multiprocessor count: 2
Shared mem per mp: 49152
Registers per mp: 65536
Threads in warp: 32
Max threads per block: 1024
Max thread dimensions: (1024, 1024, 64)
Max grid dimensions: (2147483647, 65535, 65535)


zhxfl commented 8 years ago

Your compute capability is 3.0, so you should configure your IDE to target GPU capability 3.0, or edit the CMakeLists.txt.
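With nvcc, the target architecture is selected via `-gencode`. In a CMake build using the old FindCUDA module, that might look like the sketch below; the `CUDA_NVCC_FLAGS` variable is the FindCUDA convention, so check how this project's CMakeLists.txt actually passes its flags.

```cmake
# Sketch: target compute capability 3.0 (e.g. GeForce GT 750M).
# Assumes the legacy FindCUDA module; adapt to the project's CMakeLists.txt.
list(APPEND CUDA_NVCC_FLAGS "-gencode" "arch=compute_30,code=sm_30")
```

A mismatch here (e.g. a binary built only for a newer architecture) typically makes kernel launches fail at runtime even though the program starts normally.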

zhxfl commented 8 years ago

Have you run "sh get_mnist.sh" to get the data?

FelixZhang00 commented 8 years ago

Yes, I have the data in the correct path.

zhxfl commented 8 years ago

Can you debug the code and give me more information? I need the stack trace.

FelixZhang00 commented 8 years ago

I tried just reading MNIST into the cuMatrixVector trainX without compiling the layers, and that works. But when I compile and link the layers into the target, the error happens again. Why does this happen?

FelixZhang00 commented 8 years ago

I found the source of this error. In Pooling.cu, if I delete 3 functions, the code runs correctly. These functions are: g_pooling_feedforward_max, g_pooling_backpropagation_max_no_atomic, g_pooling_backpropagation_avr_no_atomic.

zhxfl commented 8 years ago

I have removed the call "cudaFuncSetCacheConfig(g_pooling_feedforward_max, cudaFuncCachePreferL1);". You can try it again.

FelixZhang00 commented 8 years ago

By the way, when I use your config to train MNIST, it's too slow: epoch=1 takes 194.826 s. Is that normal?

zhxfl commented 8 years ago

Yes, it is normal. You can edit the config file to improve the performance.

chengquan commented 6 years ago

After I choose 1, it ends with: "total malloc cpu memory 0.00000MB, total malloc Gpu memory 0.00000MB, cuMatrix Vector operator[] error". Could the author give some guidance? I can't find the cause.