CIFAR10 Segmentation fault

hokchhaytann commented 8 years ago

I am getting a segmentation fault in the second epoch:

*1 Tesla K40c 11519MB Compute capability: 3.5
Sparse CNN - dimension=2 nInputFeatures=3 nClasses=10
0:Convolution 2^2x3->12
1:Learn 12->32 dropout=0 PReLU
2:Pseudorandom overlapping Fractional Max Pooling 1.25989 2
3:Convolution 2^2x32->128
4:Learn 128->64 dropout=0 PReLU
5:Pseudorandom overlapping Fractional Max Pooling 1.25989 2
6:Convolution 2^2x64->256
7:Learn 256->96 dropout=0 PReLU
8:Pseudorandom overlapping Fractional Max Pooling 1.25989 2
9:Convolution 2^2x96->384
10:Learn 384->128 dropout=0 PReLU
11:Pseudorandom overlapping Fractional Max Pooling 1.25989 2
12:Convolution 2^2x128->512
13:Learn 512->160 dropout=0 PReLU
14:Pseudorandom overlapping Fractional Max Pooling 1.25989 2
15:Convolution 2^2x160->640
16:Learn 640->192 dropout=0 PReLU
17:Pseudorandom overlapping Fractional Max Pooling 1.25989 2
18:Convolution 2^2x192->768
19:Learn 768->224 dropout=0 PReLU
20:Pseudorandom overlapping Fractional Max Pooling 1.25989 2
21:Convolution 2^2x224->896
22:Learn 896->256 dropout=0 PReLU
23:Pseudorandom overlapping Fractional Max Pooling 1.25989 2
24:Convolution 2^2x256->1024
25:Learn 1024->288 dropout=0 PReLU
26:Pseudorandom overlapping Fractional Max Pooling 1.25989 2
27:Convolution 2^2x288->1152
28:Learn 1152->320 dropout=0 PReLU
29:Pseudorandom overlapping Fractional Max Pooling 1.25989 2
30:Convolution 2^2x320->1280
31:Learn 1280->352 dropout=0 PReLU
32:Pseudorandom overlapping Fractional Max Pooling 1.25989 2
33:Convolution 2^2x352->1408
34:Learn 1408->384 dropout=0 PReLU
35:Pseudorandom overlapping Fractional Max Pooling 1.5 2
36:Convolution 2^2x384->1536
37:Learn 1536->416 dropout=0 PReLU
38:Learn 416->448 dropout=0 PReLU
39:Learn 448->10 dropout=0 Softmax Classification
(2,3) (4,5) (6,8) (9,11) (12,15) (16,20) (21,26) (27,34) (35,44) (45,57) (58,73) (74,93)
Spatially sparse CNN: input size 94x94
epoch: 1
CIFAR-10 train set Mistakes:79.3582% NLL:2.11228 MegaMultiplyAdds/sample:820 time:93s GigaMultiplyAdds/s:437 rate:533/s
epoch: 2
Segmentation fault (core dumped)

btgraham commented 8 years ago

Hello. Please try pulling the latest version. If it still segfaults, please change "-O3" in the Makefile to "-O0 -g" (twice), run "make clean" and "make cifar10", then run it in gdb using "gdb cifar10 -ex run" and type "bt" to see where it is crashing. Best, Ben
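
The debugging steps above can be sketched as shell commands. This is only a rough outline, assuming the optimisation flag in the Makefile is spelled "-O3" (letter O) and that the commands are run from the repository root; the helper name `use_debug_flags` is made up for illustration and is not part of the repository.

```shell
# Hypothetical helper: swap the optimisation flag for a debug build.
# Assumes the Makefile spells the flag "-O3"; a backup is kept as <file>.bak.
use_debug_flags() {
    # $1: path to the Makefile (defaults to ./Makefile)
    sed -i.bak 's/-O3/-O0 -g/g' "${1:-Makefile}"
}

# Typical sequence (commented out here because it needs the repository and a GPU):
#   use_debug_flags Makefile
#   make clean
#   make cifar10
#   gdb cifar10 -ex run      # once it crashes, type "bt" at the (gdb) prompt
```

Building with "-O0 -g" turns off optimisation and includes debug symbols, so the backtrace printed by "bt" should point at readable source lines rather than optimised-away frames.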

hokchhaytann commented 8 years ago

Hi Prof. Graham, it's working now. I finished training for 410 epochs with no problems. Thank you.

btgraham / SparseConvNet-archived
