happynear / caffe-windows

Configure Caffe in one hour for Windows users.

Check failed: error == cudaSuccess (77 vs. 0) an illegal memory access was encountered #220

Open nerddd opened 7 years ago

nerddd commented 7 years ago

I am using the ms branch on Ubuntu 14.04 with CUDA 8.0 and cuDNN v5. Training a model fails with the following error on both a TITAN X and a GTX 1080:

I0712 11:19:24.122469 15529 solver.cpp:336] Iteration 0, Testing net (#0)
F0712 11:19:24.439040 15529 math_functions.cu:79] Check failed: error == cudaSuccess (77 vs. 0) an illegal memory access was encountered
Check failure stack trace:
    @ 0x7fba786e0daa (unknown)
    @ 0x7fba786e0ce4 (unknown)
    @ 0x7fba786e06e6 (unknown)
    @ 0x7fba786e3687 (unknown)
    @ 0x7fba78fef08a caffe::caffe_gpu_memcpy()
    @ 0x7fba78fd4159 caffe::SyncedMemory::mutable_gpu_data()
    @ 0x7fba78fb0f12 caffe::Blob<>::mutable_gpu_data()
    @ 0x7fba7901ee1c caffe::ScaleLayer<>::Forward_gpu()
    @ 0x7fba78fda101 caffe::Net<>::ForwardFromTo()
    @ 0x7fba78fda1f7 caffe::Net<>::Forward()
    @ 0x7fba78dd5d86 caffe::Solver<>::Test()
    @ 0x7fba78dd68ce caffe::Solver<>::TestAll()
    @ 0x7fba78dd93fc caffe::Solver<>::Step()
    @ 0x7fba78dd970a caffe::Solver<>::Solve()
    @ 0x40ca90 train()
    @ 0x4090dd main
    @ 0x7fba76f2cf45 (unknown)
    @ 0x409a4d (unknown)
    @ (nil) (unknown)

Other versions of Caffe work fine, so I can't tell where the problem is. Thanks!

happynear commented 7 years ago

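It looks like an out-of-bounds array access in the scale layer's forward pass, but I read the code and couldn't find anything that goes out of bounds. If nothing else works, try replacing the scale layer with the original version.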
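The scale-layer reading comes from the stack trace: the first layer-level frame, walking outward from the failing check, is `caffe::ScaleLayer<>::Forward_gpu()`. Below is a minimal Python sketch of that reading. The helper names `named_frames` and `first_layer_frame` are hypothetical, and the `TRACE` string is an abbreviated copy of the trace posted above. Note that CUDA reports kernel errors asynchronously, so the frame where the check fails shows where the error surfaced, which is not necessarily where it was caused.

```python
import re

# Abbreviated copy of the stack trace from the issue (innermost frame first).
TRACE = (
    "Check failure stack trace: "
    "@ 0x7fba786e0daa (unknown) "
    "@ 0x7fba78fef08a caffe::caffe_gpu_memcpy() "
    "@ 0x7fba78fd4159 caffe::SyncedMemory::mutable_gpu_data() "
    "@ 0x7fba78fb0f12 caffe::Blob<>::mutable_gpu_data() "
    "@ 0x7fba7901ee1c caffe::ScaleLayer<>::Forward_gpu() "
    "@ 0x7fba78fda101 caffe::Net<>::ForwardFromTo() "
    "@ 0x40ca90 train() "
    "@ 0x4090dd main "
    "@ (nil) (unknown)"
)

# One glog frame: "@ <address> <symbol>"; the address may be "(nil)".
# The symbol runs until the next "@" marker or the end of the text.
FRAME_RE = re.compile(
    r"@\s+(0x[0-9a-fA-F]+|\(nil\))\s+(.*?)(?=\s+@\s|$)", re.S
)


def named_frames(trace):
    """Return the symbolized frames, innermost first, skipping '(unknown)'."""
    symbols = [m.group(2).strip() for m in FRAME_RE.finditer(trace)]
    return [s for s in symbols if s and s != "(unknown)"]


def first_layer_frame(trace):
    """Return the innermost frame whose symbol names a Caffe layer, or None."""
    for symbol in named_frames(trace):
        # Assumption: Caffe layer classes appear as "...Layer<...>::method()".
        if "Layer<" in symbol:
            return symbol
    return None


if __name__ == "__main__":
    print(first_layer_frame(TRACE))
```

Running the script on the trace above prints the `ScaleLayer` frame, which matches the guess in this comment; it only narrows down where to look and does not identify the faulty access itself.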

nerddd commented 7 years ago

@happynear I swapped it in, and it still doesn't work...

nerddd commented 7 years ago

Could the problem be in math_functions.cu? But training other models with the ms branch of Caffe works fine. How do I track down the cause?

happynear commented 7 years ago

...I'm baffled. I don't know either.

moyans commented 7 years ago

...This is unsettling; I ran into it too. The error appears when I initialize from a model, and everything works fine when I don't.

BensonBlack commented 6 years ago

I'm hitting the same thing as you, @moyans.

sleepwalker2017 commented 6 years ago

OP, how did this turn out? I have the same problem, and the error also comes from math_functions.cu.

lawrencewxj commented 6 years ago

I ran into this too. OP, did you solve it?

xiakj commented 5 years ago

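Me too. I'm training mobilenet-v2. Following the tutorial, I added the DepthwiseConvolution layer and recompiled Caffe, and this error appears during training. (The training data is in HDF format.) How can I fix it?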