NVIDIA / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

When I run a net on a GPU other than 0, a CuDNNConvolutionLayer allocation error occurs #526

HaoLiuHust closed this issue 5 years ago

HaoLiuHust commented 6 years ago

I want to use nvcaffe in gRPC, but when I call net.Forward, the program fails in caffe::GPUMemory::Workspace::safe_reserve. The call stack is:
caffe::GPUMemory::Workspace::safe_reserve(unsigned long, int) 0x00007fffe3814a5e
caffe::CuDNNConvolutionLayer<float, float>::AllocateWorkspace(unsigned long) 0x00007fffe348a277
caffe::CuDNNConvolutionLayer<float, float>::Reshape(std::vector<caffe::Blob, std::allocator<caffe::Blob> > const&, std::vector<caffe::Blob, std::allocator<caffe::Blob> > const&) 0x00007fffe348fde7
caffe::Layer<float, float>::Forward(std::vector<caffe::Blob, std::allocator<caffe::Blob> > const&, std::vector<caffe::Blob, std::allocator<caffe::Blob> > const&) 0x00007fffe3388dda
caffe::Net::ForwardFromTo(int, int) 0x00007fffe378428b
caffe::Net::Forward(float*) 0x00007fffe3784426

HaoLiuHust commented 6 years ago

Can anyone help?

drnikolaev commented 6 years ago

@HaoLiuHust do you have a display connected?

HaoLiuHust commented 6 years ago

@drnikolaev I connect to the server over VNC. If I run the model on GPU 0, it works fine. The error only occurs when I run it on other GPUs.

drnikolaev commented 6 years ago

@HaoLiuHust could you please check this release candidate: https://github.com/drnikolaev/caffe/tree/caffe-0.17

HaoLiuHust commented 6 years ago

@drnikolaev Thank you very much, I will try it

drnikolaev commented 6 years ago

Please verify https://github.com/NVIDIA/caffe/tree/v0.17.1
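
Note on the symptom reported above (works on GPU 0, fails on other GPUs when net.Forward is called from gRPC): the thread does not confirm a root cause. One possibility worth ruling out is that the CUDA runtime tracks the current device per host thread, and a new thread starts on device 0. If the device was selected on one thread and Forward runs on a different one (for example a server handler thread), the worker may still be on device 0. The sketch below only simulates that behavior in plain C++; `sim_set_device`, `sim_get_device`, and `device_seen_by_worker` are hypothetical stand-ins, not Caffe or CUDA functions.

```cpp
#include <cassert>
#include <thread>

// Simulation only: stands in for the CUDA runtime's per-host-thread
// "current device". Each new thread starts at 0, like CUDA's default.
thread_local int g_current_device = 0;

// Hypothetical stand-ins for cudaSetDevice / cudaGetDevice.
void sim_set_device(int id) {
  assert(id >= 0);  // device ordinals are non-negative
  g_current_device = id;
}

int sim_get_device() { return g_current_device; }

// Spawns a fresh thread, the way a server might dispatch a request,
// and reports which device that thread sees.
// If select_in_worker is true, the worker selects `wanted` itself first.
int device_seen_by_worker(bool select_in_worker, int wanted) {
  int seen = -1;
  std::thread worker([&] {
    if (select_in_worker) {
      sim_set_device(wanted);
    }
    seen = sim_get_device();
  });
  worker.join();
  return seen;
}
```

In this simulation, calling `sim_set_device(1)` on the main thread does not change what a worker thread sees; the worker only sees device 1 if it selects it itself. In real Caffe code the corresponding calls would be `Caffe::SetDevice(id)` and `Caffe::set_mode(Caffe::GPU)` at the start of the thread that calls Forward. Whether this applies to the error in this issue is an assumption, not something the thread establishes.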