could not use more than one gpu

feichtenhofer / twostreamfusion

Code release for "Convolutional Two-Stream Network Fusion for Video Action Recognition", CVPR 2016.

http://www.robots.ox.ac.uk/~vgg/software/two_stream_action/


could not use more than one gpu #22

Open AndyTang15 opened 8 years ago

AndyTang15 commented 8 years ago

Hi, when I use one GPU, the code works well. But when I change opts.train.gpus = 1; (in cnn_ucf101_temporal.m) to opts.train.gpus = [1,2]; it fails with:

Error using cnn_train_dag (line 120)
Error detected on worker 1.

Error in cnn_ucf101_temporal (line 231)
[info] = cnn_train_dag(net, imdb, fn, opts.train) ;

Caused by:
Error using cnn_train_dag>map_gradients (line 503)
Invalid file identifier. Use fopen to generate a valid file identifier.

I've tested opts.train.gpus = [1,2] on the MNIST example that ships with MatConvNet, and it works well there. Could anyone help me?

JasmineeYang commented 7 years ago

Hello, did you get the code to run successfully? When I run it, the following error appears: Error using vl_nnconv Out of memory on device. To view more detail about available memory on the GPU, use 'gpuDevice()'. If the problem persists, reset the GPU by calling 'gpuDevice(1)'. Have you ever run into this? I would like to know where I can change cudnnWorkspaceLimit.

JasmineeYang commented 7 years ago

@AndyTang15

icyzhang0923 commented 7 years ago

You need to recompile MatConvNet for use with more than one GPU. Give that a try.

ml930310 commented 7 years ago

@JasmineeYang I have the same problem. Have you solved it? Thanks.

ml930310 commented 7 years ago

@JasmineeYang

soumitrasamanta commented 6 years ago

@ml930310 I had the same problem when running the individual networks (spatial or temporal) on multiple GPUs. I found that it is caused by the folder referenced in "opts.train.memoryMapFile = fullfile(tempdir, 'ramdisk', ['matconvnet' num2str(feature('getpid')) '.bin']) ;" not existing. This line appears in both cnn_ucf101_spatial.m and cnn_ucf101_temporal.m. I solved it by creating the folder referenced in "opts.train.memoryMapFile".
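For anyone landing here later, the fix described above might look roughly like the sketch below. It assumes the snippet is placed in cnn_ucf101_spatial.m or cnn_ucf101_temporal.m after opts.train.memoryMapFile is set and before cnn_train_dag is called; the variable name mapDir is made up for illustration. The "Invalid file identifier" message is consistent with fopen failing because the parent directory of the memory-map file is missing, though that is an inference from the error text rather than something confirmed in this thread.

```matlab
% Path taken verbatim from the training scripts quoted above.
opts.train.memoryMapFile = fullfile(tempdir, 'ramdisk', ...
    ['matconvnet' num2str(feature('getpid')) '.bin']) ;

% mapDir is a hypothetical helper variable: the folder that must exist
% before the workers can open the memory-map file.
mapDir = fileparts(opts.train.memoryMapFile) ;

% Create the folder only if it is not already there.
if ~exist(mapDir, 'dir')
    mkdir(mapDir) ;
end
```

With a single GPU the memory-map file does not seem to be needed, which would explain why the problem only shows up with opts.train.gpus = [1,2].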

ml930310 commented 6 years ago

Sorry, I haven't worked on this for a long time and I have forgotten the details.
