vlfeat / matconvnet

MatConvNet: CNNs for MATLAB

Multi-GPUs #101

Closed kjw0612 closed 9 years ago

kjw0612 commented 9 years ago

Hi, all. Is it straightforward to modify MatConvNet to fully utilize multiple GPUs? I want to do something similar to the following paper.

K. Simonyan and A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. http://arxiv.org/pdf/1409.1556

lenck commented 9 years ago

Hi, unfortunately it is not that easy a task if you want decent performance, but we are working on it :) (to be precise, Andrea is working on it ;))

vedaldi commented 9 years ago

Not that bad actually, although perhaps a little suboptimal due to how MATLAB handles multiple GPUs. I can push a developer branch with the multi-GPU training example probably later today.
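For context, MATLAB exposes multiple GPUs through the Parallel Computing Toolbox, with one worker per device, which is where the extra overhead comes from. The basic idiom looks roughly like this (a sketch, not the branch's actual code):

pool = parpool('local', gpuDeviceCount()) ;  % one worker per GPU
spmd
  gpuDevice(labindex) ;  % bind worker number labindex to GPU number labindex
end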


kjw0612 commented 9 years ago

That is awesome!

AnanS commented 9 years ago

@vedaldi

Sounds great, looking forward to it!

Edit: Works great, thanks!

vedaldi commented 9 years ago

The branch is available now, by the way; it is called mgpu.

The only differences are a new training example, cnn_imagenet_train_mgpu.m, and a corresponding cnn_train_mgpu. There is not much documentation so far, unfortunately. We would be interested to hear whether this works for people.
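A minimal sketch of how one might call it (the option names below are assumptions; check cnn_imagenet_train_mgpu.m on the mgpu branch for the actual interface):

% Hypothetical invocation; see the mgpu branch for the real option names.
[net, info] = cnn_imagenet_train_mgpu(...
  'expDir', 'data/imagenet12-mgpu', ...
  'gpus', [1 2]) ;  % indices of the GPUs to train on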


JieSong89 commented 8 years ago

@vedaldi Is the multiple-GPU version still available? I could not find it any more.

Thanks,

vedaldi commented 8 years ago

Hi, now the master version of MatConvNet supports multiple GPUs.
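For later readers: in master the devices are selected through the gpus training option, roughly as follows (a sketch; check cnn_train.m and cnn_train_dag.m for the exact interface):

% Train on GPU devices 1 and 2; an empty list falls back to the CPU.
[net, info] = cnn_train(net, imdb, getBatch, 'gpus', [1 2]) ;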


JieSong89 commented 8 years ago

@vedaldi Thanks a lot!

JieSong89 commented 8 years ago

@vedaldi Sorry to disturb you again. I am looking around for some DAG-building examples (e.g. GoogLeNet) and it seems there is no official guidance for building a customized DAG? Or is it intuitive enough to build one directly with DagNN? Thanks!

vedaldi commented 8 years ago

Hi,

while the interface is perhaps not as succinct as we would like (we are looking into this), it is really not too difficult. I am sharing an example of how to build GoogLeNet that I am experimenting with (note that the training performance is still below what it should be, as the parameters need to be optimised).

The example is pretty complex (it constructs Inception-v3) and uses a few tricks to make the code more succinct (see the "stack" variable: each helper pops its input from the stack and pushes its output, dup() forks the current branch, swap() switches between open branches, and Concat() merges them back), but it should give you the general idea.

Andrea

function net = cnn_imagenet_init_inception(varargin)

opts.scale = 1 ;
opts.initBias = 0.1 ;
opts.weightDecay = 1 ;
opts.cudnnWorkspaceLimit = 1024*1024*1024 ; % 1GB
opts = vl_argparse(opts, varargin) ;

net = dagnn.DagNN() ;

net.meta.inputSize = [299 299 3 1] ;
net.meta.normalization.imageSize = net.meta.inputSize ;

stack = {} ;

  function dup()
    % Duplicate the top of the stack: fork the current branch.
    stack{end+1} = stack{end} ;
  end

  function swap()
    % Swap the two topmost entries: switch between the open branches.
    stack([end-1 end]) = stack([end end-1]) ;
  end

  function Conv(name, ksize, out, varargin)
    % Pop the input layer off the stack, append Conv + BatchNorm + ReLU,
    % and push the new output layer back onto the stack.
    copts.stride = [1 1] ;
    copts.pad = (ksize-1)/2 ;
    copts = vl_argparse(copts, varargin) ;
    if isempty(stack)
      inputVar = 'input' ;
      in = 3 ;
    else
      prev = stack{end} ;
      stack(end) = [] ;
      i = net.getLayerIndex(prev) ;
      inputVar = net.layers(i).outputs{1} ;
      sizes = net.getVarSizes({'input', net.meta.inputSize}) ;
      j = net.getVarIndex(inputVar) ;
      in = sizes{j}(3) ;
    end
    if numel(ksize) == 1, ksize = [ksize ksize] ; end
    net.addLayer(name , ...
      dagnn.Conv('size', [ksize in out], ...
      'stride', copts.stride, ...
      'pad', copts.pad, ...
      'opts', {'cudnnworkspacelimit', opts.cudnnWorkspaceLimit}), ...
      inputVar, ...
      [name '_conv'], ...
      {[name '_f'], [name '_b']}) ;
    net.addLayer([name '_bn'], ...
      dagnn.BatchNorm('numChannels', out), ...
      [name '_conv'], ...
      [name '_bn'], ...
      {[name '_bn_w'], [name '_bn_b'], [name '_bn_m']}) ;
    net.addLayer([name '_relu'] , ...
      dagnn.ReLU(), ...
      [name '_bn'], ...
      name) ;
    stack{end+1} = [name '_relu'] ;
  end

  function Pool(name, ksize, varargin)
    % Pop the input layer off the stack, append a pooling layer,
    % and push its output back onto the stack.
    copts.stride = [1 1] ;
    copts.pad = (ksize-1)/2 ;
    copts.method = 'max' ;
    copts = vl_argparse(copts, varargin) ;

    prev = stack{end} ;
    stack(end) = [] ;
    i = net.getLayerIndex(prev) ;
    inputVar = net.layers(i).outputs{1} ;

    if numel(ksize) == 1, ksize = [ksize ksize] ; end
    net.addLayer(name , ...
      dagnn.Pooling('poolSize', ksize, ...
      'method', copts.method, ...
      'stride', copts.stride, ...
      'pad', copts.pad), ...
      inputVar, ...
      [name '_pool']) ;
    stack{end+1} = name ;
  end

  function Concat(name, num)
    % Pop the last num branches off the stack, concatenate them along
    % the channel dimension, and push the merged output.
    inputVars = {} ;
    for layer = stack(end-num+1:end)
      prev = char(layer) ;
      i = net.getLayerIndex(prev) ;
      inputVars{end+1} = net.layers(i).outputs{1} ;
    end
    stack(end-num+1:end) = [] ;
    net.addLayer(name , ...
      dagnn.Concat(), ...
      inputVars, ...
      name) ;
    stack{end+1} = name ;
  end

  function Pred(name, out, varargin)
    % Pop the input off the stack and attach dropout, a 1x1 convolutional
    % classifier, the softmax-log loss, and the top-1/top-5 error metrics.
    prev = stack{end} ;
    stack(end) = [] ;
    i = net.getLayerIndex(prev) ;
    inputVar = net.layers(i).outputs{1} ;
    sizes = net.getVarSizes({'input', net.meta.inputSize}) ;
    j = net.getVarIndex(inputVar) ;
    in = sizes{j}(3) ;

    net.addLayer([name '_dropout'] , ...
      dagnn.DropOut('rate', 0.2), ...
      inputVar, ...
      [name '_dropout']) ;

    net.addLayer(name, ...
      dagnn.Conv('size', [1 1 in out]), ...
      [name '_dropout'], ...
      name, ...
      {[name '_f'], [name '_b']}) ;

    net.addLayer([name '_loss'], ...
      dagnn.Loss('loss', 'softmaxlog'), ...
      {name, 'label'}, ...
      [name '_loss']) ;

    net.addLayer([name '_top1error'], ...
      dagnn.Loss('loss', 'classerror'), ...
      {name, 'label'}, ...
      [name '_top1error']) ;

    net.addLayer([name '_top5error'], ...
      dagnn.Loss('loss', 'topkerror', 'opts', {'topK', 5}), ...
      {name, 'label'}, ...
      [name '_top5error']) ;
  end

% Pre-inception
Conv('conv', 3, 32, 'stride', 2, 'pad', 0) ;
Conv('conv1', 3, 32, 'pad', 0) ;
Conv('conv2', 3, 64) ;
Pool('pool', 3, 'stride', 2, 'pad', 0) ;
Conv('conv3', 1, 80, 'pad', 0) ;
Conv('conv4', 3, 192, 'pad', 0) ;
Pool('pool1', 3, 'stride', 2, 'pad', 0) ;

% Inception fig. 5 x 3
for t = 1:3
  pfx = sprintf('inception5_%d', t) ;
  dup() ;
  Conv([pfx '_a1'], 1, 64) ;
  swap() ; dup() ;
  Conv([pfx '_b1'], 1, 48) ;
  Conv([pfx '_b2'], 5, 64) ;
  swap() ; dup() ;
  Conv([pfx '_c1'], 1, 64) ;
  Conv([pfx '_c2'], 3, 96) ;
  Conv([pfx '_c3'], 3, 96) ;
  swap() ;
  Pool([pfx '_d1'], 3, 'method', 'avg') ;
  Conv([pfx '_d2'], 1, 64) ;
  Concat(pfx, 4) ;
end

% Inception fig. 5 down
pfx = 'inception5_4' ;
dup() ;
Conv([pfx '_a1'], 3, 384, 'stride', 2, 'pad', 0) ;
swap() ; dup() ;
Conv([pfx '_b1'], 1, 64) ;
Conv([pfx '_b2'], 3, 96) ;
Conv([pfx '_b3'], 3, 96, 'stride', 2, 'pad', 0) ;
swap() ;
Pool([pfx '_c1'], 3, 'method', 'max', 'stride', 2, 'pad', 0) ;
Concat(pfx, 3) ;

% Inception fig. 6 x 4
for t = 1:4
  pfx = sprintf('inception6_%d', t) ;
  dup() ;
  Conv([pfx '_a1'], 1, 192) ;
  swap() ; dup() ;
  Conv([pfx '_b1'], 1, 160) ;
  Conv([pfx '_b2'], [1 7], 160) ;
  Conv([pfx '_b3'], [7 1], 192) ;
  swap() ; dup() ;
  Conv([pfx '_c1'], 1, 160) ;
  Conv([pfx '_c2'], [7 1], 160) ;
  Conv([pfx '_c3'], [1 7], 160) ;
  Conv([pfx '_c4'], [7 1], 160) ;
  Conv([pfx '_c5'], [1 7], 192) ;
  swap() ;
  Pool([pfx '_d1'], 3, 'method', 'avg') ;
  Conv([pfx '_d2'], 1, 192) ;
  Concat(pfx, 4) ;
end

% Inception fig. 6 down
pfx = 'inception6_5' ;
dup() ;
Conv([pfx '_a1'], 1, 192) ;
Conv([pfx '_a2'], 3, 320, 'stride', 2, 'pad', 0) ;
swap() ; dup() ;
Conv([pfx '_b1'], 1, 192) ;
Conv([pfx '_b2'], [1 7], 192) ;
Conv([pfx '_b3'], [7 1], 192) ;
Conv([pfx '_b4'], 3, 192, 'stride', 2, 'pad', 0) ;
swap() ;
Pool([pfx '_c1'], 3, 'method', 'max', 'stride', 2, 'pad', 0) ;
Concat(pfx, 3) ;

% Inception fig. 7 x 2
for t = 1:2
  pfx = sprintf('inception7_%d',t) ;
  dup() ;
  Conv([pfx '_a1'], 1, 320) ;
  swap() ; dup() ;
  Conv([pfx '_b1'], 1, 384) ;
  Conv([pfx '_b2'], [1 3], 384) ;
  Conv([pfx '_b3'], [3 1], 384) ;
  swap() ; dup() ;
  Conv([pfx '_c1'], 1, 448) ;
  Conv([pfx '_c2'], 3, 384) ;
  Conv([pfx '_c3'], [1 3], 384) ;
  Conv([pfx '_c4'], [3 1], 384) ;
  swap() ;
  Pool([pfx '_d1'], 3, 'method', 'avg') ;
  Conv([pfx '_d2'], 1, 192) ;
  Concat(pfx, 4) ;
end

% Average pooling and loss
Pool('pool_2', 8, 'method', 'avg', 'pad', 0) ;
Pred('prediction', 1000) ;

% Meta parameters
net.meta.inputSize = net.meta.normalization.imageSize ;
net.meta.normalization.border = 310 - net.meta.normalization.imageSize(1:2) ;
net.meta.normalization.interpolation = 'bicubic' ;
net.meta.normalization.averageImage = [] ;
net.meta.normalization.keepAspect = true ;
net.meta.augmentation.rgbVariance = zeros(0,3) ;
net.meta.augmentation.transformation = 'stretch' ;

% lr = logspace(-1, -4, 20) ;  % alternative decaying schedule (unused)
lr = 0.045 * ones(1,100) ;
net.meta.trainOpts.learningRate = lr ;
net.meta.trainOpts.numEpochs = numel(lr) ;
net.meta.trainOpts.batchSize = 32 * 2 ;
net.meta.trainOpts.weightDecay = 0.0005 ;

% Init parameters randomly
net.initParams() ;
end
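To sanity-check the resulting network, something along these lines should work (a sketch; the label input is needed because the loss layers are part of the graph):

net = cnn_imagenet_init_inception() ;
net.mode = 'test' ;
% Keep the prediction variable around after eval (it is cleared otherwise).
net.vars(net.getVarIndex('prediction')).precious = true ;
im = randn(299, 299, 3, 1, 'single') ;  % dummy input image
net.eval({'input', im, 'label', 1}) ;
scores = net.vars(net.getVarIndex('prediction')).value ;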


JieSong89 commented 8 years ago

@vedaldi Big thanks, I will dig into it.

angy50 commented 7 years ago

Is mgpu also able to support multiple GPUs on distributed systems connected over LAN/Wi-Fi?