torch / torch7

http://torch.ch

How are parameters passed and distributed on the modules #1195

Open dioZ17 opened 5 years ago

dioZ17 commented 5 years ago

This is more a question for better understanding of Torch than an issue report, but I don't know where else to ask.

I am interested in understanding how Torch passes the parameters of a neural network to each module, particularly in a CNN, where the parameters make up the filters/kernels.

```lua
bottom, top = split_model(nChannel)
bottom = bottom:cuda()
top = top:cuda()
```

The above code creates the model with 9 Sequential layers of SpatialConvolution -> SpatialBatchNormalization -> ReLU

```lua
local bottom_param, bottom_grad_param = bottom:getParameters()
```

I'm guessing the above ties bottom_param and bottom_grad_param to the parameters (and parameter gradients) of the model.
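For what it's worth, getParameters() in Torch flattens every module's parameters into one contiguous 1-D tensor (and likewise for the gradients), module by module in the order they were added, with each module contributing its weight first and then its bias. A minimal numpy sketch of that flattening order, with shapes mirroring the first ConvBNReLU block and assuming nChannel = 1:

```python
import numpy as np

# Sketch (numpy, not Torch) of the flattening done by getParameters():
# concatenate each module's weight (row-major) then its bias, module by
# module. Shapes mirror the first ConvBNReLU block, assuming nChannel = 1.
conv_w = np.zeros((64, 1, 3, 3), dtype=np.float32)  # 64*1*3*3 = 576 values
conv_b = np.zeros(64, dtype=np.float32)
bn_w   = np.ones(64, dtype=np.float32)
bn_b   = np.zeros(64, dtype=np.float32)
# The reflection padding and ReLU modules have no parameters,
# so they contribute nothing to the flat vector.

flat = np.concatenate([p.ravel() for p in (conv_w, conv_b, bn_w, bn_b)])
print(flat.size)  # 576 + 64 + 64 + 64 = 768
```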

```lua
local params = torch.load(opt.model_param)
```

The above loads the .t7 file that already contains the pretrained parameters

```lua
bottom_param:copy(params)
```

The above line copies the pretrained parameters into bottom_param, passing the parameters to the model.

My question is: in what order are the parameters laid out? Do the first 9 values define the first 3x3 kernel?
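As a sketch of the layout: SpatialConvolution stores its weight as a contiguous (nOutputPlane, nInputPlane, kH, kW) tensor in row-major order, so within that module's slice of the flat vector the first nInputPlane*kH*kW values form the first output filter. Only when nChannel == 1 do the first 9 numbers define the first 3x3 kernel; with a 3-channel input, for example, the first filter spans 27 values. A numpy illustration (shapes assumed, nChannel = 3 here):

```python
import numpy as np

# Row-major layout of a SpatialConvolution weight tensor:
# shape (nOutputPlane, nInputPlane, kH, kW). With nInput = 3, one output
# filter spans nInput*kH*kW = 27 consecutive values, not 9.
nOutput, nInput, kH, kW = 64, 3, 3, 3
weight = np.random.randn(nOutput, nInput, kH, kW).astype(np.float32)

flat = weight.ravel()  # same row-major order the flat vector sees

# First output filter = first nInput*kH*kW consecutive entries.
first_filter = flat[:nInput * kH * kW].reshape(nInput, kH, kW)
assert np.array_equal(first_filter, weight[0])

# Within that filter, the first kH*kW = 9 values are the kernel
# applied to input channel 0.
assert np.array_equal(flat[:9], weight[0, 0].ravel())
```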

The split_model function is this:

```lua
function split_model(nChannel)
  local half_padding = 9
  local bottom = nn.Sequential()
  bottom:add(nn.SpatialReflectionPadding(half_padding, half_padding, half_padding, half_padding))

  local function ConvBNReLU(nInputPlane, nOutputPlane, kw, kh, pw, ph)
    pw = pw or 0
    ph = ph or 0
    bottom:add(nn.SpatialConvolution(nInputPlane, nOutputPlane, kw, kh, 1, 1, pw, ph))
    bottom:add(nn.SpatialBatchNormalization(nOutputPlane, 1e-3))
    bottom:add(nn.ReLU(true))
    return bottom
  end

  ConvBNReLU(nChannel, 64, 3, 3)
  ConvBNReLU(64, 64, 3, 3) --:add(nn.Dropout(0.4))
  ConvBNReLU(64, 64, 3, 3) --:add(nn.Dropout(0.4))
  ConvBNReLU(64, 64, 3, 3) --:add(nn.Dropout(0.4))
  ConvBNReLU(64, 64, 3, 3) --:add(nn.Dropout(0.4))
  ConvBNReLU(64, 64, 3, 3) --:add(nn.Dropout(0.5))
  ConvBNReLU(64, 64, 3, 3) --:add(nn.Dropout(0.4))
  ConvBNReLU(64, 64, 3, 3) --:add(nn.Dropout(0.4))

  bottom:add(nn.SpatialConvolution(64, 64, 3, 3))
  bottom:add(nn.SpatialBatchNormalization(64, 1e-3))

  local top = nn.Sequential()
  top:add(nn.CMulTable())
  top:add(nn.Sum(2))

  return bottom, top
end
```
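Putting the pieces together, here is a plain-Python sketch of where each module's slice lands in the flat vector for this particular network, assuming nChannel = 1 and affine batch normalization (weight and bias per BN module; the padding and ReLU modules contribute nothing):

```python
# Sketch (plain Python, assuming nChannel = 1) of the per-module offsets
# in the flat vector returned by bottom:getParameters(). Every conv in
# this network is 3x3 and is followed by a SpatialBatchNormalization.
nChannel = 1
# (in_planes, out_planes) for the 8 ConvBNReLU blocks + the final conv:
convs = [(nChannel, 64)] + [(64, 64)] * 8

offset = 0
for i, (n_in, n_out) in enumerate(convs):
    w = n_out * n_in * 3 * 3
    print(f"conv {i}: weight at [{offset}, {offset + w}), then bias")
    offset += w + n_out      # conv weight + conv bias
    offset += n_out + n_out  # batchnorm weight + batchnorm bias
print("total parameters:", offset)
```

So with nChannel = 1 the first 9 values in the flat vector are indeed the first 3x3 kernel of the first convolution; the conv's bias comes only after all 576 weight values of that layer.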

P.S. The code is not mine; it is the work of Mr. Wenjie Luo.