This is more of a question for better understanding of Torch than an issue, but I don't know where else to ask.
I am interested in understanding how Torch passes the parameters of a neural network to each module, particularly in a CNN where the parameters make up the filters/kernels.
bottom, top = split_model(nChannel)
bottom = bottom:cuda()
top = top:cuda()
The above code creates the model: a Sequential container with 9 convolution layers of the form SpatialConvolution -> SpatialBatchNormalization -> ReLU (the last layer omits the ReLU).
local bottom_param, bottom_grad_param = bottom:getParameters()
I'm guessing the above flattens the model's parameters into bottom_param and their gradients into bottom_grad_param.
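As far as I understand it (worth double-checking against the nn docs), getParameters() gathers every module's weight and bias into one contiguous storage and re-points the modules' tensors into that storage. A minimal sketch with a tiny stand-in model (not the real split_model):

```lua
require 'nn'

-- Tiny model: one conv with 2*1*3*3 = 18 weights and 2 biases.
local net = nn.Sequential()
net:add(nn.SpatialConvolution(1, 2, 3, 3))

-- Flatten all learnable parameters into one vector.
local flat, flatGrad = net:getParameters()
print(flat:nElement())  -- 20 (18 weights + 2 biases)

-- After the call, the conv's weight is a view into the flat vector's storage:
local conv = net:get(1)
print(torch.pointer(conv.weight:storage()) == torch.pointer(flat:storage()))
-- should print true
```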
local params = torch.load(opt.model_param)
The above loads the .t7 file that already contains the pretrained parameters
bottom_param:copy(params)
The above line copies the pretrained parameters into bottom_param, which passes them to the model.
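If the sharing described above holds, a single copy() into the flat vector updates every layer in place. A hypothetical sanity check (using the same variable names as above):

```lua
-- copy() writes element-for-element, so the sizes must match exactly.
assert(params:nElement() == bottom_param:nElement(), 'parameter count mismatch')
bottom_param:copy(params)
-- The modules' weight/bias tensors are views into bottom_param's storage,
-- so this one copy propagates the pretrained values to all layers.
```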
My question is: in what order are the parameters laid out in this flat vector? Do the first 9 values define the first 3x3 kernel?
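One way to check this empirically: my understanding (hedged, please verify) is that getParameters() concatenates, in module order, each module's weight followed by its bias. Here the first parameterized module is the first SpatialConvolution (module 2; module 1, the reflection padding, has no parameters), whose weight has size nOutputPlane x nInputPlane x kH x kW = 64 x nChannel x 3 x 3 and is stored row-major, so the first 9 entries should indeed be the first 3x3 kernel:

```lua
-- Inspect the model built by split_model (names from the code above).
local conv1 = bottom:get(2)
print(conv1.weight:size())            -- 64 x nChannel x 3 x 3

-- First 9 entries of the flat vector, viewed as a 3x3 kernel...
print(bottom_param[{{1, 9}}]:view(3, 3))
-- ...should match the kernel connecting input plane 1 to output plane 1:
print(conv1.weight[1][1])
```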
The split_model function is this:
function split_model(nChannel)
local half_padding = 9
local bottom = nn.Sequential()
bottom:add(nn.SpatialReflectionPadding(half_padding, half_padding, half_padding, half_padding))
local function ConvBNReLU(nInputPlane, nOutputPlane, kw, kh, pw, ph)
  pw = pw or 0
  ph = ph or 0
  bottom:add(nn.SpatialConvolution(nInputPlane, nOutputPlane, kw, kh, 1, 1, pw, ph))
  bottom:add(nn.SpatialBatchNormalization(nOutputPlane, 1e-3))
  bottom:add(nn.ReLU(true))
  return bottom
end
ConvBNReLU(nChannel, 64, 3, 3)
ConvBNReLU(64, 64, 3, 3) --:add(nn.Dropout(0.4))
ConvBNReLU(64, 64, 3, 3) --:add(nn.Dropout(0.4))
ConvBNReLU(64, 64, 3, 3) --:add(nn.Dropout(0.4))
ConvBNReLU(64, 64, 3, 3) --:add(nn.Dropout(0.4))
ConvBNReLU(64, 64, 3, 3) --:add(nn.Dropout(0.5))
ConvBNReLU(64, 64, 3, 3) --:add(nn.Dropout(0.4))
ConvBNReLU(64, 64, 3, 3) --:add(nn.Dropout(0.4))
bottom:add(nn.SpatialConvolution(64, 64, 3, 3))
bottom:add(nn.SpatialBatchNormalization(64, 1e-3))
local top = nn.Sequential()
top:add(nn.CMulTable())
top:add(nn.Sum(2))
return bottom, top
end
P.S. The code is not mine; it is the work of Wenjie Luo.