kingoflolz closed 8 years ago
Sooo... I actually tried using the CUDA version of TemporalConvolution, and it is slower than wrapping SpatialConvolution. So you can create a TemporalConvolution module like this:
require 'torch'
require 'nn'

-- Temporal (1D) convolution implemented by wrapping nn.SpatialConvolution.
local TemporalConvolution2, parent = torch.class('nn.TemporalConvolution2', 'nn.Module')

function TemporalConvolution2:__init(inputFrameSize, outputFrameSize, kW, dW, padW)
  parent.__init(self)
  self.inputFrameSize = inputFrameSize
  self.outputFrameSize = outputFrameSize
  self.kW = kW
  self.dW = dW or 1
  self.padW = padW or 0
  -- Frames are mapped onto the spatial height axis; the width axis has size 1.
  self.sconv = nn.SpatialConvolution(inputFrameSize, outputFrameSize, 1, kW, 1, self.dW, 0, self.padW)
  -- Expose the wrapped module's parameters as this module's own.
  self.weight = self.sconv.weight
  self.bias = self.sconv.bias
  self.gradWeight = self.sconv.gradWeight
  self.gradBias = self.sconv.gradBias
end

function TemporalConvolution2:clearState()
  self.sconv:clearState()
  -- parent:clearState() would pass the class table as self; call it on this instance instead.
  return parent.clearState(self)
end

function TemporalConvolution2:updateOutput(input)
  assert(input:dim() == 3, 'must provide batched input')
  local batchSize = input:size(1)
  local numFrames = input:size(2)
  -- Standard convolution output length; also correct for dW > 1 and for even kW.
  local outFrames = math.floor((numFrames + 2 * self.padW - self.kW) / self.dW) + 1
  -- (batch, frames, features) -> (batch, features, frames, 1)
  input = input:view(batchSize, numFrames, self.inputFrameSize, 1):transpose(2,3)
  local output = self.sconv:updateOutput(input):transpose(2,3)
  self.output:resize(batchSize, outFrames, self.outputFrameSize):copy(output)
  return self.output
end

function TemporalConvolution2:updateGradInput(input, gradOutput)
  assert(input:dim() == 3, 'must provide batched input')
  local batchSize = input:size(1)
  local numFrames = input:size(2)
  local outFrames = math.floor((numFrames + 2 * self.padW - self.kW) / self.dW) + 1
  input = input:view(batchSize, numFrames, self.inputFrameSize, 1):transpose(2,3)
  gradOutput = gradOutput:view(batchSize, outFrames, self.outputFrameSize, 1):transpose(2,3)
  local gradInput = self.sconv:updateGradInput(input, gradOutput):transpose(2,3)
  self.gradInput:resize(batchSize, numFrames, self.inputFrameSize):copy(gradInput)
  return self.gradInput
end

function TemporalConvolution2:accGradParameters(input, gradOutput, scale)
  assert(input:dim() == 3, 'must provide batched input')
  local batchSize = input:size(1)
  local numFrames = input:size(2)
  local outFrames = math.floor((numFrames + 2 * self.padW - self.kW) / self.dW) + 1
  input = input:view(batchSize, numFrames, self.inputFrameSize, 1):transpose(2,3)
  gradOutput = gradOutput:view(batchSize, outFrames, self.outputFrameSize, 1):transpose(2,3)
  self.sconv:accGradParameters(input, gradOutput, scale)
end
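For reference, the number of output frames the wrapper has to produce follows the usual convolution length formula, floor((numFrames + 2*padW - kW) / dW) + 1, which for stride 1 reduces to numFrames + 2*padW - kW + 1. Below is a minimal sketch in plain Lua (no Torch needed); the helper name outFrames is hypothetical and not part of nn.

```lua
-- Hypothetical helper (not part of nn): number of output frames of a 1D
-- convolution over numFrames frames with kernel width kW, stride dW and
-- zero-padding padW on both ends.
local function outFrames(numFrames, kW, dW, padW)
  dW = dW or 1
  padW = padW or 0
  return math.floor((numFrames + 2 * padW - kW) / dW) + 1
end

print(outFrames(3, 3))        -- 1: the kernel fits exactly once
print(outFrames(10, 4))       -- 7: even kernel width, stride 1
print(outFrames(10, 3, 2))    -- 4: stride 2
print(outFrames(10, 3, 1, 1)) -- 10: padding 1 keeps the length for kW = 3
```

Checking a few small cases like these by hand is a cheap way to catch off-by-one errors before resizing the output tensor.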
Spatial pooling is trivial to use in temporal mode: simply set one pooling size to 1, like nn.SpatialMaxPooling(1, kernelSize).
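To make explicit what such a pooling layer computes along the frame axis, here is a plain-Lua reference sketch (no Torch needed). The helper temporalMaxPool is hypothetical and only illustrates the arithmetic. It assumes the frames sit on the height axis, as in the (batch, features, frames, 1) layout used by the wrapper above, and that the stride defaults to the window size, which is the default of nn.SpatialMaxPooling.

```lua
-- Hypothetical plain-Lua reference (not part of nn) for temporal max pooling.
-- frames[t][f] is feature f at time step t; the window slides along t only.
local function temporalMaxPool(frames, kernelSize, stride)
  stride = stride or kernelSize  -- assumed default: stride equals window size
  local numFrames = #frames
  local out = {}
  for start = 1, numFrames - kernelSize + 1, stride do
    local row = {}
    for f = 1, #frames[start] do
      -- maximum of feature f over the current window of frames
      local best = frames[start][f]
      for t = start + 1, start + kernelSize - 1 do
        best = math.max(best, frames[t][f])
      end
      row[f] = best
    end
    out[#out + 1] = row
  end
  return out
end

-- 4 frames with 2 features each, window of 2 frames -> 2 pooled frames
local pooled = temporalMaxPool({{1, 8}, {3, 2}, {5, 4}, {0, 9}}, 2)
print(pooled[1][1], pooled[1][2])  -- 3  8
print(pooled[2][1], pooled[2][2])  -- 5  9
```

Each feature is pooled independently, which is why a pooling size of 1 on the other spatial axis leaves the features untouched.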
Thanks!
Cool. Maybe I'll add this class in, in case it's useful to others.
Well, that's interesting. I made a quick test, and it fails in a really strange way:
require 'cltorch'
require 'clnn'

-- In the real test suite mytester already exists; created here so the snippet runs on its own.
local mytester = torch.Tester()

local batchSize = 1
local inFeatures = 1
local outFeatures = 7
local sentenceLength = 3
local kernelSize = 3
-- local stride = 1

local input = torch.ClTensor(batchSize, sentenceLength, inFeatures):uniform()
local net = nn.TemporalConvolution2(inFeatures, outFeatures, kernelSize)
net:cl()
local weights = net.weight
weights:uniform(-1.0, 1.0)
net.bias:zero() -- simplify test for now...
local output = net:forward(input)

-- calc 'by hand' to check
print('weights:size()', weights:size())
print('output:size()', output:size())
-- output length for stride 1 and no padding
local outLength = sentenceLength - kernelSize + 1
local ourOut = torch.FloatTensor(batchSize, outLength, outFeatures):zero()

-- each batch item is independent, calculated separately from others
for b=1,batchSize do
  -- each output feature is independent from other outputs
  for outFeature=1,outFeatures do
    -- each output point along the outS dimension is independent from other outputs
    for outS=1,outLength do
      local sum = 0
      -- convolve is sum over kernel size, and over the input features
      for k=1,kernelSize do
        local inS = outS + (k - 1)
        for inFeature=1,inFeatures do
          local weight = weights[outFeature][inFeature][k][1]
          sum = sum + weight * input[b][inS][inFeature]
        end
      end
      ourOut[b][outS][outFeature] = sum
    end
  end
end

print('output[1]')
print(output[1])
print('ourOut[1]')
print(ourOut[1])
print('output[1] - ourOut[1]')
print(output[1]:float() - ourOut[1])
mytester:assertlt((output:float() - ourOut):abs():max(), 0.0001)
Output:
output[1]
0.5121 -0.1167 0.1658 -0.1018 1.5657 0.1996 -1.1964
[torch.ClTensor of size 1x7]
ourOut[1]
0.5121 -0.1167 0.1658 -0.1018 0.5153 0.1996 -1.1964
[torch.FloatTensor of size 1x7]
output[1] - ourOut[1]
0.0000 0.0000 -0.0000 0.0000 1.0505 0.0000 0.0000
[torch.FloatTensor of size 1x7]
For some reason, if outFeatures is 7, the fifth output feature is wrong. For other values of outFeatures, like 1, 2, 3, 4, 5 or 6, it works OK.
Created an issue for this strange behavior: https://github.com/hughperkins/clnn/issues/37
Added TemporalConvolution2, and a failing unit test for SpatialConvolution, in 53f3a3f.
(Plausibly fixed in a548012.)
Added backwards tests in aa2a21dc3, and added to the README in 53706045.
Is an implementation of temporal convolution/pooling coming in the near future? It would allow me to accelerate my model using the GPU.
Thanks