hughperkins / clnn

OpenCL backend for Torch nn neural networks library
BSD 2-Clause "Simplified" License

temporal convolutions/pooling #36

Closed kingoflolz closed 8 years ago

kingoflolz commented 8 years ago

Is an implementation of temporal convolution/pooling coming in the near future? It would let me accelerate my model using the GPU.

Thanks

hughperkins commented 8 years ago

Sooo... I actually tried using the CUDA version of TemporalConvolution, and it is slower than wrapping SpatialConvolution. So you can create a TemporalConvolution module like this:

require 'torch'
require 'nn'

local TemporalConvolution2, parent = torch.class('nn.TemporalConvolution2', 'nn.Module')

function TemporalConvolution2:__init(inputFrameSize, outputFrameSize, kW, dW, padW)
  parent.__init(self)

  self.inputFrameSize = inputFrameSize
  self.outputFrameSize = outputFrameSize
  self.kW = kW
  self.dW = dW or 1
  self.padW = padW or 0
  self.sconv = nn.SpatialConvolution(inputFrameSize, outputFrameSize, 1, kW, 1, dW, 0, self.padW)
  self.weight = self.sconv.weight
  self.bias = self.sconv.bias
  self.gradWeight = self.sconv.gradWeight
  self.gradBias = self.sconv.gradBias
end

function TemporalConvolution2:clearState()
  self.sconv:clearState()
  -- call the parent method on this instance, not on the parent class itself
  return parent.clearState(self)
end

function TemporalConvolution2:updateOutput(input)
  assert(input:dim() == 3, 'must provide batched input')
  local batchSize = input:size(1)
  local numFrames = input:size(2)
  local outFrames = numFrames - math.floor(self.kW/2)*2 + 2 * self.padW
  if self.kW%2 == 0 then outFrames = outFrames+1 end

  input = input:view(batchSize, numFrames, self.inputFrameSize, 1):transpose(2,3)
  local output = self.sconv:updateOutput(input):transpose(2,3)
  self.output:resize(batchSize, outFrames, self.outputFrameSize):copy(output)
  return self.output
end

function TemporalConvolution2:updateGradInput(input, gradOutput)
  assert(input:dim() == 3, 'must provide batched input')
  local batchSize = input:size(1)
  local numFrames = input:size(2)
  local outFrames = numFrames - math.floor(self.kW/2)*2 + 2 * self.padW
  if self.kW%2 == 0 then outFrames = outFrames+1 end

  input = input:view(batchSize, numFrames, self.inputFrameSize, 1):transpose(2,3)
  gradOutput = gradOutput:view(batchSize, outFrames, self.outputFrameSize, 1):transpose(2,3)
  local gradInput = self.sconv:updateGradInput(input, gradOutput):transpose(2,3)
  self.gradInput:resize(batchSize, numFrames, self.inputFrameSize):copy(gradInput)

  return self.gradInput
end

function TemporalConvolution2:accGradParameters(input, gradOutput, scale)
  assert(input:dim() == 3, 'must provide batched input')
  local batchSize = input:size(1)
  local numFrames = input:size(2)
  local outFrames = numFrames - math.floor(self.kW/2)*2 + 2 * self.padW
  if self.kW%2 == 0 then outFrames = outFrames+1 end

  input = input:view(batchSize, numFrames, self.inputFrameSize, 1):transpose(2,3)
  gradOutput = gradOutput:view(batchSize, outFrames, self.outputFrameSize, 1):transpose(2,3)
  self.sconv:accGradParameters(input, gradOutput, scale)  
end
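The outFrames expression repeated in the three methods above holds for a stride of 1. A minimal pure-Lua sketch of the general output-length formula, assuming the usual floor-mode convolution arithmetic, is below; temporalOutFrames and wrapperOutFrames are hypothetical helper names, not part of nn or clnn.

```lua
-- Hypothetical helper (not part of nn or clnn): number of output frames
-- produced by a 1-D convolution over numFrames input frames.
local function temporalOutFrames(numFrames, kW, dW, padW)
  dW = dW or 1
  padW = padW or 0
  return math.floor((numFrames + 2 * padW - kW) / dW) + 1
end

-- The stride-1 expression used in the wrapper above, for comparison.
local function wrapperOutFrames(numFrames, kW, padW)
  padW = padW or 0
  local outFrames = numFrames - math.floor(kW / 2) * 2 + 2 * padW
  if kW % 2 == 0 then outFrames = outFrames + 1 end
  return outFrames
end

-- The two expressions agree whenever the stride is 1 ...
for numFrames = 4, 10 do
  for kW = 1, 4 do
    for padW = 0, 2 do
      assert(temporalOutFrames(numFrames, kW, 1, padW)
             == wrapperOutFrames(numFrames, kW, padW))
    end
  end
end

-- ... but differ once the stride is larger than 1.
print(temporalOutFrames(10, 3, 2, 0))  -- 4
print(wrapperOutFrames(10, 3, 0))      -- 8
```

If the module is used with dW greater than 1, the resize and view calls would presumably need the general formula rather than the stride-1 shortcut.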

Spatial pooling is trivial to use in temporal mode: simply set one pooling size to 1, e.g. nn.SpatialMaxPooling(1, kernelSize)
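To make the pooling remark concrete, here is a small pure-Lua reference of what pooling with a window of 1 across features and kernelSize along frames computes. It is only a sketch on nested tables, not the nn implementation; temporalMaxPool is a hypothetical name, and the stride is assumed to default to the kernel size.

```lua
-- Reference max pooling along the frame axis only.
-- frames: array of frames, each frame an array of feature values.
local function temporalMaxPool(frames, kernelSize, stride)
  stride = stride or kernelSize  -- assumed default: non-overlapping windows
  local out = {}
  local numOut = math.floor((#frames - kernelSize) / stride) + 1
  for o = 1, numOut do
    local first = (o - 1) * stride + 1
    local pooled = {}
    -- each feature is pooled independently (window of 1 across features)
    for f = 1, #frames[first] do
      local best = frames[first][f]
      for k = 1, kernelSize - 1 do
        best = math.max(best, frames[first + k][f])
      end
      pooled[f] = best
    end
    out[o] = pooled
  end
  return out
end

-- 4 frames of 2 features, pooled in windows of 2 frames
local pooled = temporalMaxPool({{1, 8}, {3, 2}, {5, 4}, {7, 6}}, 2)
-- pooled[1] is {3, 8}, pooled[2] is {7, 6}
```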

kingoflolz commented 8 years ago

Thanks!

hughperkins commented 8 years ago

Cool. Maybe I'll add this class in, in case it's useful to others.

hughperkins commented 8 years ago

Well, that's interesting. I made a quick test, and it fails in a really strange way:

  local batchSize = 1
  local inFeatures = 1
  local outFeatures = 7
  local sentenceLength = 3
  local kernelSize = 3
--  local stride = 1
  local input = torch.ClTensor(batchSize, sentenceLength, inFeatures):uniform()
  local net = nn.TemporalConvolution2(inFeatures, outFeatures, kernelSize)
  net:cl()
  local weights = net.weight
  weights:uniform(-1.0, 1.0)
  net.bias:zero()  -- simplify test for now...
  local output = net:forward(input)
  -- calc 'by hand' to check
  print('weights:size()', weights:size())
  print('output:size()', output:size())
  local outLength = sentenceLength - math.floor(kernelSize / 2) * 2
  local ourOut = torch.FloatTensor(batchSize, outLength, outFeatures):zero()
  -- each batch item is independent, calculated separately from others
  for b=1,batchSize do
    -- each output feature is independent from other outputs
    for outFeature=1,outFeatures do
      -- each output point along the outS dimension is independent from other outputs
      for outS=1,outLength do
        local sum = 0
        -- convolve is sum over kernel size, and over the input features
        for k=1,kernelSize do
          local inS = outS + (k - 1)
          for inFeature=1,inFeatures do
            local weight = weights[outFeature][inFeature][k][1]
            sum = sum + weight * input[b][inS][inFeature]
          end
        end
        ourOut[b][outS][outFeature] = sum
      end
    end
  end
  print('output[1]')
  print(output[1])
  print('ourOut[1]')
  print(ourOut[1])
  print('output[1] - ourOut[1]')
  print(output[1]:float() - ourOut[1])
  mytester:assertlt((output:float() - ourOut):abs():max(), 0.0001)

Output:

output[1]
 0.5121 -0.1167  0.1658 -0.1018  1.5657  0.1996 -1.1964
[torch.ClTensor of size 1x7]

ourOut[1]
 0.5121 -0.1167  0.1658 -0.1018  0.5153  0.1996 -1.1964
[torch.FloatTensor of size 1x7]

output[1] - ourOut[1]
 0.0000  0.0000 -0.0000  0.0000  1.0505  0.0000  0.0000
[torch.FloatTensor of size 1x7]

For some reason, if outFeatures is 7, the fifth output feature is wrong. For other values of outFeatures, like 1, 2, 3, 4, 5 or 6, it works ok.

hughperkins commented 8 years ago

Created an issue for this strange behavior: https://github.com/hughperkins/clnn/issues/37

hughperkins commented 8 years ago

Added TemporalConvolution2, and a failing unit test for SpatialConvolution, in 53f3a3f

hughperkins commented 8 years ago

(Plausibly fixed in a548012)

hughperkins commented 8 years ago

Added backwards tests in aa2a21dc3, and added to README in 53706045