Possible tolX Error Workaround

3DTOPO commented 8 years ago

I have noticed that when using noise for the seed, the run sometimes fails with the message "function value changing less than tolX". All I have to do is keep rerunning the same command until it gets past that point, and then all is fine.

Furthermore, I have determined that when this happens, the total loss does not change between the first and second iterations (the difference is exactly 0).
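A zero change in loss is consistent with a stopping test that compares consecutive function values against a tolerance. The plain-Lua sketch below shows a check of that general shape. It is an illustration only: `should_stop` is a hypothetical helper and the `1e-9` tolerance is an assumed value, neither taken from the optim source.

```lua
-- Hypothetical stand-in for a "function value changing less than tolX"
-- test; this is NOT the actual optim.lbfgs code.
local function should_stop(f, f_old, tolX)
    return math.abs(f - f_old) < tolX
end

local tolX = 1e-9  -- assumed tolerance, for illustration only

-- Identical loss on two consecutive iterations: the test fires.
print(should_stop(693557904.115601, 693557904.115601, tolX))  -- true
-- Any measurable progress keeps the optimizer running.
print(should_stop(693557904.115601, 693000000.0, tolX))       -- false
```

With a test of this shape, a negative tolerance can never be satisfied, because an absolute value is never below a negative number.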

I added some logic to catch it. At the top of main:

local function main(params)
    local lastTotalLoss = 0.0  -- declare a variable to keep track of last iteration total loss

Then, around line 299:

print(string.format(' loss from last iteration: %f', (lastTotalLoss - loss)))
if (lastTotalLoss - loss) == 0.0 then
    print('********** 0 loss from last iteration **********')
    -- need to reinitialize image with new noise seed    
end
lastTotalLoss = loss
collectgarbage()
-- optim.lbfgs expects a vector for gradients
return loss, grad:view(grad:nElement())
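The check above compares floats with `==` and starts from a placeholder of 0.0. A slightly more defensive variant, sketched here in plain Lua, tracks "no previous loss yet" explicitly and compares against a small tolerance. The name `make_stall_detector` and the `1e-9` tolerance are assumptions made for illustration, not part of neural_style.lua.

```lua
-- Hypothetical helper: returns a function that reports whether the loss
-- has stalled relative to the value seen on the previous call.
local function make_stall_detector(tol)
    local last = nil  -- no previous loss on the first call
    return function(loss)
        -- `and` short-circuits, so nothing is subtracted from nil
        local stalled = last ~= nil and math.abs(last - loss) <= tol
        last = loss
        return stalled
    end
end

local stalled = make_stall_detector(1e-9)  -- assumed tolerance
print(stalled(693557904.115601))  -- false: first call, nothing to compare
print(stalled(693557904.115601))  -- true: loss did not move
print(stalled(693557000.0))       -- false: loss changed
```

Keeping the previous value inside a closure avoids a shared variable that has to be reset by hand between runs.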

I tried reinitializing the image like this. It carried on past the point where it would have stopped with the tolX error, but I get a blank image:

if (lastTotalLoss - loss) == 0.0 then
    print('********** 0 loss from last iteration **********')
    img = initImage(params)
    y = net:forward(img)
    dy = img.new(#y):zero()
    loss = -1
end
lastTotalLoss = loss

I have never used Torch before, so I still feel a little lost! Also, I put the image initialization code in a function so I could reuse it in the snippet above:

local function initImage(params)
    -- Initialize the image
    if params.seed >= 0 then
        torch.manualSeed(params.seed)
    end
    local img = nil
    if params.init == 'random' then
        img = torch.randn(content_image:size()):float():mul(0.001)
    elseif params.init == 'image' then
        img = content_image_caffe:clone():float()
    else
        error('Invalid init type')
    end
    if params.gpu >= 0 then
        img = img:cuda()
    end
    return img
end

local img = initImage(params)
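One thing to keep in mind when calling initImage again to get fresh noise: if params.seed >= 0, the function reseeds the generator with the same value, so the same noise comes back. The plain-Lua sketch below shows the effect, with `math.randomseed` standing in for `torch.manualSeed`. The helper `noise` is hypothetical and exists only for this illustration.

```lua
-- Hypothetical helper: seeds Lua's generator when seed >= 0 (mirroring the
-- guard in initImage), then draws n pseudo-random numbers.
local function noise(seed, n)
    if seed >= 0 then
        math.randomseed(seed)
    end
    local t = {}
    for i = 1, n do
        t[i] = math.random()
    end
    return t
end

local a = noise(42, 3)
local b = noise(42, 3)  -- same seed again, so the same "noise"
for i = 1, 3 do
    assert(a[i] == b[i])
end
print('reseeding with the same value reproduced the same numbers')
```

Offsetting the seed on each retry is one possible way to get different noise, at the cost of a less reproducible run.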

The whole file is attached.

neural_style.lua.txt

3DTOPO commented 8 years ago

Here is a case where it was detected and no action was taken (I didn't override anything for this output; I only added the detection logic and a print statement):

...
Setting up style layer      30  :   relu5_1 
WARNING: Skipping content loss  
Running optimization with L-BFGS    
Iteration 1 / 1500  
  Content 1 loss: 841378.984375 
  Style 1 loss: 520660.522461   
  Style 2 loss: 97578859.375000 
  Style 3 loss: 22274044.921875 
  Style 4 loss: 572314812.500000    
  Style 5 loss: 28147.811890    
  Total loss: 693557904.115601  
 loss from last iteration: -693557904.115601    
<optim.lbfgs>   creating recyclable direction/step/history buffers  
Iteration 2 / 1500  
  Content 1 loss: 841378.984375 
  Style 1 loss: 520660.522461   
  Style 2 loss: 97578859.375000 
  Style 3 loss: 22274044.921875 
  Style 4 loss: 572314812.500000    
  Style 5 loss: 28147.811890    
  Total loss: 693557904.115601  
 ********** 0 loss from last iteration **********   
<optim.lbfgs>   function value changing less than tolX

You can see the output of my added print statement (second-to-last line).

3DTOPO commented 8 years ago

I submitted pull request #97 for the issue. With it, the run now carries on for me under these conditions instead of failing with a tolX error.

jcjohnson/neural-style, issue #89