pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
https://pytorch.org

the loss is not decreasing #847

Closed elahehraisi closed 7 years ago

elahehraisi commented 7 years ago

Hi, I have created a simple model consisting of two 1-layer networks competing with each other, so I have my own loss function based on those networks' outputs. It is very similar to a GAN. The problem is that, even for a very simple test case, the loss is not decreasing. For now I am using a non-stochastic optimizer to eliminate randomness. Here is the pseudo code with explanation:

n1_model = Net1(Dimension_in_n1, Dimension_out)  # 1-layer nn with sigmoid
n2_model = Net2(Dimension_in_n2, Dimension_out)  # 1-layer nn with sigmoid

n1_optimizer = torch.optim.LBFGS(n1_model.parameters(), lr=0.01, max_iter=50)
n2_optimizer = torch.optim.LBFGS(n2_model.parameters(), lr=0.01, max_iter=50)

for t in range(iter):
    x_n1 = Variable(torch.from_numpy(...))  # load input of nn1 in batch size
    x_n2 = Variable(torch.from_numpy(...))  # load input of nn2 in batch size

    def closure():
        reset_grad(n1_model.parameters())
        reset_grad(n2_model.parameters())
        y_n1 = n1_model(x_n1)
        y_n2 = n2_model(x_n2)
        n1_params = getParams(n1_model)
        n2_params = getParams(n2_model)
        # my loss function: mean squared error + regularizer
        loss = my_loss_function(y_n1, y_n2, n1_params, n2_params)
        loss.backward(retain_variables=True)
        return loss

    n1_optimizer.step(closure)

    def clos():
        reset_grad(n1_model.parameters())
        reset_grad(n2_model.parameters())
        y_n1 = n1_model(x_n1)
        y_n2 = n2_model(x_n2)
        n1_params = getParams(n1_model)
        n2_params = getParams(n2_model)
        # my loss function: mean squared error + regularizer
        loss = my_loss_function(y_n1, y_n2, n1_params, n2_params)
        loss.backward()
        return loss

    n2_optimizer.step(clos)
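(The helpers reset_grad and getParams are not shown above; a minimal sketch of what they are assumed to do, zeroing accumulated gradients and flattening a model's parameters into one vector for the regularizer, could look like this.)

import torch

def reset_grad(params):
    # assumed helper: zero out accumulated gradients before each closure evaluation
    for p in params:
        if p.grad is not None:
            p.grad.detach_()
            p.grad.zero_()

def getParams(model):
    # assumed helper: flatten all parameters into a single vector for the regularizer
    return torch.cat([p.reshape(-1) for p in model.parameters()])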

and here is the definition of my loss function:

def my_loss_function(n1_output, n2_output, n1_param, n2_param):
    # mean squared error between the two outputs plus an L2 regularizer
    sm = torch.pow(n1_output - n2_output, 2)
    reg = torch.norm(n1_param, 2) + torch.norm(n2_param, 2)
    y = torch.sum(sm) + 1 * reg
    return y
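As a quick sanity check (the toy values below are illustrative, not from my actual data), the loss can be evaluated by hand:

import torch

# illustrative toy outputs and flattened parameter vectors
n1_out = torch.tensor([1.0, 2.0])
n2_out = torch.tensor([0.5, 1.5])
n1_p = torch.tensor([0.3, -0.4])
n2_p = torch.tensor([0.1, 0.2])

loss = my_loss_function(n1_out, n2_out, n1_p, n2_p)
# sum of squared differences: 0.25 + 0.25 = 0.5
# regularizer: ||n1_p|| + ||n2_p|| = 0.5 + 0.2236 ≈ 0.7236
print(loss.item())  # ≈ 1.2236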

When I plot the loss, it oscillates; I expect it to decrease during training.

apaszke commented 7 years ago

Maybe the model is underfitting, or there's something wrong with the training procedure. We use GitHub issues only for bug reports and feature requests, not for general help. If you have any questions, please ask them on our forums, but we can't help you debug any model you have. There are lots of things that can make training unstable, from data loading to exploding/vanishing gradients and numerical instability.
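A quick way to check the exploding/vanishing-gradient case mentioned above is to log the total gradient norm right after loss.backward() inside the closure; a minimal sketch (the helper name is illustrative, not part of any API):

import torch

def grad_norm(params):
    # total L2 norm of all gradients: values near zero suggest vanishing gradients,
    # very large values suggest exploding gradients
    total = 0.0
    for p in params:
        if p.grad is not None:
            total += p.grad.norm(2).item() ** 2
    return total ** 0.5

# inside a closure, after loss.backward():
# print('n1 grad norm:', grad_norm(n1_model.parameters()))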