mikeizbicki / cmc-csci181-deeplearning

deep learning course materials

optimizer got an empty parameter list #7

Open xxing21 opened 4 years ago

xxing21 commented 4 years ago

Hi! I hope everyone is well. I'm having trouble running the CNN model for #9 in hw6. The error says:

Traceback (most recent call last):
  File "names.py", line 178, in <module>
    weight_decay=args.weight_decay
  File "//anaconda3/lib/python3.7/site-packages/torch/optim/sgd.py", line 64, in __init__
    super(SGD, self).__init__(params, defaults)
  File "//anaconda3/lib/python3.7/site-packages/torch/optim/optimizer.py", line 46, in __init__
    raise ValueError("optimizer got an empty parameter list")
ValueError: optimizer got an empty parameter list

My code for the CNN class is:

class CNNModel(nn.Module):
    def _init_(self):
        super(CNNModel,self)._init_()
        self.relu = nn.RELU()
        self.cnn = nn.Conv1d(len(vocabulary),args.hidden_layer_size,3,padding=1)
        self.cnns = (args.num_layers-1)*[nn.Conv1d(args.hidden_layer_size,args.hidden_layer_size,3,padding=1)]
        self.fc = nn.Linear(args.hidden_layer_size*args.input_length,len(all_categories))

    def forward(self, x):
        out = torch.einsum('lbv->bvl', x)
        out = self.cnn(out)
        out = self.relu(out)
        for cnn in self.cnns:
            out = cnn(out)
            out = self.relu(out)
        out = out.view(args.batch_size,args.hidden_layer_size*args.input_length)
        out = self.fc(out)
        return out

and for SGD I have:

    if args.optimizer == 'sgd':
        optimizer = torch.optim.SGD(
                model.parameters(),
                lr=args.learning_rate,
                momentum=args.momentum,
                weight_decay=args.weight_decay
                )

Would appreciate any insight on how to fix this error. Thanks!

mikeizbicki commented 4 years ago

This is an excellent example of asking a question! The text clearly explains the problem, and the full error message and the relevant sections of code are given and correctly formatted. This makes it very easy for me to quickly look at the code and help figure out what the problem is.


Now to answer the question. The "parameter list" referenced in the error message is the first argument passed to the optimizer. In your code, this is model.parameters(). There are two possibilities about what's going on:

  1. model.parameters() may be returning an empty list. If model is really an instance of CNNModel, this seems unlikely because you are in fact defining parameters in that class. So check whether it is empty; if it is, then model is probably not a CNNModel for some reason.
  2. You are not actually calling the torch.optim.SGD optimizer, but some other optimizer.

xxing21 commented 4 years ago

I define the model in two classes:

class Model(nn.Module):
    def __init__(self):
        super(Model,self).__init__()
        if args.model == 'rnn':
            self.rnn = nn.RNN(len(vocabulary),hidden_size=args.hidden_layer_size,num_layers=args.num_layers)
        if args.model == 'gru':
            self.rnn = nn.GRU(len(vocabulary),hidden_size=args.hidden_layer_size,num_layers=args.num_layers)
        if args.model == 'lstm':
            self.rnn = nn.LSTM(len(vocabulary),hidden_size=args.hidden_layer_size,num_layers=args.num_layers)
        self.output = nn.Linear(args.hidden_layer_size,n_categories)

    def forward(self, x):
        out,h_n = self.rnn(x)
        out = self.output(out[out.shape[0]-1,:,:])
        return out

class CNNModel(nn.Module):
    def _init_(self):
        super(CNNModel,self)._init_()
        self.relu = nn.RELU()
        self.cnn = nn.Conv1d(len(vocabulary),args.hidden_layer_size,3,padding=1)
        self.cnns = (args.num_layers-1)*[nn.Conv1d(args.hidden_layer_size,args.hidden_layer_size,3,padding=1)]
        self.fc = nn.Linear(args.hidden_layer_size*args.input_length,len(all_categories))

    def forward(self, x):
        out = torch.einsum('lbv->bvl', x)
        out = self.cnn(out)
        out = self.relu(out)
        for cnn in self.cnns:
            out = cnn(out)
            out = self.relu(out)
        out = out.view(args.batch_size,args.hidden_layer_size*args.input_length)
        out = self.fc(out)
        return out

and when I load the model, I use:

if args.model == 'cnn':
    model = CNNModel()
else:
    model = Model()

It works well when I run the command for RNN

$ python3 names.py --train --model=rnn --gradient_clipping --optimizer=sgd --learning_rate=1e-1 --batch_size=10

but it raises ValueError: optimizer got an empty parameter list when I run exactly the same command for the CNN

$ python3 names.py --train --model=cnn --gradient_clipping --optimizer=sgd --learning_rate=1e-1 --batch_size=10

zhh1997zhh commented 4 years ago

I have the same issue. Is the problem related to the code attached below (when we prepare the model for training)?

criterion = nn.CrossEntropyLoss()
if args.optimizer == 'sgd':
    optimizer = torch.optim.SGD(
        model.parameters(),
        lr=args.learning_rate,
        momentum=args.momentum,
        weight_decay=args.weight_decay
    )

if args.optimizer == 'adam':
    optimizer = torch.optim.Adam(
        model.parameters(),
        lr=args.learning_rate,
        weight_decay=args.weight_decay
    )

if args.optimizer == 'adagrad':
    optimizer = torch.optim.Adagrad(
        model.parameters(),
        lr=args.learning_rate,
        weight_decay=args.weight_decay
    )

model.train()

mikeizbicki commented 4 years ago

We need to verify that model.parameters() is in fact empty, and determine what is causing it to be empty. For example, if you were to add the line

print('model.parameters()=',model.parameters())

immediately after you define the model variable in your code snippet above, do you get output like

model.parameters()=[]

or does it contain something?

xxing21 commented 4 years ago

I added print('model.parameters()=',model.parameters()) and it gives

model.parameters()= <generator object Module.parameters at 0x144174a98>

mikeizbicki commented 4 years ago

@xxing21 A generator is a type of "container" that produces its elements lazily, so printing it shows only the generator object and not the elements inside of it. To see what's inside, we need to convert it into a list. Try this code instead:

print('model.parameters()=',list(model.parameters()))
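
As a minimal sketch of why the conversion matters, here is a plain-Python illustration. The parameters function below is a hypothetical stand-in for model.parameters(), not PyTorch's actual implementation:

```python
def parameters():
    """Hypothetical stand-in for model.parameters(): yields items lazily."""
    yield 'weight'
    yield 'bias'

gen = parameters()
print(gen)                  # something like <generator object parameters at 0x...>

contents = list(gen)        # consuming the generator reveals its elements
print(contents)             # ['weight', 'bias']

leftover = list(gen)        # a generator can only be consumed once
print(leftover)             # []
```

If the model really has no parameters, the list form prints [] on the first conversion.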

mikeizbicki commented 4 years ago

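@xxing21 Looking back at your CNNModel class, I believe I see an error. The _init_ function should have 2 underscores on each side, not 1. Thus it should be __init__, and the same goes for the super(CNNModel,self)._init_() call. Python only calls __init__ automatically when an object is created, so with the misspelled name your layers are never created and the model ends up with no parameters.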
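
The effect of the misspelled name can be sketched in plain Python. The class names Broken and Fixed and the layers attribute are made up for illustration, and no PyTorch is involved:

```python
class Broken:
    def _init_(self):
        # Single underscores: an ordinary method that Python never calls
        # automatically, so this line does not run on construction.
        self.layers = ['conv', 'fc']

class Fixed:
    def __init__(self):
        # Double underscores: called automatically when the object is created.
        self.layers = ['conv', 'fc']

broken = Broken()
fixed = Fixed()

print(hasattr(broken, 'layers'))   # False
print(hasattr(fixed, 'layers'))    # True
print(fixed.layers)                # ['conv', 'fc']
```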

In cs46 this coming week we'll be covering exactly what this function does and why it's named that way.

mikeizbicki commented 4 years ago

@zhh1997zhh The code you pasted looks correct. I believe the error is related to the message about __init__ I just sent above.

xxing21 commented 4 years ago

@mikeizbicki It was indeed an empty list before, but it works well now after I changed to __init__. Thank you so much!!

zhh1997zhh commented 4 years ago

@mikeizbicki I fixed an error in my __init__ function and it works now. Thank you very much!