tianbaochou / NasUnet


Shouldn't the architectural parameters be excluded from SearchNetwork.model_optimizer? #26

Open woodywff opened 4 years ago

woodywff commented 4 years ago

In experiment/search_cell.py, line 125:

self.model_optimizer = optimizer_cls1(self.model.parameters(), **optimizer_params1)

Here self.model.parameters() includes the arch parameters self.model.alphas(). Is this model_optimizer meant to update all the parameters?

Thank you :-)
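For context, a minimal, self-contained PyTorch sketch (the TinySearchNet class is hypothetical, not NasUnet code) of why the question arises: any nn.Parameter assigned as a module attribute is yielded by model.parameters(), so an optimizer built from model.parameters() would own the architecture weights too.

import torch
import torch.nn as nn

class TinySearchNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3, padding=1)                    # ordinary network weight
        self.alphas_normal = nn.Parameter(1e-3 * torch.randn(4, 5))  # architecture parameter

    def alphas(self):
        return [self.alphas_normal]

model = TinySearchNet()
print(sum(1 for _ in model.parameters()))                         # 3: conv.weight, conv.bias, alphas_normal
print(any(p is model.alphas_normal for p in model.parameters()))  # True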

tianbaochou commented 4 years ago


That's not right: the model_optimizer only updates the network parameters, while the arch_optimizer updates the architecture parameters (see self.arch_optimizer = optimizer_cls2(self.model.alphas(), **optimizer_params2)). In a word, they are separate!
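A hedged sketch of the DARTS-style alternation this reply describes; criterion, train_loader, and val_loader are assumed names, not NasUnet's exact code:

for (x_train, y_train), (x_val, y_val) in zip(train_loader, val_loader):
    # architecture step: update the alphas on the validation loss
    arch_optimizer.zero_grad()
    criterion(model(x_val), y_val).backward()
    arch_optimizer.step()

    # weight step: update the network weights on the training loss
    model_optimizer.zero_grad()
    criterion(model(x_train), y_train).backward()
    model_optimizer.step()
    # caveat: backward() also fills the alphas' gradients, so if the alphas
    # sit in model_optimizer's param groups, this step moves them as well,
    # which is exactly the concern raised in this issue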

woodywff commented 4 years ago

That means line 125 needs to be modified to remove the self.model.alphas() parameters from self.model.parameters(). Right?
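A hedged sketch of that modification (not a patch from the repository): exclude the architecture parameters by tensor identity before building model_optimizer, so the two optimizers own disjoint parameter sets.

arch_ids = {id(a) for a in self.model.alphas()}
weight_params = [p for p in self.model.parameters() if id(p) not in arch_ids]

# model_optimizer now steps only the network weights
self.model_optimizer = optimizer_cls1(weight_params, **optimizer_params1)
# arch_optimizer steps only the architecture weights, as before
self.arch_optimizer = optimizer_cls2(self.model.alphas(), **optimizer_params2)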

ghost commented 4 years ago

@tianbaochou

CHNxindong commented 3 years ago


@woodywff In my opinion, model_optimizer and arch_optimizer are defined separately. Although

self.model_optimizer = optimizer_cls1(self.model.parameters(), **optimizer_params1)

mentions self.model.parameters(), when I step into get_optimizer (where the optimizer classes for both model_optimizer and arch_optimizer are resolved),

def get_optimizer(cfg, phase='searching', optimizer_type='optimizer_model'):
    if cfg[phase][optimizer_type] is None:
        logger.info("Using SGD optimizer")
        return SGD
    else:
        opt_name = cfg[phase][optimizer_type]['name']
        if opt_name not in key2opt:
            raise NotImplementedError('Optimizer {} not implemented'.format(opt_name))

        logger.info('Using {} optimizer'.format(opt_name))
        return key2opt[opt_name]

there is no mention of parameters at all. Based on this, I think self.model.parameters() in this place has no effect. So model_optimizer updates the network weights and arch_optimizer updates the architecture weights. And when you read the training phase, you can see that in the backward pass the training loss updates the network weights; nothing there touches the architecture weights.
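A small diagnostic (names taken from this thread) to check empirically whether model_optimizer's parameter groups contain the architecture parameters; if it prints True, model_optimizer will step the alphas whenever they receive gradients:

arch_ids = {id(a) for a in self.model.alphas()}
overlap = any(id(p) in arch_ids
              for group in self.model_optimizer.param_groups
              for p in group['params'])
print(overlap)  # True means model_optimizer also owns the architecture parameters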