facebookresearch / deepcluster

Deep Clustering for Unsupervised Learning of Visual Features

Using separate optimizer for annexed top layer. #62

Closed AhmadM-DL closed 4 years ago

AhmadM-DL commented 4 years ago

Hello there,

I saw that you are using two optimizers for DeepCluster: one for the entire network except the top layer, and one, created every cycle, for the annexed top layer.

    # create an optimizer for the last fc layer
    optimizer_tl = torch.optim.SGD(
        model.top_layer.parameters(),
        lr=args.lr,
        weight_decay=10**args.wd,
    )

@mathildecaron31 Wouldn't it be better to use a single optimizer with different param_groups, as described in the PyTorch docs?

mathildecaron31 commented 4 years ago

Yes, you can definitely do that!
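
For readers who want to try the single-optimizer variant discussed above, here is a minimal sketch using per-parameter options in `torch.optim.SGD`. It is not the repository's code: `TinyModel`, its layer sizes, and the values of `lr`, `wd`, and `momentum` are assumptions made for illustration. Only the `top_layer` attribute name and the `10**wd` weight-decay convention come from the snippet in the question.

```python
import torch
import torch.nn as nn


class TinyModel(nn.Module):
    """Hypothetical stand-in for the real network: a feature extractor
    plus an annexed `top_layer`, mirroring the attribute name used above."""

    def __init__(self, num_clusters=10):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(32, 16), nn.ReLU())
        self.top_layer = nn.Linear(16, num_clusters)

    def forward(self, x):
        return self.top_layer(self.features(x))


model = TinyModel()
lr, wd = 0.05, -5  # assumed values; wd is an exponent, as in 10**args.wd

# One optimizer, two parameter groups. Options set inside a group override
# the keyword defaults passed after the list.
optimizer = torch.optim.SGD(
    [
        {"params": model.features.parameters()},
        {"params": model.top_layer.parameters(), "weight_decay": 10**wd},
    ],
    lr=lr,
    momentum=0.9,
)

# A single training step updates both groups at once.
x = torch.randn(8, 32)
target = torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(model(x), target)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

One caveat with this approach: if the top layer is replaced between cycles, as the question suggests, the second group would still reference the old parameters, so the optimizer (or at least that group) would likely need to be rebuilt each time.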