haozzh / FedCR

Clarification on the network sharing #4

WilliamYi96 commented 1 year ago

Hi @haozzh, could you please help clarify the following question:

When training on CIFAR10 for the baselines, the model is defined as follows in Nets.py:

if self.name == 'cifar10_LeNet':
    self.n_cls = 10
    self.conv1 = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=5)
    self.conv2 = nn.Conv2d(in_channels=64, out_channels=64, kernel_size=5)
    self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
    self.fc1 = nn.Linear(64 * 5 * 5, 1024)
    self.fc2 = nn.Linear(1024, 1024)
    self.fc3 = nn.Linear(1024, self.n_cls)
    self.weight_keys = [['conv1.weight', 'conv1.bias'],
                        ['conv2.weight', 'conv2.bias'],
                        ['fc1.weight', 'fc1.bias'],
                        ['fc2.weight', 'fc2.bias'],
                        ['fc3.weight', 'fc3.bias']]

However, when running FedRep:

elif 'CIFAR10' in args.dataset:
    w_glob_keys = [net_glob.weight_keys[i] for i in [0, 1, 2, 3]]
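
Given the weight_keys above, this evaluates to the first four key groups:

w_glob_keys = [['conv1.weight', 'conv1.bias'],
               ['conv2.weight', 'conv2.bias'],
               ['fc1.weight', 'fc1.bias'],
               ['fc2.weight', 'fc2.bias']]

i.e. ['fc3.weight', 'fc3.bias'] is not included.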

Only the first four layer groups are selected for w_glob_keys, which suggests the last layer (fc3) is never trained. Could you please explain the logic behind this choice? And have you done any comparison against including all layers?

haozzh commented 1 year ago

Hello, the final layer is still trained. w_glob_keys is used here mainly to decide whether each param.requires_grad is set to True or False, and this switch drives the alternating updates of the global feature extractor and the classifier during the local update. For the precise implementation, please see lines 363 - 385 of distributed_training_utils.py, and refer to the paper "FedRep" for a complete description of the method.
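
For reference, here is a minimal sketch of that requires_grad switching, in the spirit of FedRep's alternating local update. Only net and w_glob_keys correspond to names used in this thread; the function name, epoch counts, data loader, optimizer, and loss function below are illustrative assumptions, not the repository's actual code (see lines 363 - 385 of distributed_training_utils.py for that).

def fedrep_local_update(net, w_glob_keys, local_loader, optimizer, loss_fn,
                        head_epochs=1, body_epochs=1):
    # Hypothetical sketch only; names other than `net` and `w_glob_keys` are illustrative.
    # Flatten the nested key list into a set of parameter names, e.g. {'conv1.weight', ...}.
    glob_keys = {k for group in w_glob_keys for k in group}

    def run_epochs(n_epochs, train_body):
        # Toggle requires_grad so only one part of the network receives gradients:
        # train_body=False -> only the head (e.g. fc3) is updated,
        # train_body=True  -> only the shared feature extractor is updated.
        for name, param in net.named_parameters():
            param.requires_grad = (name in glob_keys) == train_body
        for _ in range(n_epochs):
            for x, y in local_loader:
                optimizer.zero_grad()
                loss_fn(net(x), y).backward()
                optimizer.step()

    run_epochs(head_epochs, train_body=False)  # first: update the local classifier head
    run_epochs(body_epochs, train_body=True)   # then: update the shared feature extractor

In FedRep, the keys in w_glob_keys mark the shared representation that is aggregated on the server, while fc3 acts as the personalized head, which is why it is left out of w_glob_keys even though it is still trained locally.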