Open mengcz13 opened 3 years ago
I notice that the code uses CrossEntropyLoss for local training: https://github.com/pliang279/LG-FedAvg/blob/7af0568b2cae88922ebeacc021b1679815092f4e/models/Update.py#L28
And it accepts the log-probabilities as input: https://github.com/pliang279/LG-FedAvg/blob/7af0568b2cae88922ebeacc021b1679815092f4e/models/Update.py#L50
The output of CNN networks is also logsoftmax: https://github.com/pliang279/LG-FedAvg/blob/7af0568b2cae88922ebeacc021b1679815092f4e/models/Nets.py#L104
But according to the PyTorch docs, CrossEntropyLoss already applies log_softmax internally: https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss
I think the loss should be computed with NLLLoss instead when the input has already been passed through log_softmax (https://pytorch.org/docs/stable/generated/torch.nn.NLLLoss.html#torch.nn.NLLLoss).
Or is there a reason the code effectively applies log_softmax twice when computing the loss?
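For what it's worth, a quick sanity check (a sketch, assuming a standard PyTorch install; tensor shapes and values are made up) suggests the double application is redundant rather than wrong: since the rows of a log_softmax output already sum to 1 after exponentiation, applying log_softmax again is a no-op, so CrossEntropyLoss on log-probabilities coincides with NLLLoss on the same log-probabilities:

```python
# Sketch: compare CrossEntropyLoss / NLLLoss pairings on raw logits vs.
# log-softmax outputs. Shapes and values are arbitrary illustration.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)               # raw network outputs (no log_softmax)
log_probs = F.log_softmax(logits, dim=1)  # what the CNN in Nets.py returns
targets = torch.tensor([1, 3, 5, 7])

# Canonical pairings: CrossEntropyLoss(logits) == NLLLoss(log_probs).
ce_on_logits = F.cross_entropy(logits, targets)
nll_on_logp = F.nll_loss(log_probs, targets)

# What the code in question does: CrossEntropyLoss on log-softmax output,
# i.e. log_softmax applied twice. Because logsumexp(log_probs) == 0
# (the exponentiated rows sum to 1), log_softmax is idempotent here and
# the second application changes nothing.
ce_on_logp = F.cross_entropy(log_probs, targets)

print(torch.allclose(ce_on_logits, nll_on_logp))           # identical by definition
print(torch.allclose(ce_on_logits, ce_on_logp, atol=1e-6)) # idempotence in practice
```

So the results shouldn't differ, but switching to NLLLoss (or returning raw logits and keeping CrossEntropyLoss) would make the intent clearer and save a redundant pass.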
Same issue