Closed mengxingkong closed 4 years ago
self.fc = nn.Linear(2048 * block.expansion, 1, bias=False)

Why is the output dimension of the fc layer 1 rather than num_classes? And in

logits = logits + self.linear_1_bias

is the single output feature of the fc layer being added to num_classes bias terms?
That's because of the weight sharing explained in the paper: the binary classifiers all share one weight vector, so the fc layer only needs a single output unit, and the classifiers differ only in their independent bias terms. Let me know if that's unclear.
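To make the weight-sharing concrete, here is a minimal sketch of that output layer in isolation. This is an illustration of the CORAL-style design, not the repo's exact code: the class name CoralHead and the zero initialization of the biases are my assumptions; the layer names fc and linear_1_bias follow the snippet quoted above.

```python
import torch
import torch.nn as nn

class CoralHead(nn.Module):
    """Sketch (assumption: mirrors the design discussed above).
    One linear unit with no bias produces a single logit per sample;
    K-1 free bias terms turn it into K-1 binary classifiers that all
    share the same weight vector. Because the classifiers differ only
    in their biases, the cumulative probabilities are ordered, which
    is what gives rank-consistent ordinal predictions."""

    def __init__(self, in_features: int, num_classes: int):
        super().__init__()
        # single shared weight vector, no bias of its own
        self.fc = nn.Linear(in_features, 1, bias=False)
        # K-1 task-specific bias terms, one per binary subtask
        self.linear_1_bias = nn.Parameter(torch.zeros(num_classes - 1))

    def forward(self, x):
        logits = self.fc(x)                   # shape (N, 1)
        logits = logits + self.linear_1_bias  # broadcasts to (N, K-1)
        probas = torch.sigmoid(logits)        # P(y > k) for each rank k
        return logits, probas

head = CoralHead(in_features=2048, num_classes=5)
logits, probas = head(torch.randn(8, 2048))
print(logits.shape)  # torch.Size([8, 4])
```

So the answer to the second question is yes: the one fc output is broadcast against the num_classes - 1 biases, producing one logit per binary rank classifier.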