huizhang0110 closed this issue 5 years ago
In the original paper, the network-level update is:
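(The equation image is not reproduced here; roughly reconstructed in the Auto-DeepLab paper's notation, the network-level update reads approximately as:)

$$
{}^{s}H^{l} = \bar{\beta}^{\,l}_{\frac{s}{2}\to s}\,\mathrm{Cell}\big({}^{\frac{s}{2}}H^{l-1},\,{}^{s}H^{l-2};\,\alpha\big)
+ \bar{\beta}^{\,l}_{s\to s}\,\mathrm{Cell}\big({}^{s}H^{l-1},\,{}^{s}H^{l-2};\,\alpha\big)
+ \bar{\beta}^{\,l}_{2s\to s}\,\mathrm{Cell}\big({}^{2s}H^{l-1},\,{}^{s}H^{l-2};\,\alpha\big),
$$

with the β of each node normalized by a softmax over its outgoing connections so that

$$
\bar{\beta}^{\,l}_{s\to \frac{s}{2}} + \bar{\beta}^{\,l}_{s\to s} + \bar{\beta}^{\,l}_{s\to 2s} = 1 \quad \forall\, s, l.
$$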
But in your code:
```python
# normalized_betas[layer][ith node][0 : ➚, 1 : ➙, 2 : ➘]
for layer in range(len(self.betas)):
    if layer == 0:
        normalized_betas[layer][0][1:] = F.softmax(self.betas[layer][0][1:].to(device=img_device), dim=-1)
    elif layer == 1:
        normalized_betas[layer][0][1:] = F.softmax(self.betas[layer][0][1:].to(device=img_device), dim=-1)
        normalized_betas[layer][1] = F.softmax(self.betas[layer][1].to(device=img_device), dim=-1)
    elif layer == 2:
        normalized_betas[layer][0][1:] = F.softmax(self.betas[layer][0][1:].to(device=img_device), dim=-1)
        normalized_betas[layer][1] = F.softmax(self.betas[layer][1].to(device=img_device), dim=-1)
        normalized_betas[layer][2] = F.softmax(self.betas[layer][2].to(device=img_device), dim=-1)
    else:
        normalized_betas[layer][0][1:] = F.softmax(self.betas[layer][0][1:].to(device=img_device), dim=-1)
        normalized_betas[layer][1] = F.softmax(self.betas[layer][1].to(device=img_device), dim=-1)
        normalized_betas[layer][2] = F.softmax(self.betas[layer][2].to(device=img_device), dim=-1)
        normalized_betas[layer][3][:1] = F.softmax(self.betas[layer][3][:1].to(device=img_device), dim=-1)
```
This section of code implements the β normalization applied before further inference, following the formula you mentioned.
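For what it's worth, here is a minimal, self-contained sketch of the same per-node softmax idea; this is not the repository code. It assumes 4 spatial levels, the `[0: ➚, 1: ➙, 2: ➘]` layout from the snippet above, and that the top level drops ➚ while the bottom level drops ➘; the names and shapes (`betas`, `num_layers`, `num_levels`) are illustrative only.

```python
# Sketch only: per-node softmax over the feasible network-level directions.
import torch
import torch.nn.functional as F

num_layers, num_levels = 12, 4
betas = torch.randn(num_layers, num_levels, 3)  # [layer, level, direction]

normalized_betas = torch.zeros_like(betas)
for layer in range(num_layers):
    # Only levels 0..layer (capped at num_levels) are reachable this early in the trellis.
    for level in range(min(layer + 1, num_levels)):
        if level == 0:
            # Top level: no ➚ transition, so softmax over [➙, ➘] only.
            normalized_betas[layer, level, 1:] = F.softmax(betas[layer, level, 1:], dim=-1)
        elif level == num_levels - 1:
            # Bottom level: no ➘ transition, so softmax over [➚, ➙] only.
            normalized_betas[layer, level, :2] = F.softmax(betas[layer, level, :2], dim=-1)
        else:
            # Interior level: all three transitions are feasible.
            normalized_betas[layer, level] = F.softmax(betas[layer, level], dim=-1)
```

In this sketch the slicing just restricts the softmax to the transitions that exist at the boundary levels, so each node's normalized weights still sum to 1 over its feasible outgoing connections.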