Open JohnHerry opened 11 months ago
I am reading the TextEncoder code in models.py, line 168:
`x = self.emb(x) * math.sqrt(self.hidden_channels)  # [b, t, h]`
My question is: why is the embedding multiplied by `math.sqrt(self.hidden_channels)` here? Is it some kind of normalization? What is the benefit?
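To make the question concrete, here is a minimal sketch of what the factor does numerically. It assumes the embedding weights are initialized from a normal distribution with standard deviation `hidden_channels ** -0.5`; that initialization is an assumption for illustration and is not shown in the line quoted above. The values of `hidden_channels` and `vocab_size` are hypothetical, and plain Python lists stand in for the embedding table. Under that assumption, multiplying by `math.sqrt(hidden_channels)` brings each embedding element back to roughly unit standard deviation.

```python
import math
import random
import statistics

hidden_channels = 192  # hypothetical width, chosen only for illustration
vocab_size = 100       # hypothetical vocabulary size

random.seed(0)

# Assumption: weights are drawn from N(0, hidden_channels ** -0.5).
init_std = hidden_channels ** -0.5
emb = [
    [random.gauss(0.0, init_std) for _ in range(hidden_channels)]
    for _ in range(vocab_size)
]

# Flatten the table and apply the same scaling as the quoted line.
raw = [v for row in emb for v in row]
scaled = [v * math.sqrt(hidden_channels) for v in raw]

raw_std = statistics.pstdev(raw)
scaled_std = statistics.pstdev(scaled)

print(f"std before scaling: {raw_std:.4f}")    # close to init_std
print(f"std after scaling:  {scaled_std:.4f}")  # close to 1
```

If the initialization really is as assumed, the scaling would look like a way to keep the embedding output on a scale of about 1 regardless of `hidden_channels`, but this sketch only shows the arithmetic and does not confirm the authors' intent.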