Closed hao-pt closed 6 months ago
Could you share some insights into why you use a fixed coefficient c like this: c = 0.00054 * math.sqrt(math.prod(input.shape[1:])) rather than a specific user's defined value like 0.001 in the paper.
I obtained that specific scaling from table 1.
Thank you, I just found that in the paper.
Could you share some insights into why you use a fixed coefficient c like this: c = 0.00054 * math.sqrt(math.prod(input.shape[1:])) rather than a specific user's defined value like 0.001 in the paper.