Hello, according to your paper and GitHub page, the parameter size is either 5.23M (deterministic predictor) or 6.03M (stochastic predictor).
However, the size of the pretrained model far exceeds those figures.
In the text & latent encoder there are 18 convs in total, each having a weight of shape 192x192x5, which alone gives (192x192x5) x 18 x 4 bytes (a float32 is 4 bytes) = 12.65625 MB.
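For reference, here is that arithmetic as a minimal Python sketch. The layer count (18) and weight shape (192x192x5) come from my reading of the encoder code and ignore everything else in the model:

```python
# Rough size of the encoder conv weights alone:
# 18 Conv1d layers, each with a 192x192x5 weight (out x in x kernel).
n_convs = 18
weight_params = 192 * 192 * 5            # parameters per conv weight
total_params = n_convs * weight_params   # 3,317,760 parameters
bytes_per_float = 4                      # float32
size_mb = total_params * bytes_per_float / 1024**2

print(f"{total_params:,} params, {size_mb} MB")  # 3,317,760 params, 12.65625 MB
```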
If we also include the conv biases, the embedding layer, the projection layer, the duration predictor module, the norm layers, and the decoder module, the total will be even larger.
Can you tell me how you calculated the parameter size?
EDIT
I noticed that the given value is the number of parameters, not their size in megabytes. Thank you.
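For anyone else who hits the same confusion: an easy way to verify the reported figure is to count parameters in the checkpoint directly rather than summing bytes. A generic PyTorch sketch (the checkpoint path is a placeholder, and this assumes a flat state dict; the actual checkpoint layout in this repo may nest it under a key):

```python
import torch

# Load the checkpoint on CPU and count tensor elements.
# "path/to/checkpoint.pt" is a placeholder, not a file from this repo.
state_dict = torch.load("path/to/checkpoint.pt", map_location="cpu")
n_params = sum(t.numel() for t in state_dict.values() if torch.is_tensor(t))

print(f"{n_params / 1e6:.2f}M parameters")           # the count papers usually report
print(f"{n_params * 4 / 1024**2:.2f} MB (float32)")  # size on disk is roughly count x 4 bytes
```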