Open maxwellzh opened 3 years ago
This is not an official implementation, so there is a slight difference in the number of parameters.
Of course, I tried to follow the paper as closely as possible. :)
Also, `num_classes` affects the parameter count.
This is kind of weird. I tested several open-source Conformer implementations (and also implemented it myself), but none of them strictly matches the reported number of parameters. Do you have any idea where the difference might come from?
By the way, `num_classes` is set to 1k according to the paper.
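To make the `num_classes` point concrete, here is a rough sketch of how the output vocabulary size changes the count for a single linear projection. The width of 144 and the 5k alternative are hypothetical values chosen only for illustration, and `output_layer_params` is a made-up helper, not something from this repo.

```python
def output_layer_params(d_model: int, num_classes: int, bias: bool = True) -> int:
    """Parameter count of one linear projection d_model -> num_classes."""
    return d_model * num_classes + (num_classes if bias else 0)


# Hypothetical projection width of 144; compare 1k classes against 5k.
for n in (1000, 5000):
    print(n, output_layer_params(144, n))
```

So a different vocabulary size can shift the total by several hundred thousand parameters on its own, which is worth ruling out before comparing against the paper.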
I'm curious, too. I am only speculating that there may be details not mentioned in the paper.
In the original Conformer paper, the reported numbers of parameters are
However, with the implementation in this repo, the numbers of parameters are slightly different
I get the size with this script
Since the convolution layer kernel size couldn't be set to 32, I set it to 31 instead. But that alone wouldn't account for this much difference in the number of parameters.
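A quick back-of-the-envelope check of the kernel size claim, as a sketch rather than an exact model of this repo: it assumes the depthwise convolution in each Conformer block has one filter per channel with the channel count equal to the encoder dimension, and it uses the S/M/L layer counts and dimensions as I recall them from the paper (16 x 144, 16 x 256, 17 x 512). The helper names are made up for this example.

```python
def depthwise_conv1d_params(channels: int, kernel_size: int, bias: bool = True) -> int:
    """Depthwise 1D conv: one kernel of length kernel_size per channel."""
    return channels * kernel_size + (channels if bias else 0)


def kernel_size_gap(channels: int, num_layers: int, k_a: int = 32, k_b: int = 31) -> int:
    """Total parameter difference across all layers between kernel sizes k_a and k_b."""
    per_layer = (depthwise_conv1d_params(channels, k_a)
                 - depthwise_conv1d_params(channels, k_b))
    return per_layer * num_layers


# Assumed (layers, encoder dim) per model size; treat these as approximate.
for name, layers, dim in (("S", 16, 144), ("M", 16, 256), ("L", 17, 512)):
    print(name, kernel_size_gap(dim, layers))
```

Under these assumptions the gap is only a few thousand parameters per model, far too small to explain a mismatch that shows up at the scale of the reported totals.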