d2l-ai / d2l-en

Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.
https://D2L.ai

Number of parameters in Transformer #929

Closed SoojungHong closed 4 years ago

SoojungHong commented 4 years ago

Hi, all

I want to count the number of trainable parameters in a Transformer. In PyTorch, there is a function (e.g. encoder.parameters()) to get the parameters. But it seems that d2l.EncoderDecoder(TransformerEncoder, TransformerDecoder) doesn't provide such a function. The reference model is the following Transformer: https://d2l.ai/chapter_attention-mechanisms/transformer.html

Anyone know how to count the parameter number in Transformer?

AnirudhDagar commented 4 years ago

You can try using collect_params() for any network and loop over the params like this:

from mxnet import np  # the NumPy-compatible interface used throughout d2l

# Sum the element count of every parameter tensor in the network.
num_params = 0
for p in net.collect_params().values():
    num_params += np.prod(np.array(p.data().shape))

I'm not sure whether a more concise method exists, but this should work.
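Since the question was framed in PyTorch terms, here is a minimal sketch of the same counting loop in PyTorch. The toy nn.Linear stands in for the Transformer model; count_parameters is a hypothetical helper, not part of d2l.

```python
import torch.nn as nn

def count_parameters(net):
    """Sum the element counts of all trainable parameter tensors."""
    return sum(p.numel() for p in net.parameters() if p.requires_grad)

# Toy stand-in for the Transformer: weight is 10*5 = 50, bias is 5.
net = nn.Linear(10, 5)
print(count_parameters(net))  # 55
```

The same loop works on any nn.Module, including a full d2l.EncoderDecoder built from PyTorch layers.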

SoojungHong commented 4 years ago

Thank you so much. I was able to count the parameters with your advice.

goldmermaid commented 4 years ago

Hey @SoojungHong , could you close this issue if everything is clear?

SoojungHong commented 4 years ago

ok, thanks a lot for your help!