microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
https://aka.ms/GeneralAI
MIT License

Confused about lm_head in Beit2 model pretrain #1283

Open ZhouZhichengSteven opened 1 year ago

ZhouZhichengSteven commented 1 year ago

Describe Model I am using (beit2/model_pretrain): [screenshot of the BEiT-2 pretraining model code around lines 130 and 137]

At line 130, the shape of x should be (batch_size, patch_nums, embed_dim). Turning to line 137, the shape after lm_head should be (batch_size, masked_patch_nums, vocab_size), but here it is (batch_size*masked_patch_nums, vocab_size); it looks as if nn.Linear merges dimension 0 and dimension 1 into a single dimension. When I check the official documentation of nn.Linear, it says the output has the same shape as the input for all but the last dimension, and the last dimension is H_out = out_features. My PyTorch version is 2.1.0.dev20230831+cu118. Thanks for responding!
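For reference, a minimal sketch (with illustrative sizes, not the actual BEiT-2 config) confirming the nn.Linear shape behavior described in the documentation, i.e. that only the last dimension is transformed:

```python
import torch
import torch.nn as nn

# Illustrative sizes; the real BEiT-2 embed_dim / vocab_size may differ.
lm_head = nn.Linear(768, 8192)   # embed_dim -> vocab_size
x = torch.randn(2, 196, 768)     # (batch_size, patch_nums, embed_dim)

# nn.Linear preserves all leading dimensions and maps only the last one.
print(lm_head(x).shape)          # torch.Size([2, 196, 8192])
```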

pengzhiliang commented 1 year ago

Hi @ZhouZhichengSteven, the indexing x[bool_masked_pos] is what reduces the dimensions: boolean-mask indexing keeps only the masked positions and flattens the batch and patch dimensions into one, so lm_head then receives a 2-D tensor.
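A minimal sketch of that behavior (illustrative sizes, assuming a boolean mask of shape (batch_size, patch_nums) as in the pretraining code):

```python
import torch
import torch.nn as nn

batch_size, patch_nums, embed_dim, vocab_size = 2, 196, 768, 8192  # illustrative

x = torch.randn(batch_size, patch_nums, embed_dim)
bool_masked_pos = torch.rand(batch_size, patch_nums) < 0.4  # boolean mask over patches

# Boolean indexing collapses the batch and patch dimensions:
# only the masked positions survive, stacked along a single leading dimension.
masked_x = x[bool_masked_pos]        # (total_masked_patches, embed_dim)

lm_head = nn.Linear(embed_dim, vocab_size)
logits = lm_head(masked_x)           # (total_masked_patches, vocab_size)
print(masked_x.shape, logits.shape)
```

So the (batch_size*masked_patch_nums, vocab_size) shape comes from the masking step before lm_head, not from nn.Linear itself.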