Describe Model I am using (beit2/model_pretrain):

On line 130, the size of x should be (batch_size, patch_nums, embed_dim).
Turning to line 137, the size after lm_head should be (batch_size, masked_patch_nums, vocab_size), but it is actually (batch_size*masked_patch_nums, vocab_size); it seems as if nn.Linear merges dimension 0 and dimension 1 into a single new dimension.
When I checked the official documentation of nn.Linear, it says that all but the last dimension keep the same shape as the input, with H_out = out_features.
My PyTorch version is 2.1.0.dev20230831+cu118.
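Here is a minimal standalone sketch of the shapes I am asking about (the sizes, the 75-patch mask, and the idea that lm_head receives a mask-indexed tensor are all my own assumptions for illustration, not taken from the repo code):

```python
import torch
import torch.nn as nn

batch_size, patch_nums, embed_dim, vocab_size = 2, 196, 768, 8192
lm_head = nn.Linear(embed_dim, vocab_size)

x = torch.randn(batch_size, patch_nums, embed_dim)
# nn.Linear itself keeps all leading dimensions, as the docs say:
print(lm_head(x).shape)  # torch.Size([2, 196, 8192])

# Hypothetical boolean mask with 75 masked patches per sample:
bool_masked_pos = torch.zeros(batch_size, patch_nums, dtype=torch.bool)
bool_masked_pos[:, :75] = True
# Boolean-mask indexing flattens the batch and patch dimensions into one,
# which would produce a (batch_size*masked_patch_nums, vocab_size) output:
print(lm_head(x[bool_masked_pos]).shape)  # torch.Size([150, 8192])
```

So on my machine nn.Linear matches the documentation; if line 137 indexes x with a boolean mask before lm_head (just my guess), that would account for the flattened shape I see.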
Thanks for responding!