lukemelas / PyTorch-Pretrained-ViT

Vision Transformer (ViT) in PyTorch

self.pos_embedding error training on different dataset #14

Open khawar-islam opened 3 years ago

khawar-islam commented 3 years ago

I am using a pre-trained ViT model and training it on a different task, but I got an error.

model.py file

class PositionalEmbedding1D(nn.Module):
    """Adds (optionally learned) positional embeddings to the inputs."""

    def __init__(self, seq_len, dim):
        super().__init__()
        self.pos_embedding = nn.Parameter(torch.zeros(1, seq_len, dim))

    def forward(self, x):
        """Input has shape `(batch_size, seq_len, emb_dim)`"""
        return x + self.pos_embedding

Traceback

    result = self.forward(*input, **kwargs)
  File "/media/khawar/HDD_Khawar/n/Pretrained_ViT/pytorch_pretrained_vit/model.py", line 24, in forward
    return x + self.pos_embedding
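
The traceback above is cut off before the actual error message, but the fix that comes up later in this thread (setting image_size) suggests that the addition in forward fails because the input sequence length differs from the seq_len the positional embedding was created with. Below is a minimal sketch of that arithmetic. It assumes a patch size of 16 for the B_16 models, one extra class token, and a 384x384 pretraining resolution; none of these three values is stated in this thread. num_tokens is a hypothetical helper for illustration, not part of the library, and its divisibility check is a simplification.

```python
def num_tokens(image_size: int, patch_size: int = 16, class_token: bool = True) -> int:
    """Sequence length a ViT produces for a square image (hypothetical helper)."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side ** 2 + (1 if class_token else 0)


# Assumed resolution that the checkpoint's pos_embedding was sized for.
pretrained_len = num_tokens(384)   # 24 * 24 + 1 = 577
# A dataset with 224x224 images yields a shorter sequence.
dataset_len = num_tokens(224)      # 14 * 14 + 1 = 197

# Under these assumptions x has shape (batch, 197, dim) while pos_embedding
# has shape (1, 577, dim), so `x + self.pos_embedding` cannot broadcast
# along the sequence axis.
shapes_match = pretrained_len == dataset_len
print(pretrained_len, dataset_len, shapes_match)  # 577 197 False
```

If the two lengths differ for your dataset, the positional embedding has to be built for your image size, which is what the image_size argument discussed below appears to do.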

culiver commented 2 years ago

Same problem!

culiver commented 2 years ago

Solved by assigning the image_size

helloworld-wu commented 2 years ago

Hello, I have the same problem. Could you tell me how to assign the image_size? Thank you!

culiver commented 2 years ago

I assigned the image size as in the code below:

model = ViT('B_16_imagenet1k', pretrained=True, num_classes=37, image_size=opt.input_size).cuda()

Note: do not put this line inside nn.Sequential, or the input size might not change.

helloworld-wu commented 2 years ago

Thanks!