LTH14 / rcg

PyTorch implementation of RCG https://arxiv.org/abs/2312.03701
MIT License

About the BatchNorm1d #9

Closed 920232796 closed 8 months ago

920232796 commented 8 months ago

Thank you for your great work. I have run into a strange problem:

When I run the code on multiple GPUs, I don't have any problems.

But when I test on a single GPU, I get an error from BatchNorm1d:

return torch.batch_norm(
RuntimeError: running_mean should contain 197 elements not 4096

def build_mlp(num_layers, input_dim, mlp_dim, output_dim, last_bn=True):
    mlp = []
    for l in range(num_layers):
        dim1 = input_dim if l == 0 else mlp_dim
        dim2 = output_dim if l == num_layers - 1 else mlp_dim

        mlp.append(nn.Linear(dim1, dim2, bias=False))

        if l < num_layers - 1:
            mlp.append(nn.BatchNorm1d(dim2))
            mlp.append(nn.ReLU(inplace=True))
        elif last_bn:
            # follow SimCLR's design: https://github.com/google-research/simclr/blob/master/model_util.py#L157
            # for simplicity, we further removed gamma in BN
            mlp.append(nn.BatchNorm1d(dim2, affine=False))

    return nn.Sequential(*mlp)

If the input shape is [batch, 197, dim2], the input cannot be passed into a BatchNorm1d(dim2) layer (dim2=4096).

Could you help me with this question? Thank you!

LTH14 commented 8 months ago

Which command do you use? I never tested the single-GPU scenario, so the current code might not be compatible with it.

920232796 commented 8 months ago

Thank you for your reply!

I am not running this source code directly; I use it in my own codebase, and that is where I found this problem. I tested BatchNorm1d on its own like this:

import torch 
import torch.nn as nn 

model = nn.BatchNorm1d(16)
t1 = torch.rand(2, 8, 16)

out = model(t1)

print(out.shape)

This code raises the following error:

return torch.batch_norm(
RuntimeError: running_mean should contain 8 elements not 16

But I find it very strange that it works well on multiple GPUs.

LTH14 commented 8 months ago

I just tried to run RCG's code with 1 GPU (still with DDP) and it works fine. Your test code seems to be wrong: the num_features of BatchNorm1d should be 8 if your t1 has 8 channels.

920232796 commented 8 months ago

But in this code:

def build_mlp(num_layers, input_dim, mlp_dim, output_dim, last_bn=True):
    mlp = []
    for l in range(num_layers):
        dim1 = input_dim if l == 0 else mlp_dim
        dim2 = output_dim if l == num_layers - 1 else mlp_dim

        mlp.append(nn.Linear(dim1, dim2, bias=False))

        if l < num_layers - 1:
            mlp.append(nn.BatchNorm1d(dim2))
            mlp.append(nn.ReLU(inplace=True))
        elif last_bn:
            # follow SimCLR's design: https://github.com/google-research/simclr/blob/master/model_util.py#L157
            # for simplicity, we further removed gamma in BN
            mlp.append(nn.BatchNorm1d(dim2, affine=False))

    return nn.Sequential(*mlp)

The output features of the Linear layer and the BatchNorm1d layer are both "dim2", so why does this work?

mlp.append(nn.Linear(dim1, dim2, bias=False))
if l < num_layers - 1:
    mlp.append(nn.BatchNorm1d(dim2))

LTH14 commented 8 months ago

This is because the input to this MLP has shape [B, C], where B is the batch size and C is the number of channels. It does not have a third dimension.

LTH14 commented 8 months ago
[Image: screenshot of the PyTorch BatchNorm1d documentation]
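
For readers following along, the layout rule discussed above can be sketched as follows. BatchNorm1d accepts either [B, C] or [B, C, L] input, so the size of dim 1 has to match num_features. The tensor sizes mirror the test code earlier in the thread. The transpose in the last case is only an illustration of the layout rule, not something the RCG code does.

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(16)  # num_features = 16

# Case 1: 2D input [B, C]. C equals num_features, so this works.
x2d = torch.rand(2, 16)
out2d = bn(x2d)

# Case 2: 3D input is read as [B, C, L], so dim 1 must equal num_features.
# A [B, L, C] tensor such as [2, 8, 16] therefore fails, as in the test above.
x3d = torch.rand(2, 8, 16)
try:
    bn(x3d)
    raised = False
except RuntimeError as err:
    raised = True
    print("RuntimeError:", err)

# Case 3 (illustration only): move channels to dim 1, normalize, move them back.
out3d = bn(x3d.transpose(1, 2)).transpose(1, 2)

print(out2d.shape, out3d.shape)
```

In build_mlp the input is [B, C], which corresponds to case 1, so Linear(dim1, dim2) followed by BatchNorm1d(dim2) is consistent.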

920232796 commented 8 months ago

Thank you!

I printed the shape of the features just before they are passed into the "head" layer:

rep = self.pretrained_encoder.forward_features(x_normalized)
if self.pretrained_enc_withproj:
    print(rep.shape)
    rep = self.pretrained_encoder.head(rep)

I get the output:

Epoch 0: 8%|██████ | 3/40 [00:05<00:49, 1.34s/it, loss=0.962, lr=9.99e-5]
torch.Size([12, 197, 768])
torch.Size([12, 197, 768])
torch.Size([12, 197, 768])
torch.Size([12, 197, 768])
torch.Size([12, 197, 768])
torch.Size([12, 197, 768])
torch.Size([12, 197, 768])
torch.Size([12, 197, 768])

I am running this model on eight GPUs.

LTH14 commented 8 months ago

Weird -- that depends on your pretrained_encoder's design. My pretrained_encoder's output is torch.Size([32, 768])
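
A mismatch like this can be caught early with a shape check placed before the head. The sketch below is only an illustration: check_head_input is a hypothetical helper that is not part of RCG, and the tensor sizes are the two shapes reported in this thread.

```python
import torch


def check_head_input(rep: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: fail early unless rep has shape [B, C]."""
    if rep.dim() != 2:
        raise ValueError(
            f"expected encoder features of shape [B, C], got {tuple(rep.shape)}"
        )
    return rep


pooled = torch.rand(32, 768)       # shape reported by the maintainer
tokens = torch.rand(12, 197, 768)  # shape reported by the issue author

check_head_input(pooled)  # passes silently

try:
    check_head_input(tokens)
except ValueError as err:
    message = str(err)
    print(message)
```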

920232796 commented 8 months ago

I wonder whether my timm version is different from yours. My version is:

timm 0.9.10

LTH14 commented 8 months ago

That's the problem. Please make sure you use timm 0.3.2.
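
One way to guard against this kind of version drift is to check the installed package at startup. The following is a minimal sketch using only the standard library. The helper names installed_version and version_matches are hypothetical, and the expected version is the one named in this thread.

```python
from importlib import metadata

EXPECTED_TIMM = "0.3.2"  # version the maintainer asks for in this thread


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_matches(found, expected=EXPECTED_TIMM):
    """True only when the found version equals the expected one exactly."""
    return found == expected


found = installed_version("timm")
if not version_matches(found):
    print(f"timm {EXPECTED_TIMM} expected, found {found}")
```

An exact string comparison is used here because the thread names a single known-good version rather than a range.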

920232796 commented 8 months ago

OK, I will try it! Thank you very very much!!

920232796 commented 8 months ago

It works! Thank you~~