zhoudaquan / dvit_repo

MIT License

What's the Norm function? #1

Closed iamhankai closed 3 years ago

iamhankai commented 3 years ago

What's the Norm function in Eq.(3)? LayerNorm?

zhoudaquan commented 3 years ago

> What's the Norm function in Eq.(3)? LayerNorm?

We use batch norm for the experiments in the paper.

iamhankai commented 3 years ago

Thanks a lot.

iamhankai commented 3 years ago

Another question: for the input BxHxNxN, which dim is BN performed along?

zhoudaquan commented 3 years ago

> Another question: for the input BxHxNxN, which dim is BN performed along?

Hi, it is applied along the batch dimension.

iamhankai commented 3 years ago

nn.BatchNorm2d(num_features=H) or nn.BatchNorm2d(num_features=N)?

ggjy commented 3 years ago

Good work! What about the initialization of the HxH matrix? Is it torch.eye or torch.randn?

zhoudaquan commented 3 years ago

> nn.BatchNorm2d(num_features=H) or nn.BatchNorm2d(num_features=N)?

Hi, it should be this one: nn.BatchNorm2d(num_features=H)
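For readers following along, here is a minimal sketch of what `nn.BatchNorm2d(num_features=H)` does to a BxHxNxN tensor. The sizes and the random input are placeholders chosen for illustration, not values from the paper or the repository. `BatchNorm2d` treats dimension 1 (here the heads) as channels, so in training mode each head is normalized with statistics pooled over the batch and both N axes.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration only: batch, heads, tokens.
B, H, N = 4, 8, 16

# Random stand-in for the BxHxNxN attention maps discussed in the thread.
attn = torch.randn(B, H, N, N)

# Heads play the role of channels, so num_features must equal H.
norm = nn.BatchNorm2d(num_features=H)
norm.train()  # use batch statistics rather than running averages

out = norm(attn)

# One mean/variance per head, computed over dims (0, 2, 3).
per_head_mean = out.mean(dim=(0, 2, 3))
per_head_var = out.var(dim=(0, 2, 3), unbiased=False)

print(out.shape)            # same shape as the input
print(norm.weight.shape)    # one learnable scale per head
```

With the default affine parameters (scale 1, shift 0), `per_head_mean` is close to 0 and `per_head_var` is close to 1 for every head. `nn.BatchNorm2d(num_features=N)` would instead raise a shape error on this input unless H happened to equal N.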

zhoudaquan commented 3 years ago

> Good work! What about the initialization of the HxH matrix? Is it torch.eye or torch.randn?

Hi,

Thanks for your interest. This is a good question. I tried both the eye init and the random init, and the results are similar. The results reported in the paper are based on random init. I am also considering adding a set of experiments on initialization to the paper.

Best regards, Zhou Daquan
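The two initializations mentioned above could be compared with a sketch like the following. This is not the repository's code: the class name `HeadMixing`, the `init` argument, the einsum layout, and the unscaled `torch.randn` are all assumptions made for illustration. It assumes the HxH matrix mixes the attention maps across heads before the batch norm discussed earlier in the thread.

```python
import torch
import torch.nn as nn


class HeadMixing(nn.Module):
    """Hypothetical module: learnable HxH mixing across heads, then BatchNorm."""

    def __init__(self, num_heads: int, init: str = "randn"):
        super().__init__()
        if init == "eye":
            # Identity start: each head initially keeps its own attention map.
            theta = torch.eye(num_heads)
        elif init == "randn":
            # Random start; the scale used in the paper is not stated in the thread.
            theta = torch.randn(num_heads, num_heads)
        else:
            raise ValueError(f"unknown init: {init!r}")
        self.theta = nn.Parameter(theta)
        self.norm = nn.BatchNorm2d(num_features=num_heads)

    def forward(self, attn: torch.Tensor) -> torch.Tensor:
        # attn: (B, H, N, N). Output head g is a weighted sum over input heads h.
        mixed = torch.einsum("gh,bhnm->bgnm", self.theta, attn)
        return self.norm(mixed)


# Usage with placeholder sizes (B=2, H=4, N=5).
attn = torch.softmax(torch.randn(2, 4, 5, 5), dim=-1)
eye_mix = HeadMixing(num_heads=4, init="eye")
rand_mix = HeadMixing(num_heads=4, init="randn")
out_eye = eye_mix(attn)
out_rand = rand_mix(attn)
```

With the eye init, the mixing step is a no-op at the start of training, so the module initially reduces to batch norm over the original attention maps. With the random init, every output head starts as a mixture of all input heads.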

iamhankai commented 3 years ago

Thank you very much.