iliaohai opened this issue 1 month ago
Hi,
This is expected if you set bsz=1. We recommend a larger batch size so the regularizer can work effectively; in our paper we use bsz=256.
For the mathematical reason behind this, see Eq. 6 and Appendix Eq. 11. In the implementation, bsz=1 results in N=1, which makes the contrastive sampling always draw the negative from the sample itself, so the lower bound is always 0.
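To see the collapse concretely, here is a minimal sketch of an InfoNCE-style lower bound. The function name and shapes are illustrative assumptions based on the `T0`/`T1` shapes in the log below, not the repository's actual code:

```python
import math
import torch

def infonce_lower_bound(T0: torch.Tensor, T1: torch.Tensor) -> torch.Tensor:
    # T0: critic scores of the N positive pairs, shape [N, 1].
    # T1: critic scores of each x against all N y's in the batch,
    #     shape [N, N, 1]; row i scores x_i against y_1..y_N.
    N = T1.shape[1]
    positive = T0.mean()
    # log-mean-exp over the N in-batch "negatives" for each sample
    negative = (torch.logsumexp(T1, dim=1) - math.log(N)).mean()
    return positive - negative

# bsz=1: the only "negative" is the positive pair itself, so
# logsumexp(T1) - log(1) == T0 and the bound is exactly 0.
T0 = torch.tensor([[0.5524]])           # [1, 1], as in the log below
T1 = T0.unsqueeze(0)                    # [1, 1, 1]
print(infonce_lower_bound(T0, T1))      # tensor(0.)

# bsz>1: in-batch negatives differ from the positive, so the
# estimate is generally non-zero and carries gradient signal.
torch.manual_seed(0)
scores = torch.randn(4, 4, 1)                              # hypothetical critic scores
T0_big = torch.diagonal(scores.squeeze(-1)).unsqueeze(-1)  # [4, 1]
print(infonce_lower_bound(T0_big, scores))                 # non-zero in general
```

With N > 1 the log-mean-exp term includes genuinely different negatives, which is what makes the regularizer informative and why a larger batch size such as bsz=256 is recommended.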
Hi, I'm trying to use the Anchored Feature Regularizer as you suggested, but I've run into a problem: when batch_size=1, lowerbound_loss is always 0. After checking the code, I found it's caused by the output below. Can you help me? Thanks.
```
========print==========
used InfoNCE
x_shape: torch.Size([1, 1024])
tensor([[-1.1419,  0.0000,  0.1771,  ..., -2.2108,  0.0000,  0.5778]],
       device='cuda:0', grad_fn=)
y_samples: torch.Size([1, 1024])
tensor([[-0.1306, -0.1803, -0.0562,  ..., -0.0158, -0.0930, -0.0641]],
       device='cuda:0', grad_fn=)
T0: tensor([[0.5524]], device='cuda:0', grad_fn=)
T1: tensor([[[0.5524]]], device='cuda:0', grad_fn=)
lower_bound: tensor(0., device='cuda:0', grad_fn=)
```