Closed: baek85 closed this issue 4 years ago
First of all, thank you for showing interest in RKD.
@baek85 Yes, the number of possible triplets for the angle loss grows with the cube of the batch size. For my experiments, I set the batch size to 120, which was fine on a P40 with 24 GB of GPU memory.
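To make the growth concrete, here is a small counting sketch. It is not taken from the RKD code; the function names are hypothetical, and it assumes that an angle is defined by a vertex example plus an unordered pair of two other distinct examples, while a distance is defined by an unordered pair.

```python
from math import comb


def count_pairs(batch_size: int) -> int:
    """Unordered pairs of distinct examples (used by the distance loss)."""
    return comb(batch_size, 2)


def count_angle_triplets(batch_size: int) -> int:
    """Distinct angles: pick the vertex, then an unordered pair of the others.

    Assumption for illustration: (i, j, k) and (k, j, i) describe the same
    angle at vertex j, so they are counted once.
    """
    return batch_size * comb(batch_size - 1, 2)


if __name__ == "__main__":
    for b in (32, 120, 256):
        print(b, count_pairs(b), count_angle_triplets(b))
```

Under these assumptions, a batch of 120 gives 7,140 pairs and 842,520 angles, while a batch of 256 gives 32,640 pairs and 8,290,560 angles. Roughly doubling the batch size multiplies the pair count by about 4.6 and the angle count by about 9.8.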
If you set the batch size to a large value, I would recommend using only the distance loss.
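For reference, a distance-only loss needs just the pairwise distances within the mini-batch. The following is a dependency-free sketch, not the repository's implementation: the function names are hypothetical, embeddings are plain lists of floats, and it assumes the formulation of normalizing pairwise Euclidean distances by their mean and comparing teacher and student with a Huber loss. A practical version would be vectorized in the training framework.

```python
import math
from itertools import combinations


def normalized_pairwise_distances(embeddings):
    """Euclidean distance for every unordered pair, divided by the mean distance.

    Assumes at least two embeddings that are not all identical, so the mean
    distance is non-zero.
    """
    dists = [math.dist(a, b) for a, b in combinations(embeddings, 2)]
    mean = sum(dists) / len(dists)
    return [d / mean for d in dists]


def huber(x, y, delta=1.0):
    """Huber loss between two scalars."""
    diff = abs(x - y)
    if diff <= delta:
        return 0.5 * diff * diff
    return delta * (diff - 0.5 * delta)


def distance_loss(student, teacher):
    """Mean Huber loss between student and teacher normalized pair distances."""
    s = normalized_pairwise_distances(student)
    t = normalized_pairwise_distances(teacher)
    return sum(huber(a, b) for a, b in zip(s, t)) / len(s)


if __name__ == "__main__":
    teacher = [[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]]
    student = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # same shape, smaller scale
    print(distance_loss(student, teacher))
```

Because of the mean normalization, a student whose embeddings are a uniformly scaled copy of the teacher's incurs (up to floating-point error) zero loss. The work here is quadratic in the batch size rather than cubic.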
We need a pair of training examples in RKD-D and a triplet of samples in RKD-A. In the paper, you sample all possible tuples in a mini-batch. I think the number of tuples is too large in common classification settings (e.g., CIFAR10, ImageNet). How should these tuples be sampled in an image classification setting?