https://arxiv.org/abs/2303.13664 shows that varying the temperature in the contrastive loss during pretraining can help models learn better representations for datasets with long-tail distributions and doesn't hurt performance for datasets with uniform distributions. This could be an interesting method to add to the package.
I would start by running a benchmark on a standard dataset (ImageNet), as we don't have standard long-tail datasets in our benchmarks. It would also be interesting to see whether the method performs well on full ImageNet, since the paper only reports results for ImageNet100-LT. If it also works on default ImageNet, it could be a good default addition to most contrastive models.
TODO:
- cosine_schedule
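A minimal sketch of what the cosine_schedule item could look like. The function name, signature, and the default temperature range of 0.1–1.0 are assumptions, not the paper's exact hyperparameters; the idea from the paper is to oscillate the contrastive temperature between a minimum and a maximum value with a cosine wave over training:

```python
import math


def cosine_temperature_schedule(
    step: int,
    period: int,
    tau_min: float = 0.1,
    tau_max: float = 1.0,
) -> float:
    """Oscillate the contrastive-loss temperature with a cosine wave.

    Starts at tau_max, reaches tau_min halfway through each period,
    and returns to tau_max at the end of the period.
    """
    return tau_min + 0.5 * (tau_max - tau_min) * (
        1.0 + math.cos(2.0 * math.pi * step / period)
    )
```

The returned temperature would then be passed to the NT-Xent/InfoNCE loss at each training step in place of a fixed temperature.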