In Appendix G of the paper, the authors mention that they used an adaptive loss threshold based on [1], set to 40%. However, after reading the cited paper, I don't believe DOM can use exactly the same thresholding method, since the context is quite different. Could you please elaborate on how this adaptive loss threshold was implemented? My initial guess is that you set the adaptive loss threshold to 0.4 times the average loss of a mini-batch. Is that correct?
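To make my guess concrete, here is a minimal sketch of what I have in mind. Everything here is an assumption on my part (the function name, the comparison direction, and the idea of masking per mini-batch), not the paper's actual method:

```python
import numpy as np

def adaptive_loss_mask(losses, fraction=0.4):
    """My guess at the adaptive threshold (hypothetical, not the
    paper's implementation): keep samples whose loss falls below
    `fraction` times the mean loss of the current mini-batch."""
    threshold = fraction * np.mean(losses)  # threshold adapts to each batch
    return losses < threshold

# Example mini-batch of per-sample losses
losses = np.array([0.1, 0.5, 1.2, 0.05])
mask = adaptive_loss_mask(losses)  # → [True, False, False, True]
```

If the actual implementation differs (e.g., thresholding above rather than below the scaled mean, or using a running average across batches instead of the per-batch mean), I'd appreciate a correction.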
Currently, I'm trying to apply DOM to ImageNet. Since the paper presents no ImageNet experiments, I have no reference point for choosing a loss threshold, which is why I'm turning to this adaptive version. Any additional detail you could provide would be greatly appreciated.
[1] Berthelot, D., Roelofs, R., Sohn, K., Carlini, N., & Kurakin, A. (2021, October). AdaMatch: A Unified Approach to Semi-Supervised Learning and Domain Adaptation. In International Conference on Learning Representations.