Closed by zxt6174 4 months ago
In our experience from other fields, a larger batch size does not necessarily lead to better performance. In addition, our method draws negative samples from both the mini-batch and the dynamic queue, so enlarging the batch on its own does not necessarily help. When ablating one variable, we keep the other variable fixed. For example, when varying the queue size, the mini-batch size is fixed at its best-performing value of 128.
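The one-variable-at-a-time protocol described in this reply can be sketched roughly as follows. This is only an illustration, not the repository's actual experiment script: `AblationConfig`, `one_at_a_time`, the candidate values, and the fixed queue size of 4096 are all hypothetical placeholders. Only the fixed mini-batch size of 128 comes from the reply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AblationConfig:
    """One ablation run: a (mini-batch size, queue size) pair."""
    batch_size: int
    queue_size: int


def one_at_a_time(batch_sizes, queue_sizes, fixed_batch_size, fixed_queue_size):
    """Build two sweeps, each varying one variable while the other stays fixed."""
    # Sweep the mini-batch size with the queue size held constant.
    batch_sweep = [AblationConfig(b, fixed_queue_size) for b in batch_sizes]
    # Sweep the queue size with the mini-batch size held constant.
    queue_sweep = [AblationConfig(fixed_batch_size, q) for q in queue_sizes]
    return batch_sweep, queue_sweep


# Candidate values are hypothetical; only batch size 128 is stated in the thread.
batch_sweep, queue_sweep = one_at_a_time(
    batch_sizes=[32, 64, 128, 256],
    queue_sizes=[1024, 4096, 16384],
    fixed_batch_size=128,
    fixed_queue_size=4096,  # placeholder: the thread does not give this value
)

for cfg in queue_sweep:
    print(cfg)
```

Each config in `queue_sweep` keeps `batch_size` at 128, so any difference between those runs can be attributed to the queue size alone.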
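In the ablation experiments on mini-batch size and queue size, why is performance not monotonic in these two variables, that is, neither "the larger the better" nor "the smaller the better"? For example, a batch size of 128 performs best; why is that? When ablating one variable, is the other variable held fixed, and if so, at what value?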