google-research / mixmatch


A question about lambda_u #29

Closed zhaozhengChen closed 4 years ago

zhaozhengChen commented 4 years ago

Hi,

In section 3.5 of your paper:

In all experiments, we linearly ramp up λ_u to its maximum value over the first 16,000 steps of training as is common practice [44].

But the implementation seems to ramp up lambda_u over 1024 epochs (1024*1024 steps).
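For reference, the linear ramp-up described in the paper amounts to something like this minimal sketch (names and the default maximum value are mine, not taken from the repo; the actual maximum is dataset-dependent):

```python
def lambda_u_schedule(step, ramp_steps=16000, lambda_u_max=75.0):
    """Linearly ramp lambda_u from 0 to lambda_u_max over ramp_steps steps,
    then hold it constant for the rest of training."""
    return lambda_u_max * min(step / ramp_steps, 1.0)

print(lambda_u_schedule(8000))   # 37.5, halfway through the ramp
print(lambda_u_schedule(20000))  # 75.0, ramp finished
```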

david-berthelot commented 4 years ago

In the paper, steps are the number of batches. In the code, the step variable contains the number of images seen so far: https://github.com/google-research/mixmatch/blob/master/libml/train.py#L49

So paper steps = code steps / batch size: 1024*1024 / 64 = 16384.
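To make the unit conversion concrete, a minimal sketch using the numbers from this thread:

```python
# The code counts images seen; the paper counts batches.
images_seen = 1024 * 1024   # "steps" in the code, i.e. images
batch = 64                  # labeled batch size
paper_steps = images_seen // batch
print(paper_steps)          # 16384, matching the ~16,000 steps in the paper
```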

miquelmarti commented 3 years ago

Sorry to comment on a closed issue, but I feel it's related. The batch size is 64 for the labeled samples and also 64 for the unlabeled samples, which gives a total batch size of 128. I believe you count only the labeled samples for the purpose of defining steps and tracking the training length. Is that correct? Thanks in advance.
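If that reading is right, the step bookkeeping would amount to something like the sketch below (hypothetical; the repo's actual accounting is in the train.py line linked above):

```python
# Hypothetical sketch: each training step consumes one labeled batch and one
# unlabeled batch, but only the labeled images advance the step counter.
batch = 64
step = 0                     # counted in labeled images seen
for _ in range(16384):       # 16384 batches = 1024*1024 labeled images
    step += batch            # unlabeled images are not counted here
print(step)                  # 1048576 = 1024*1024
```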