wjko2 / Domain-Agnostic-Sentence-Specificity-Prediction


Number of epochs #15

Open JoshuaMathias opened 2 years ago

JoshuaMathias commented 2 years ago

I did a review of parameters and variables related to the number of epochs:

  1. n_epochs (5): Has no effect: it is assigned to the variable "gg", but gg is immediately overwritten by ne0. Stands for "number of epochs" for training.
  2. ne0 (100): Used in the inner main training loop to set "gg". Stands for "number of epochs".
  3. se (4): Used in the innermost training loop in trainepoch(). Once the epoch number is above this value, "loss3 x c2" is no longer added to the loss. Perhaps this is the number of epochs for "self-ensembling", which is what SE refers to in the paper.
  4. me (31): This determines the actual number of epochs: it sets stop_training to True, which breaks both the inner loop and main and leaves the outer loop with nothing to do, and it is reached before ne0 (100). Not sure what it stands for ("m" epochs?). It may have been used to decide when to store a version of the model during training, since there is commented-out code that saves the model. See the sketch after this list.
  5. bb (0): This is used to select the function trainepoch instead of trainepochb, but trainepochb doesn't exist.
  6. sss (50): This is used for the number of iterations of the outer main loop.
  7. epoch: This is the variable passed to trainepoch; it is what gets printed and compared against me. It is set outside the outer loop.
  8. epochh: This is set to 1 within the outer loop and is compared with gg in the inner loop, where gg is set by ne0. Under the default parameters, epochh has no effect.
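
Putting these together, here is a rough, hypothetical sketch of how the loops interact under the default values. The parameter and variable names (n_epochs, ne0, me, sss, gg, epoch, epochh, stop_training) are the ones listed above, but the structure is my reconstruction for illustration, not a copy of the actual code:

class Params:
    n_epochs = 5   # item 1: ignored, since gg is overwritten right after
    ne0 = 100      # item 2: bounds the inner loop via gg
    me = 31        # item 4: what actually stops training
    sss = 50       # item 6: number of outer-loop iterations

params = Params()

def trainepoch(epoch):
    print("epoch", epoch)  # stand-in for one epoch of training
    return 0.0

stop_training = False
epoch = 1                  # item 7: set outside the outer loop

for _ in range(params.sss):          # outer main loop
    if stop_training:
        continue                     # after me epochs, the outer loop does nothing
    epochh = 1                       # item 8: no effect under the defaults
    gg = params.n_epochs             # item 1: assigned...
    gg = params.ne0                  # ...and immediately overwritten by ne0
    while not stop_training and epochh <= gg:   # inner loop, bounded by ne0
        train_acc = trainepoch(epoch)
        epoch += 1
        epochh += 1
        if epoch > params.me:        # me (31) is hit long before ne0 (100)
            stop_training = True

Running this sketch prints epochs 1 through 31, which matches the observation below that 31 epochs are actually run.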

I also noticed that the number of epochs actually run is 31, matching the default for me.

To avoid confusion, I'm doing the following:

  1. Keeping only the variables n_epochs and se (renamed to se_epochs) and setting n_epochs to 31 by default, to match the number of epochs currently run.
  2. Retaining only one loop in main.
  3. Only using epoch and not epochh.
  4. Not using stop_training; instead, checking epoch directly in the while loop condition.
  5. Changing "esize" to None by default and only capping the number of samples per epoch when it is set. Also renaming it to "epoch_size". I was experimenting with more data and didn't realize my data was being cut off (see the sketch after the loop below).

The simplified main loop now looks like this:
epoch = 0
while epoch < params.n_epochs:
    train_acc = trainepoch(epoch)
    epoch += 1
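
As a side note on item 5, here is a minimal sketch of the intended epoch_size behaviour. The helper name epoch_indices is hypothetical (mine, not from the repo); the point is just that the training data is only capped when epoch_size is explicitly set:

def epoch_indices(n_train, epoch_size=None):
    # Use the full training set by default; cap the epoch only when
    # epoch_size is explicitly given (the old esize capped it silently).
    n = n_train if epoch_size is None else min(n_train, epoch_size)
    return range(n)

# e.g. with 12,000 training sentences (hypothetical numbers):
# len(epoch_indices(12000))        -> 12000, no silent truncation
# len(epoch_indices(12000, 5000))  -> 5000, opt-in cap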

Commit with changes: https://github.com/JoshuaMathias/Domain-Agnostic-Sentence-Specificity-Prediction/commit/485f441011441979d2862301d6abc4b5aebb7cfe

JoshuaMathias commented 2 years ago

I noticed that when epoch 4 is reached, the loss increases significantly because the se parameter starts to take effect. I realized this is likely when self-ensembling starts, not when it finishes, so I renamed it to _se_epochstart.
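
For reference, here is a minimal sketch of how I read the gating after the rename, assuming, based on the loss jump at epoch 4, that the self-ensembling term starts contributing once the epoch reaches this value. The function name and arguments are illustrative, not copied from the repo:

def combined_loss(loss_main, loss_se, c2, epoch, se_epoch_start=4):
    # Before se_epoch_start only the main loss is optimized; from that epoch on
    # the self-ensembling term (loss3 * c2 in the original code) is added,
    # which explains the visible jump in the reported loss at epoch 4.
    if epoch >= se_epoch_start:
        return loss_main + loss_se * c2
    return loss_main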