State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
[ELECTRA/TensorFlow2] Minor: README Has Misleading Description Of Warmup #1328
Describe the bug
The README describes the warm-up parameters as a percentage of training steps, yet the default values it shows are integer step counts rather than percentages:
README.md:- <warmup_steps_p1> is the percentage of training steps used for warm-up at the start of training. Default is 2000.
README.md:- <warmup_steps_p2> is the percentage of training steps used for warm-up at the start of training. Default is 200.
and, in the argument help text:
--num_warmup_steps NUM_WARMUP_STEPS
Number of steps of training to perform linear learning
rate warmup for. For example, 0.1 = 10% of training.
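This is misleading: the README calls the value a percentage but gives defaults of 2000 and 200, while the help text calls it a number of steps but gives a fraction (0.1 = 10% of training) as its example.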
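To make the mismatch concrete, here is a minimal sketch of what a fraction-based reading would imply. The helper name `warmup_steps_from_fraction` is hypothetical and is not taken from the ELECTRA scripts; it only illustrates that a default such as 2000 cannot sensibly be read as a percentage or fraction of training.

```python
def warmup_steps_from_fraction(fraction: float, total_steps: int) -> int:
    """Convert a warm-up fraction (0.1 means 10% of training) to a step count.

    Hypothetical helper for illustration only; the actual scripts may treat
    the value differently.
    """
    if not 0.0 <= fraction <= 1.0:
        # A value like 2000 is not a valid fraction of training.
        raise ValueError(f"expected a fraction in [0, 1], got {fraction}")
    return int(round(fraction * total_steps))


if __name__ == "__main__":
    # Under the "fraction" reading, 0.1 of 10,000 steps is 1,000 warm-up steps.
    print(warmup_steps_from_fraction(0.1, 10_000))
    # The documented default of 2000 fails this reading outright.
    try:
        warmup_steps_from_fraction(2000, 10_000)
    except ValueError as err:
        print("rejected:", err)
```

If the value really is an integer step count, as the defaults suggest, no conversion like this is needed and the percentage wording should simply be dropped.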
To Reproduce
See README.md
Expected behavior
The text should be changed to state that the value is an integer number of steps.
It is also unclear whether the warm-up steps are counted as part of the training steps, in which case the number of warm-up steps must be strictly less than the number of training steps.
For example, given
training steps: 10,000
warmup steps: 2,000
are the 2,000 warm-up steps taken out of the 10,000, leaving 8,000 steps for actual training, or are the 2,000 warm-up steps performed first, followed by 10,000 actual training steps?
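The two readings in the example can be sketched as follows. This is an illustration of the question only: the names `StepBudget`, `step_budget`, and `warmup_included` are hypothetical, and nothing here claims which behaviour the ELECTRA/TensorFlow2 script actually implements.

```python
from typing import NamedTuple


class StepBudget(NamedTuple):
    total_steps: int        # optimizer steps executed overall
    warmup_steps: int       # steps spent in linear warm-up
    post_warmup_steps: int  # steps run after warm-up has finished


def step_budget(train_steps: int, warmup_steps: int, warmup_included: bool) -> StepBudget:
    """Compute the step split under one of the two possible interpretations.

    warmup_included=True:  warm-up is carved out of train_steps.
    warmup_included=False: warm-up runs first, then train_steps more steps.
    """
    if train_steps <= 0 or warmup_steps < 0:
        raise ValueError("train_steps must be positive and warmup_steps non-negative")
    if warmup_included:
        # Warm-up must leave at least one step for the rest of training.
        if warmup_steps >= train_steps:
            raise ValueError("warmup_steps must be strictly less than train_steps")
        return StepBudget(train_steps, warmup_steps, train_steps - warmup_steps)
    return StepBudget(train_steps + warmup_steps, warmup_steps, train_steps)


if __name__ == "__main__":
    print(step_budget(10_000, 2_000, warmup_included=True))
    print(step_budget(10_000, 2_000, warmup_included=False))
```

With the numbers from the example, the first reading runs 10,000 steps in total with 8,000 after warm-up, and the second runs 12,000 in total with 10,000 after warm-up. The README should say which one applies.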
Related to ELECTRA/TensorFlow2
Environment N/A