NVIDIA / DeepLearningExamples

[ELECTRA/TensorFlow2] Minor: README Has Misleading Description Of Warmup #1328

Open psharpe99 opened 11 months ago

psharpe99 commented 11 months ago

Related to ELECTRA/TensorFlow2

Describe the bug
The README describes each warmup parameter as a percentage of training steps, yet the default value it shows is an integer number of steps rather than a percentage:

README.md:- <warmup_steps_p1> is the percentage of training steps used for warm-up at the start of training. Default is 2000.
README.md:- <warmup_steps_p2> is the percentage of training steps used for warm-up at the start of training. Default is 200.

and

--num_warmup_steps NUM_WARMUP_STEPS

This is misleading.

To Reproduce
See README.md

Expected behavior
The text should be changed to state that each value is an integer number of steps, not a percentage. It is also unclear whether the warmup steps are counted within the training steps, in which case the number of warmup steps must be strictly less than the number of training steps. For example, with 10,000 training steps and 2,000 warmup steps, either the 2,000 warmup steps leave 8,000 steps for actual training, or the 2,000 warmup steps are performed first and are followed by 10,000 actual training steps.
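
To make the two readings concrete, here is a minimal Python sketch. It is not taken from the ELECTRA scripts: the function names `split_steps` and `warmup_lr`, the `warmup_included` flag, and the linear ramp are all assumptions made for illustration only.

```python
def split_steps(num_train_steps, num_warmup_steps, warmup_included=True):
    """Return (warmup, post_warmup, total) step counts under one reading.

    warmup_included=True  : warmup steps are part of num_train_steps.
    warmup_included=False : warmup steps run before num_train_steps.
    Hypothetical helper; the real scripts may behave differently.
    """
    if warmup_included:
        if num_warmup_steps >= num_train_steps:
            raise ValueError("warmup steps must be strictly less than training steps")
        return num_warmup_steps, num_train_steps - num_warmup_steps, num_train_steps
    return num_warmup_steps, num_train_steps, num_train_steps + num_warmup_steps


def warmup_lr(step, base_lr, num_warmup_steps):
    """Assumed linear warmup: ramp from 0 to base_lr, then hold base_lr."""
    if num_warmup_steps > 0 and step < num_warmup_steps:
        return base_lr * step / num_warmup_steps
    return base_lr


# The example from this issue: 10,000 training steps, 2,000 warmup steps.
print(split_steps(10_000, 2_000, warmup_included=True))   # (2000, 8000, 10000)
print(split_steps(10_000, 2_000, warmup_included=False))  # (2000, 10000, 12000)
```

Under the first reading the run takes 10,000 steps in total; under the second it takes 12,000. The README should say which one applies.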

Environment
N/A