Closed djaekim closed 3 years ago
Yes, `eval_steps` should be less than the total number of training steps. Otherwise, you will never get a saved model.
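As a rough sanity check (a hypothetical helper, not part of `run.py`): the evaluation branch fires whenever `(global_step + 1) % eval_steps == 0`, so the number of eval/checkpoint passes is approximately the integer quotient of the total optimizer steps by `eval_steps`, and must be at least 1 for a model to be saved:

```python
def num_checkpoints(total_steps: int, eval_steps: int) -> int:
    # Evaluation fires when (global_step + 1) % eval_steps == 0, so over
    # total_steps optimizer steps it fires roughly total_steps // eval_steps times.
    return total_steps // eval_steps

print(num_checkpoints(50000, 1000))  # 50 checkpoints
print(num_checkpoints(23, 1000))     # 0 -- no model is ever saved
```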
Hi, but even if `num_train_optimization_steps` is greater than `eval_steps`, it seems like `global_step` may not be, since the number of loop iterations is determined by `bar = range(num_train_optimization_steps*args.train_batch_size//len(train_examples))`.
Check the following code:

```python
bar = range(num_train_optimization_steps * args.train_batch_size // len(train_examples))
for step in bar:
    if (nb_tr_steps + 1) % args.gradient_accumulation_steps == 0:
        # Update parameters
        optimizer.step()
        optimizer.zero_grad()
        scheduler.step()
        global_step += 1
        eval_flag = True
    if args.do_eval and ((global_step + 1) % args.eval_steps == 0) and eval_flag:
        # do eval
        ...
```
With `num_train_optimization_steps = 100000`, `train_batch_size = 32`, and `len(train_examples) = 133874`, `bar` has only `(100000*32) // 133874 == 23` iterations, so `global_step` is incremented at most 23 times.
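A self-contained simulation of the loop above (constants taken from this thread; variable names are only illustrative, and the real training step is omitted) shows that with the old loop bound the eval branch is unreachable for, say, `eval_steps = 1000`:

```python
num_train_optimization_steps = 100000
train_batch_size = 32
num_train_examples = 133874  # stands in for len(train_examples)
gradient_accumulation_steps = 1
eval_steps = 1000

global_step = 0
nb_tr_steps = 0
evals_run = 0

# Old loop bound: only 23 iterations, so global_step never approaches eval_steps.
bar = range(num_train_optimization_steps * train_batch_size // num_train_examples)
for step in bar:
    nb_tr_steps += 1
    if (nb_tr_steps + 1) % gradient_accumulation_steps == 0:
        global_step += 1
    if (global_step + 1) % eval_steps == 0:
        evals_run += 1

print(len(bar), global_step, evals_run)  # 23 23 0
```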
!! UPDATE !!
It seems I was using an older version of the code, which defines `bar = range(num_train_optimization_steps*args.train_batch_size//len(train_examples))` instead of `bar = range(num_train_optimization_steps)`. Sorry for any confusion I have caused. The current code has `bar = range(num_train_optimization_steps)` at line 355.
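With the current loop bound, `global_step` climbs all the way to `num_train_optimization_steps`, so the eval condition is reachable. A minimal sketch (illustrative constants, training step omitted, assuming the eval check follows each optimizer update):

```python
num_train_optimization_steps = 100000
gradient_accumulation_steps = 1
eval_steps = 1000

global_step = 0
evals_run = 0

# Current loop bound: one iteration per intended optimizer step.
for step in range(num_train_optimization_steps):
    if (step + 1) % gradient_accumulation_steps == 0:
        global_step += 1
        if (global_step + 1) % eval_steps == 0:
            evals_run += 1

print(global_step, evals_run)  # 100000 100
```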
The code I am referring to is at https://github.com/microsoft/CodeXGLUE/blob/main/Code-Code/code-to-code-trans/code/run.py#L384
I was thinking `((global_step + 1) % args.eval_steps == 0)` will never be true if the length of `bar = range(num_train_optimization_steps*args.train_batch_size//len(train_examples))` is less than `eval_steps` (i.e., `(global_step + 1) % eval_steps` will never evaluate to 0). Thus, in --do-test I saw that it throws an error if you try to use the saved model.
Am I doing something wrong? Thank you in advance.