microsoft / CodeXGLUE

Code-to-code strict condition for --do_eval #68

Closed djaekim closed 3 years ago

djaekim commented 3 years ago

The code below is from https://github.com/microsoft/CodeXGLUE/blob/main/Code-Code/code-to-code-trans/code/run.py#L384

I was thinking (global_step + 1) % args.eval_steps == 0 will never be satisfied if the length of bar = range(num_train_optimization_steps*args.train_batch_size//len(train_examples)) is less than eval_steps (i.e., (global_step + 1) % args.eval_steps will never evaluate to 0).

Thus, with --do_test I saw that it throws an error if you try to use the saved model.

I was wondering whether I am doing something wrong. Thank you in advance.

guoday commented 3 years ago

Yes, eval_steps should be less than the total number of training steps. Otherwise, you can't get a saved model.
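
For reference, a minimal sketch of the downstream --do_test failure (the checkpoint layout below is an assumption modeled on similar CodeXGLUE scripts, not necessarily this run.py):

import os
import torch

# Hypothetical paths: if the eval/save branch never fired during training,
# the checkpoint file was never written, so loading it here fails.
output_dir = "saved_models"  # stand-in for --output_dir
ckpt = os.path.join(output_dir, "checkpoint-best-bleu", "pytorch_model.bin")
if not os.path.exists(ckpt):
    raise FileNotFoundError(f"{ckpt} missing: eval never ran during training")
state_dict = torch.load(ckpt, map_location="cpu")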

djaekim commented 3 years ago

Hi, but even if num_train_optimization_steps is greater than eval_steps, it seems global_step may not be, since it is bounded by bar = range(num_train_optimization_steps*args.train_batch_size//len(train_examples)).

Check the following code:

bar = range(num_train_optimization_steps*args.train_batch_size//len(train_examples))
for step in bar:
    if (nb_tr_steps + 1) % args.gradient_accumulation_steps == 0:
        # Update parameters
        optimizer.step()
        optimizer.zero_grad()
        scheduler.step()
        global_step += 1
        eval_flag = True

    if args.do_eval and ((global_step + 1) % args.eval_steps == 0) and eval_flag:
        # DO EVAL

For bar = range((100000*32) // 133874) == range(23), global_step is only incremented 23 times, so with any eval_steps greater than 23 the condition above never holds.
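
A quick way to verify that arithmetic (the eval_steps value here is hypothetical; anything above 23 behaves the same):

# Values taken from the comment above; eval_steps is assumed.
num_train_optimization_steps = 100000
train_batch_size = 32
n_train_examples = 133874
eval_steps = 5000

loop_len = num_train_optimization_steps * train_batch_size // n_train_examples
print(loop_len)  # 23 -> global_step never exceeds 23

fires = any((g + 1) % eval_steps == 0 for g in range(loop_len))
print(fires)  # False: the --do_eval branch is dead code under this bound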


!! UPDATE !!

It seems I was using an older version of the code, which defines bar = range(num_train_optimization_steps*args.train_batch_size//len(train_examples)) instead of bar = range(num_train_optimization_steps).

Sorry for any confusion I have caused

guoday commented 3 years ago

bar = range(num_train_optimization_steps) is at line 355 in the current code.
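
With that bound the modulo condition is reachable; a minimal sketch (step counts are hypothetical, and gradient accumulation is ignored so each iteration counts as one optimizer update):

# With bar = range(num_train_optimization_steps), global_step can reach every
# multiple of eval_steps, so eval/save fires train_steps // eval_steps times.
num_train_optimization_steps = 100000
eval_steps = 5000

fires = sum(1 for g in range(num_train_optimization_steps)
            if (g + 1) % eval_steps == 0)
print(fires)  # 20 eval/save events over the full run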