Thanks @jakiejj for noticing the issue.
throttle_secs is set to 600s, which causes issues for our test run.
The problem is that best_checkpoint_copier and export are not executed, since everything finishes in less than 600s in the run_detext.sh example. The evaluation at the end is therefore misleading. Note that this is not a problem for longer runs (>>600s), but we should fix it.
The problem: eval_log for run_detext.sh when throttle_secs is too large (previously 600 secs):
***** Evaluation on dev set during training *****
## Step 1
loss : 1.0715610980987549
Checking checkpoint model.ckpt-1
keeping checkpoint model.ckpt-1 with metric/ndcg@10 = 0.8154648542404175
## Step 10
loss : 0.4477139115333557
Checking checkpoint model.ckpt-10
keeping checkpoint model.ckpt-10 with metric/ndcg@10 = 1.0
removing old checkpoint model.ckpt-1 with metric/ndcg@10 = 0.8154648542404175
***** Training finished. *****
***** Evaluation on test set with best exported model: *****
global_step = 0
loss = 1.4337275
metric/ndcg@10 = 0.7153383
This PR changes throttle_secs to 0 so that evaluation starts right away whenever there is a new checkpoint.
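To illustrate why a large throttle hides most checkpoints from evaluation in a short run, here is a minimal sketch. It is a simplified model, not the actual training code: it assumes throttle_secs acts as a minimum gap between two evaluations and that the final checkpoint is always evaluated when training ends. The function name `evaluated_steps` and the one-checkpoint-per-second timings are hypothetical.

```python
def evaluated_steps(checkpoint_times, throttle_secs):
    """Return the steps whose checkpoints get evaluated under a simple throttle model.

    checkpoint_times: list of (step, saved_at_seconds) pairs in save order.
    """
    if not checkpoint_times:
        return []
    evaluated = []
    last_eval_time = None
    for step, saved_at in checkpoint_times:
        # Evaluate the first checkpoint, then only after throttle_secs have passed.
        if last_eval_time is None or saved_at - last_eval_time >= throttle_secs:
            evaluated.append(step)
            last_eval_time = saved_at
    # Assumption: the last checkpoint is always evaluated once training finishes.
    final_step = checkpoint_times[-1][0]
    if evaluated[-1] != final_step:
        evaluated.append(final_step)
    return evaluated


# Hypothetical timings: ten checkpoints, saved one second apart.
checkpoints = [(step, float(step)) for step in range(1, 11)]

print(evaluated_steps(checkpoints, throttle_secs=600))  # [1, 10]
print(evaluated_steps(checkpoints, throttle_secs=0))    # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

Under these assumptions, a 600s throttle evaluates only steps 1 and 10, while a throttle of 0 evaluates every checkpoint, which is the pattern seen in the two eval logs.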
Fixes # (issue)
Type of change
[x] Bug fix (non-breaking change which fixes an issue)
List all changes
changed throttle_secs to 0 in train
Testing
new eval_log:
***** Evaluation on dev set during training *****
## Step 1
loss : 1.060211181640625
Checking checkpoint model.ckpt-1
keeping checkpoint model.ckpt-1 with metric/ndcg@10 = 0.75
## Step 2
loss : 0.9183305501937866
Checking checkpoint model.ckpt-2
keeping checkpoint model.ckpt-2 with metric/ndcg@10 = 0.8154648542404175
removing old checkpoint model.ckpt-1 with metric/ndcg@10 = 0.75
## Step 3
loss : 0.8105578422546387
Checking checkpoint model.ckpt-3
skipping checkpoint model.ckpt-3 with metric/ndcg@10 = 0.8154648542404175
## Step 4
loss : 0.7126567363739014
Checking checkpoint model.ckpt-4
skipping checkpoint model.ckpt-4 with metric/ndcg@10 = 0.8154648542404175
## Step 5
loss : 0.6146305203437805
Checking checkpoint model.ckpt-5
skipping checkpoint model.ckpt-5 with metric/ndcg@10 = 0.8154648542404175
## Step 6
loss : 0.5369703769683838
Checking checkpoint model.ckpt-6
keeping checkpoint model.ckpt-6 with metric/ndcg@10 = 1.0
removing old checkpoint model.ckpt-2 with metric/ndcg@10 = 0.8154648542404175
## Step 7
loss : 0.492448627948761
Checking checkpoint model.ckpt-7
skipping checkpoint model.ckpt-7 with metric/ndcg@10 = 1.0
## Step 8
loss : 0.46203750371932983
Checking checkpoint model.ckpt-8
skipping checkpoint model.ckpt-8 with metric/ndcg@10 = 1.0
## Step 9
loss : 0.4408116936683655
Checking checkpoint model.ckpt-9
skipping checkpoint model.ckpt-9 with metric/ndcg@10 = 1.0
## Step 10
loss : 0.43111300468444824
Checking checkpoint model.ckpt-10
skipping checkpoint model.ckpt-10 with metric/ndcg@10 = 1.0
***** Training finished. *****
***** Evaluation on test set with best exported model: *****
global_step = 6
loss = 0.5369704
metric/ndcg@10 = 1.0
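The keep/skip/remove pattern in the new eval_log can be replayed with a short sketch. This is not the project's best-checkpoint copier. It is a simplified stand-in that assumes a single best checkpoint is retained and that a new checkpoint replaces it only when its metric is strictly higher. The function name `replay_best_checkpoint` is hypothetical. The metric values are copied from the log above.

```python
def replay_best_checkpoint(metrics_by_step):
    """Replay keep/skip/remove decisions, retaining only the single best checkpoint."""
    best_step, best_metric = None, float("-inf")
    actions = []
    for step, metric in metrics_by_step:
        if metric > best_metric:
            # Strictly better: keep the new checkpoint and drop the previous best.
            actions.append(("keeping", step))
            if best_step is not None:
                actions.append(("removing", best_step))
            best_step, best_metric = step, metric
        else:
            # Ties and worse results leave the current best in place.
            actions.append(("skipping", step))
    return best_step, actions


# metric/ndcg@10 per step, taken from the new eval_log.
ndcg_by_step = [
    (1, 0.75),
    (2, 0.8154648542404175),
    (3, 0.8154648542404175),
    (4, 0.8154648542404175),
    (5, 0.8154648542404175),
    (6, 1.0),
    (7, 1.0),
    (8, 1.0),
    (9, 1.0),
    (10, 1.0),
]

best_step, actions = replay_best_checkpoint(ndcg_by_step)
print(best_step)  # 6
for action, step in actions:
    print(f"{action} checkpoint model.ckpt-{step}")
```

Under this model the best checkpoint is step 6, which agrees with global_step = 6 in the final test evaluation.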
Test Configuration:
Firmware version:
Hardware:
Toolchain:
SDK:
Checklist
[ ] My code follows the style guidelines of this project
[ ] I have performed a self-review of my own code
[ ] I have commented my code, particularly in hard-to-understand areas
[ ] I have made corresponding changes to the documentation
[ ] My changes generate no new warnings
[ ] I have added tests that prove my fix is effective or that my feature works
[ ] New and existing unit tests pass locally with my changes
[ ] Any dependent changes have been merged and published in downstream modules