Description

Thanks @jakiejj for noticing the issue. throttle_secs is currently set to 600s, which causes problems for our test run.

The problem is that best_checkpoint_copier and export are not executed, because everything in the run_detext.sh example finishes in less than 600s. As a result, the evaluation at the end is misleading. This is not a problem for longer runs (>>600s), but we should still fix it.

The eval_log for run_detext.sh when throttle_secs is too large (previously 600s):

***** Evaluation on dev set during training *****
## Step 1
loss : 1.0715610980987549
Checking checkpoint model.ckpt-1
keeping checkpoint model.ckpt-1 with metric/ndcg@10 = 0.8154648542404175

## Step 10
loss : 0.4477139115333557
Checking checkpoint model.ckpt-10
keeping checkpoint model.ckpt-10 with metric/ndcg@10 = 1.0
removing old checkpoint model.ckpt-1 with metric/ndcg@10 = 0.8154648542404175

***** Training finished. *****

***** Evaluation on test set with best exported model: *****
global_step = 0
loss = 1.4337275
metric/ndcg@10 = 0.7153383

This PR changes throttle_secs to 0 so that evaluation starts right away whenever there is a new checkpoint.
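To illustrate why the throttle matters for short runs, here is a minimal sketch of a throttle rule in plain Python. It is not the estimator's actual implementation: the function name `evaluated_steps`, the 5-second step timing, and the rule that the final checkpoint is always evaluated are assumptions, the last one inferred from the old log above, where only steps 1 and 10 were evaluated.

```python
def evaluated_steps(checkpoint_times, throttle_secs):
    """Return the steps evaluated under a simplified throttle rule.

    checkpoint_times: ordered list of (step, seconds_since_start) pairs.
    A checkpoint is evaluated if no evaluation has started yet, or if the
    last evaluation started at least throttle_secs ago (assumed rule).
    """
    evaluated = []
    last_eval_time = None
    for step, t in checkpoint_times:
        if last_eval_time is None or t - last_eval_time >= throttle_secs:
            evaluated.append(step)
            last_eval_time = t
    # Assumption: the final checkpoint is always evaluated when training ends.
    if checkpoint_times and checkpoint_times[-1][0] not in evaluated:
        evaluated.append(checkpoint_times[-1][0])
    return evaluated


# Hypothetical timing: 10 steps, one checkpoint every 5 seconds.
checkpoints = [(step, step * 5.0) for step in range(1, 11)]

print(evaluated_steps(checkpoints, throttle_secs=600))
print(evaluated_steps(checkpoints, throttle_secs=0))
```

Under these assumptions, a 600s throttle evaluates only the first and last checkpoints of a short run, while a throttle of 0 evaluates every checkpoint.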

Fixes # (issue)

Type of change


[x] Bug fix (non-breaking change which fixes an issue)

List all changes

Changed throttle_secs to 0 in train

Testing

new eval_log:

***** Evaluation on dev set during training *****
## Step 1
loss : 1.060211181640625
Checking checkpoint model.ckpt-1
keeping checkpoint model.ckpt-1 with metric/ndcg@10 = 0.75

## Step 2
loss : 0.9183305501937866
Checking checkpoint model.ckpt-2
keeping checkpoint model.ckpt-2 with metric/ndcg@10 = 0.8154648542404175
removing old checkpoint model.ckpt-1 with metric/ndcg@10 = 0.75

## Step 3
loss : 0.8105578422546387
Checking checkpoint model.ckpt-3
skipping checkpoint model.ckpt-3 with metric/ndcg@10 = 0.8154648542404175

## Step 4
loss : 0.7126567363739014
Checking checkpoint model.ckpt-4
skipping checkpoint model.ckpt-4 with metric/ndcg@10 = 0.8154648542404175

## Step 5
loss : 0.6146305203437805
Checking checkpoint model.ckpt-5
skipping checkpoint model.ckpt-5 with metric/ndcg@10 = 0.8154648542404175

## Step 6
loss : 0.5369703769683838
Checking checkpoint model.ckpt-6
keeping checkpoint model.ckpt-6 with metric/ndcg@10 = 1.0
removing old checkpoint model.ckpt-2 with metric/ndcg@10 = 0.8154648542404175

## Step 7
loss : 0.492448627948761
Checking checkpoint model.ckpt-7
skipping checkpoint model.ckpt-7 with metric/ndcg@10 = 1.0

## Step 8
loss : 0.46203750371932983
Checking checkpoint model.ckpt-8
skipping checkpoint model.ckpt-8 with metric/ndcg@10 = 1.0

## Step 9
loss : 0.4408116936683655
Checking checkpoint model.ckpt-9
skipping checkpoint model.ckpt-9 with metric/ndcg@10 = 1.0

## Step 10
loss : 0.43111300468444824
Checking checkpoint model.ckpt-10
skipping checkpoint model.ckpt-10 with metric/ndcg@10 = 1.0

***** Training finished. *****

***** Evaluation on test set with best exported model: *****
global_step = 6
loss = 0.5369704
metric/ndcg@10 = 1.0
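The keeping/skipping pattern in the new eval_log can be replayed with a small sketch. This is not the project's best checkpoint copier: `replay_best_checkpoint` is a hypothetical helper that assumes a single best checkpoint is kept and that a checkpoint only replaces it when its metric is strictly higher, which is consistent with the log above.

```python
def replay_best_checkpoint(metrics_by_step):
    """Replay keep/skip decisions for a list of (step, metric) pairs.

    Assumption: only one checkpoint is kept, and a new checkpoint replaces
    it only when its metric is strictly greater than the current best.
    """
    best_step, best_metric = None, float("-inf")
    actions = []
    for step, metric in metrics_by_step:
        if metric > best_metric:
            actions.append((step, "keeping"))
            best_step, best_metric = step, metric
        else:
            actions.append((step, "skipping"))
    return best_step, actions


# metric/ndcg@10 values copied from the new eval_log above.
ndcg_at_10 = [
    (1, 0.75),
    (2, 0.8154648542404175),
    (3, 0.8154648542404175),
    (4, 0.8154648542404175),
    (5, 0.8154648542404175),
    (6, 1.0),
    (7, 1.0),
    (8, 1.0),
    (9, 1.0),
    (10, 1.0),
]

best_step, actions = replay_best_checkpoint(ndcg_at_10)
print(best_step)
for step, action in actions:
    print(f"{action} checkpoint model.ckpt-{step}")
```

With these values the replay keeps checkpoints 1, 2, and 6 and ends with step 6 as the best, matching the global_step = 6 reported in the final test evaluation.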

Checklist

[ ] My code follows the style guidelines of this project
[ ] I have performed a self-review of my own code
[ ] I have commented my code, particularly in hard-to-understand areas
[ ] I have made corresponding changes to the documentation
[ ] My changes generate no new warnings
[ ] I have added tests that prove my fix is effective or that my feature works
[ ] New and existing unit tests pass locally with my changes
[ ] Any dependent changes have been merged and published in downstream modules

linkedin / detext

Changing throttle_secs to be 0 to fix small data eval issue. #27
