Closed brandondutra closed 7 years ago
I wonder if that's actually because the checkpoint interval is 1 sec, and eval is done after every checkpoint.
Ah, I increased it and evaluation became more reasonable. It's still hard to infer from the log what evaluation did.
Suggested message change:
INFO: Global steps: 1; Evaluation metric: 0.763
INFO: Evaluation Log: Global step: 1; Evaluation metric: 0.763; Evaluation time: xx; Evaluation throughput: xxx; Duration xxx;
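To make the proposal concrete, here is a rough sketch of how the longer message could be assembled. The function name `format_eval_log`, its parameters, and the units (seconds, instances/sec) are my assumptions for illustration, not existing tensorfx code.

```python
def format_eval_log(global_step, metric, eval_secs, throughput, duration_secs):
    """Build the proposed evaluation log message.

    Hypothetical helper: the name, parameters, and units are assumptions.
    """
    return (
        "Evaluation Log: Global step: %d; Evaluation metric: %.3f; "
        "Evaluation time: %.2f sec; Evaluation throughput: %.1f instances/sec; "
        "Duration: %d sec;"
        % (global_step, metric, eval_secs, throughput, duration_secs)
    )


# Example call with made-up timing values, only to show the shape of the line.
message = format_eval_log(1, 0.763, 2.0, 8000.0, 3)
print("INFO: " + message)
```

Keeping the step and metric first means anything that already scans for "Evaluation metric" would still find it.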
When batch size is larger than the eval set, I expect eval to run very quickly and take 1 step. This is not the case. Below I set batch size to 50000. The first step took 1.35 secs and had a throughput of 26322.5 instances/sec, so the first train step did about 35535.375 instances (1.35 × 26322.5). This number is close to the number of instances in the train file (32561), but far from the target batch size. I don't think the first train step has 50000 instances.
I'm not worried by this behavior, but the '--batch-size' help string should be updated to say what happens in this case.
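For reference, the estimate for the first train step can be reproduced from the quoted numbers. This is only arithmetic on the figures in this comment. It assumes the reported throughput covers exactly that one step.

```python
# Figures quoted in this comment for the first train step.
step_seconds = 1.35
throughput = 26322.5           # instances/sec, as reported in the log
train_file_instances = 32561   # instances in train.csv
batch_size = 50000             # value passed to --batch-size

# Assumption: instances processed = step time * reported throughput.
estimated_instances = step_seconds * throughput

print("estimated instances in first step: %.3f" % estimated_instances)
print("excess over train file size: %.3f" % (estimated_instances - train_file_instances))
print("shortfall from --batch-size: %.3f" % (batch_size - estimated_instances))
```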
Next, evaluation is run. I expect eval to run for 1 step and be very fast, but it gets called 99 times and takes about 2 seconds per step.
Launching training task...
python -m census.train --data-train samples/census/data/train.csv --data-eval samples/census/data/eval.csv --data-schema samples/census/data/schema.yaml --data-metadata samples/census/data/metadata.json --data-features samples/census/features.yaml --log-level-tensorflow ERROR --log-level INFO --batch-size 50000 --max-steps 500 --checkpoint-interval-secs 1 --hidden-layers:1 200 --hidden-layers:2 100 --hidden-layers:3 20 --job_dir /usr/local/google/home/brandondutra/tensorfx/tout
INFO: Run: 1.40 sec; Steps: 1; Duration: 1 sec; Throughput: 25948.8 instances/sec
INFO: Global steps: 1; Evaluation metric: 0.764
INFO: Global steps: 3; Evaluation metric: 0.764
INFO: Global steps: 5; Evaluation metric: 0.764
INFO: Global steps: 7; Evaluation metric: 0.764
INFO: Global steps: 9; Evaluation metric: 0.764
INFO: Global steps: 11; Evaluation metric: 0.764
INFO: Global steps: 13; Evaluation metric: 0.764
INFO: Global steps: 15; Evaluation metric: 0.764
INFO: Global steps: 17; Evaluation metric: 0.764
INFO: Global steps: 19; Evaluation metric: 0.764
INFO: Global steps: 21; Evaluation metric: 0.764
INFO: Global steps: 23; Evaluation metric: 0.764
INFO: Global steps: 25; Evaluation metric: 0.764
INFO: Global steps: 27; Evaluation metric: 0.764
INFO: Global steps: 29; Evaluation metric: 0.764
INFO: Global steps: 31; Evaluation metric: 0.764
INFO: Global steps: 33; Evaluation metric: 0.764
INFO: Global steps: 35; Evaluation metric: 0.764
INFO: Global steps: 37; Evaluation metric: 0.764
INFO: Global steps: 39; Evaluation metric: 0.764
INFO: Global steps: 41; Evaluation metric: 0.764
INFO: Global steps: 43; Evaluation metric: 0.764
INFO: Global steps: 45; Evaluation metric: 0.764
INFO: Global steps: 47; Evaluation metric: 0.764
INFO: Global steps: 49; Evaluation metric: 0.764
INFO: Global steps: 51; Evaluation metric: 0.764
INFO: Global steps: 53; Evaluation metric: 0.764
INFO: Global steps: 55; Evaluation metric: 0.764
INFO: Global steps: 57; Evaluation metric: 0.764
INFO: Global steps: 59; Evaluation metric: 0.764
INFO: Global steps: 61; Evaluation metric: 0.764
INFO: Global steps: 63; Evaluation metric: 0.764
INFO: Global steps: 65; Evaluation metric: 0.764
INFO: Global steps: 67; Evaluation metric: 0.764
INFO: Global steps: 69; Evaluation metric: 0.764
INFO: Global steps: 71; Evaluation metric: 0.764
INFO: Global steps: 73; Evaluation metric: 0.764
INFO: Global steps: 75; Evaluation metric: 0.764
INFO: Global steps: 77; Evaluation metric: 0.764
INFO: Global steps: 79; Evaluation metric: 0.764
INFO: Global steps: 81; Evaluation metric: 0.764
INFO: Global steps: 83; Evaluation metric: 0.764
INFO: Global steps: 85; Evaluation metric: 0.764
INFO: Global steps: 87; Evaluation metric: 0.764
INFO: Global steps: 89; Evaluation metric: 0.764
INFO: Global steps: 91; Evaluation metric: 0.764
INFO: Global steps: 93; Evaluation metric: 0.764
INFO: Global steps: 95; Evaluation metric: 0.764
INFO: Global steps: 97; Evaluation metric: 0.764
INFO: Global steps: 99; Evaluation metric: 0.764
INFO: Run: 2.08 sec; Steps: 100; Duration: 499 sec; Throughput: 10014.4 instances/sec
INFO: Global steps: 100; Duration: 499 sec; Throughput: 10019.6 instances/sec; Loss: 0.533
INFO: Global steps: 101; Evaluation metric: 0.764
INFO: Global steps: 103; Evaluation metric: 0.764
^C
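Since the log alone makes it hard to tell how often evaluation ran, a small script along these lines can tally the evaluation entries. It is a sketch that assumes the "Global steps: N; Evaluation metric: X" line format shown in this output. The sample text is a short excerpt of that log, and `evaluation_entries` is a name I made up.

```python
import re

# Short excerpt of the log output from this run; paste the full output to
# analyze the whole thing.
log_text = """\
INFO: Run: 1.40 sec; Steps: 1; Duration: 1 sec; Throughput: 25948.8 instances/sec
INFO: Global steps: 1; Evaluation metric: 0.764
INFO: Global steps: 3; Evaluation metric: 0.764
INFO: Global steps: 5; Evaluation metric: 0.764
INFO: Global steps: 97; Evaluation metric: 0.764
INFO: Global steps: 99; Evaluation metric: 0.764
INFO: Run: 2.08 sec; Steps: 100; Duration: 499 sec; Throughput: 10014.4 instances/sec
INFO: Global steps: 100; Duration: 499 sec; Throughput: 10019.6 instances/sec; Loss: 0.533
"""

# Matches only evaluation lines; the training summary at step 100 has
# "Duration" after the step number, so it is skipped.
EVAL_LINE = re.compile(r"Global steps: (\d+); Evaluation metric: ([0-9.]+)")


def evaluation_entries(text):
    """Return (global_step, metric) pairs for each evaluation line in text."""
    return [(int(m.group(1)), float(m.group(2))) for m in EVAL_LINE.finditer(text)]


evals = evaluation_entries(log_text)
print("evaluation entries:", len(evals))
print("steps evaluated:", [step for step, _ in evals])
```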