Closed rlan closed 6 years ago
I'm getting zeros for accuracy while the loss is decreasing.
# python train.py --data_dir ./data --logdir ./logs_train_0514_run2
Start training
train.py:99: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  datetime.now(), step, loss.data[0], learning_rate, examples_per_sec)
=> 2018-05-14 10:08:17.556565: step 100, loss = 7.605348, learning_rate = 0.010000 (507.5 examples/sec)
=> 2018-05-14 10:08:23.371309: step 200, loss = 6.634058, learning_rate = 0.010000 (556.2 examples/sec)
=> 2018-05-14 10:08:29.204335: step 300, loss = 6.444423, learning_rate = 0.010000 (554.7 examples/sec)
=> 2018-05-14 10:08:35.037947: step 400, loss = 6.654078, learning_rate = 0.010000 (554.6 examples/sec)
=> 2018-05-14 10:08:40.876440: step 500, loss = 6.415401, learning_rate = 0.010000 (554.1 examples/sec)
=> 2018-05-14 10:08:46.724192: step 600, loss = 6.980000, learning_rate = 0.010000 (553.6 examples/sec)
=> 2018-05-14 10:08:52.578867: step 700, loss = 7.336755, learning_rate = 0.010000 (552.7 examples/sec)
=> 2018-05-14 10:08:58.457534: step 800, loss = 6.166699, learning_rate = 0.010000 (550.8 examples/sec)
=> 2018-05-14 10:09:04.360389: step 900, loss = 6.186161, learning_rate = 0.010000 (547.6 examples/sec)
=> 2018-05-14 10:09:10.669834: step 1000, loss = 6.420802, learning_rate = 0.010000 (512.7 examples/sec)
=> Evaluating on validation dataset...
/notebooks/evaluator.py:16: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  images, length_labels, digits_labels = (Variable(images.cuda(), volatile=True),
==> accuracy = 0.000000, best accuracy 0.000000
=> patience = 99
=> 2018-05-14 10:09:31.108286: step 1100, loss = 5.796063, learning_rate = 0.010000 (524.8 examples/sec)
=> 2018-05-14 10:09:37.420145: step 1200, loss = 5.399920, learning_rate = 0.010000 (512.9 examples/sec)
=> 2018-05-14 10:09:43.742597: step 1300, loss = 5.895159, learning_rate = 0.010000 (512.1 examples/sec)
Any ideas?
Fixed in my fork. I'm running PyTorch 0.4.0.
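The thread does not say what the fork changed, so the following is only a sketch of the two changes the warnings in the log ask for: `torch.no_grad()` in place of `volatile=True`, and `.item()` in place of indexing a 0-dim tensor. It also converts the correct count to a Python number before dividing; integer tensor division that truncates to zero is one possible cause of a constant 0.0 accuracy on older PyTorch versions, but that is an assumption, not something confirmed here. The `evaluate` function, the stand-in model, and the toy data are all hypothetical and are not taken from the repository.

```python
import torch


def evaluate(model, batches):
    """Return accuracy as a Python float over (images, labels) batches."""
    num_correct = 0
    num_examples = 0
    model.eval()
    # torch.no_grad() replaces the removed `volatile=True` flag.
    with torch.no_grad():
        for images, labels in batches:
            logits = model(images)
            predictions = logits.argmax(dim=1)
            # .item() turns the 0-dim count tensor into a Python int,
            # so the division below is ordinary Python division.
            num_correct += (predictions == labels).sum().item()
            num_examples += labels.size(0)
    return num_correct / num_examples


# Hypothetical stand-in model: a linear layer forced to always predict class 1.
model = torch.nn.Linear(4, 3)
with torch.no_grad():
    model.weight.zero_()
    model.bias.copy_(torch.tensor([0.0, 1.0, 0.0]))

images = torch.zeros(4, 4)
labels = torch.tensor([1, 1, 0, 1])

accuracy = evaluate(model, [(images, labels)])

# For logging, loss.item() replaces the deprecated loss.data[0].
loss = torch.nn.functional.cross_entropy(model(images), labels)
loss_value = loss.item()
print("loss = %f, accuracy = %f" % (loss_value, accuracy))
```

With this toy data three of the four labels match the constant prediction, so the function returns a nonzero Python float rather than a tensor.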