VisualComputingInstitute / triplet-reid

Code for reproducing the results of our "In Defense of the Triplet Loss for Person Re-Identification" paper.
https://arxiv.org/abs/1703.07737
MIT License

Loss doesn't go down lower than 0.7 for market1501 dataset #91

Open mazatov opened 4 years ago

mazatov commented 4 years ago

I'm trying to train on Market-1501 as a proof of concept before modifying anything. The training loss quickly goes down to 0.7 and stays there forever. So far I haven't changed anything in the script, except setting batch_p = 16 because I was running out of memory on my computer. Any ideas on what I might be doing wrong?

Colocations handled automatically by placer.
2020-02-25 16:08:37,651 [WARNING] tensorflow: From C:\Users\mazat\Anaconda3\envs\tf_gpu\lib\site-packages\tensorflow\python\ops\math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
2020-02-25 16:08:37,651 [WARNING] tensorflow: From C:\Users\mazat\Anaconda3\envs\tf_gpu\lib\site-packages\tensorflow\python\ops\math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
2020-02-25 16:08:37,665 [WARNING] tensorflow: From C:\Users\mazat\Anaconda3\envs\tf_gpu\lib\site-packages\tensorflow\python\ops\math_grad.py:102: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
2020-02-25 16:08:37,665 [WARNING] tensorflow: From C:\Users\mazat\Anaconda3\envs\tf_gpu\lib\site-packages\tensorflow\python\ops\math_grad.py:102: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
2020-02-25 16:08:46,499 [INFO] tensorflow: experiments\market_train\checkpoint-0 is not in all_model_checkpoint_paths. Manually adding it.
2020-02-25 16:08:46,499 [INFO] tensorflow: experiments\market_train\checkpoint-0 is not in all_model_checkpoint_paths. Manually adding it.
2020-02-25 16:09:02,141 [INFO] train: Starting training from iteration 0.
2020-02-25 16:09:09,892 [INFO] train: iter:     1, loss min|avg|max: 0.865|4.333|17.520, batch-p@3: 8.33%, ETA: 2 days, 5:48:35 (7.75s/it)
2020-02-25 16:09:10,247 [INFO] train: iter:     2, loss min|avg|max: 0.872|2.155|11.571, batch-p@3: 13.02%, ETA: 2:25:25 (0.35s/it)
2020-02-25 16:09:10,604 [INFO] train: iter:     3, loss min|avg|max: 0.886|2.971|11.878, batch-p@3: 6.25%, ETA: 2:27:55 (0.36s/it)
2020-02-25 16:09:10,963 [INFO] train: iter:     4, loss min|avg|max: 0.878|1.741| 5.409, batch-p@3: 11.46%, ETA: 2:28:15 (0.36s/it)
2020-02-25 16:09:11,322 [INFO] train: iter:     5, loss min|avg|max: 0.803|1.908| 6.864, batch-p@3: 12.50%, ETA: 2:28:21 (0.36s/it)
2020-02-25 16:09:11,678 [INFO] train: iter:     6, loss min|avg|max: 0.820|1.692| 5.620, batch-p@3: 9.38%, ETA: 2:27:01 (0.35s/it)
2020-02-25 16:09:12,030 [INFO] train: iter:     7, loss min|avg|max: 0.854|1.858| 8.326, batch-p@3: 7.29%, ETA: 2:25:25 (0.35s/it)
2020-02-25 16:09:12,385 [INFO] train: iter:     8, loss min|avg|max: 0.845|1.461| 3.330, batch-p@3: 9.90%, ETA: 2:26:38 (0.35s/it)
2020-02-25 16:09:12,744 [INFO] train: iter:     9, loss min|avg|max: 0.802|1.675| 8.456, batch-p@3: 5.21%, ETA: 2:28:42 (0.36s/it)
2020-02-25 16:09:13,104 [INFO] train: iter:    10, loss min|avg|max: 0.798|1.475| 3.716, batch-p@3: 7.29%, ETA: 2:29:01 (0.36s/it)
2020-02-25 16:09:13,455 [INFO] train: iter:    11, loss min|avg|max: 0.764|1.357| 2.830, batch-p@3: 10.94%, ETA: 2:25:00 (0.35s/it)
2020-02-25 16:09:13,817 [INFO] train: iter:    12, loss min|avg|max: 0.786|1.250| 3.432, batch-p@3: 9.90%, ETA: 2:28:41 (0.36s/it)
2020-02-25 16:09:14,172 [INFO] train: iter:    13, loss min|avg|max: 0.668|1.259| 2.927, batch-p@3: 7.81%, ETA: 2:26:36 (0.35s/it)
2020-02-25 16:09:14,519 [INFO] train: iter:    14, loss min|avg|max: 0.643|1.094| 4.106, batch-p@3: 18.75%, ETA: 2:23:16 (0.34s/it)
2020-02-25 16:09:14,878 [INFO] train: iter:    15, loss min|avg|max: 0.785|1.343| 4.559, batch-p@3: 15.10%, ETA: 2:28:03 (0.36s/it)
2020-02-25 16:09:15,234 [INFO] train: iter:    16, loss min|avg|max: 0.731|1.216| 4.446, batch-p@3: 11.98%, ETA: 2:26:59 (0.35s/it)
2020-02-25 16:09:15,592 [INFO] train: iter:    17, loss min|avg|max: 0.766|1.130| 4.649, batch-p@3: 10.94%, ETA: 2:28:17 (0.36s/it)
2020-02-25 16:09:15,947 [INFO] train: iter:    18, loss min|avg|max: 0.748|1.143| 2.784, batch-p@3: 14.06%, ETA: 2:26:52 (0.35s/it)
2020-02-25 16:09:16,304 [INFO] train: iter:    19, loss min|avg|max: 0.704|1.067| 2.621, batch-p@3: 9.38%, ETA: 2:27:26 (0.35s/it)
2020-02-25 16:09:16,690 [INFO] train: iter:    20, loss min|avg|max: 0.783|1.122| 2.887, batch-p@3: 8.85%, ETA: 2:38:55 (0.38s/it)
2020-02-25 16:09:17,052 [INFO] train: iter:    21, loss min|avg|max: 0.754|1.062| 3.799, batch-p@3: 8.33%, ETA: 2:29:52 (0.36s/it)
2020-02-25 16:09:17,414 [INFO] train: iter:    22, loss min|avg|max: 0.748|1.123| 1.990, batch-p@3: 10.94%, ETA: 2:29:28 (0.36s/it)
2020-02-25 16:09:17,795 [INFO] train: iter:    23, loss min|avg|max: 0.736|0.985| 1.747, batch-p@3: 11.46%, ETA: 2:37:45 (0.38s/it)
2020-02-25 16:09:18,155 [INFO] train: iter:    24, loss min|avg|max: 0.742|1.086| 6.032, batch-p@3: 11.98%, ETA: 2:28:37 (0.36s/it)
2020-02-25 16:09:18,518 [INFO] train: iter:    25, loss min|avg|max: 0.719|1.022| 1.805, batch-p@3: 9.90%, ETA: 2:29:44 (0.36s/it)
2020-02-25 16:09:18,871 [INFO] train: iter:    26, loss min|avg|max: 0.741|1.071| 2.763, batch-p@3: 10.94%, ETA: 2:26:08 (0.35s/it)
2020-02-25 16:09:19,227 [INFO] train: iter:    27, loss min|avg|max: 0.715|0.953| 2.764, batch-p@3: 11.46%, ETA: 2:26:58 (0.35s/it)
2020-02-25 16:09:19,585 [INFO] train: iter:    28, loss min|avg|max: 0.711|0.932| 2.323, batch-p@3: 11.98%, ETA: 2:27:46 (0.36s/it)
2020-02-25 16:09:19,941 [INFO] train: iter:    29, loss min|avg|max: 0.740|1.007| 2.782, batch-p@3: 8.85%, ETA: 2:26:56 (0.35s/it)
2020-02-25 16:09:20,295 [INFO] train: iter:    30, loss min|avg|max: 0.736|0.973| 3.527, batch-p@3: 16.15%, ETA: 2:26:17 (0.35s/it)
2020-02-25 16:09:20,651 [INFO] train: iter:    31, loss min|avg|max: 0.720|0.993| 2.995, batch-p@3: 16.15%, ETA: 2:26:59 (0.35s/it)
2020-02-25 16:09:21,009 [INFO] train: iter:    32, loss min|avg|max: 0.728|1.068| 3.389, batch-p@3: 14.58%, ETA: 2:27:42 (0.35s/it)
2020-02-25 16:09:21,364 [INFO] train: iter:    33, loss min|avg|max: 0.735|0.901| 1.411, batch-p@3: 13.02%, ETA: 2:27:19 (0.35s/it)
2020-02-25 16:09:21,719 [INFO] train: iter:    34, loss min|avg|max: 0.728|0.884| 1.307, batch-p@3: 6.77%, ETA: 2:26:16 (0.35s/it)
2020-02-25 16:09:22,080 [INFO] train: iter:    35, loss min|avg|max: 0.718|0.905| 1.174, batch-p@3: 8.85%, ETA: 2:28:59 (0.36s/it)
2020-02-25 16:09:22,439 [INFO] train: iter:    36, loss min|avg|max: 0.720|0.931| 1.618, batch-p@3: 12.50%, ETA: 2:27:29 (0.35s/it)
2020-02-25 16:09:22,795 [INFO] train: iter:    37, loss min|avg|max: 0.731|0.982| 3.842, batch-p@3: 12.50%, ETA: 2:27:17 (0.35s/it)
2020-02-25 16:09:23,154 [INFO] train: iter:    38, loss min|avg|max: 0.734|0.879| 2.909, batch-p@3: 14.06%, ETA: 2:28:08 (0.36s/it)
2020-02-25 16:09:23,507 [INFO] train: iter:    39, loss min|avg|max: 0.699|0.912| 2.311, batch-p@3: 15.10%, ETA: 2:25:40 (0.35s/it)
2020-02-25 16:09:23,884 [INFO] train: iter:    40, loss min|avg|max: 0.697|0.910| 1.829, batch-p@3: 15.62%, ETA: 2:35:33 (0.37s/it)
2020-02-25 16:09:24,287 [INFO] train: iter:    41, loss min|avg|max: 0.721|0.864| 1.853, batch-p@3: 16.67%, ETA: 2:46:49 (0.40s/it)
2020-02-25 16:09:24,664 [INFO] train: iter:    42, loss min|avg|max: 0.729|0.861| 1.211, batch-p@3: 11.46%, ETA: 2:35:34 (0.37s/it)
2020-02-25 16:09:25,047 [INFO] train: iter:    43, loss min|avg|max: 0.727|0.911| 1.790, batch-p@3: 8.33%, ETA: 2:38:28 (0.38s/it)
2020-02-25 16:09:25,417 [INFO] train: iter:    44, loss min|avg|max: 0.715|0.839| 1.201, batch-p@3: 17.19%, ETA: 2:33:04 (0.37s/it)
2020-02-25 16:09:25,793 [INFO] train: iter:    45, loss min|avg|max: 0.755|0.902| 1.494, batch-p@3: 10.94%, ETA: 2:35:27 (0.37s/it)
2020-02-25 16:09:26,165 [INFO] train: iter:    46, loss min|avg|max: 0.700|0.866| 1.410, batch-p@3: 11.46%, ETA: 2:34:07 (0.37s/it)
2020-02-25 16:09:26,531 [INFO] train: iter:    47, loss min|avg|max: 0.714|0.811| 1.563, batch-p@3: 12.50%, ETA: 2:31:06 (0.36s/it)
2020-02-25 16:09:26,892 [INFO] train: iter:    48, loss min|avg|max: 0.650|0.792| 1.180, batch-p@3: 13.54%, ETA: 2:28:54 (0.36s/it)
2020-02-25 16:09:27,245 [INFO] train: iter:    49, loss min|avg|max: 0.685|0.848| 1.405, batch-p@3: 11.98%, ETA: 2:25:37 (0.35s/it)
2020-02-25 16:09:27,605 [INFO] train: iter:    50, loss min|avg|max: 0.710|0.901| 1.537, batch-p@3: 10.42%, ETA: 2:28:03 (0.36s/it)
2020-02-25 16:09:27,960 [INFO] train: iter:    51, loss min|avg|max: 0.720|0.840| 1.334, batch-p@3: 11.46%, ETA: 2:26:00 (0.35s/it)
2020-02-25 16:09:28,316 [INFO] train: iter:    52, loss min|avg|max: 0.705|0.815| 1.043, batch-p@3: 18.23%, ETA: 2:27:15 (0.35s/it)
2020-02-25 16:09:28,674 [INFO] train: iter:    53, loss min|avg|max: 0.711|0.847| 1.636, batch-p@3: 14.58%, ETA: 2:27:22 (0.35s/it)
2020-02-25 16:09:29,034 [INFO] train: iter:    54, loss min|avg|max: 0.733|0.883| 1.860, batch-p@3: 4.69%, ETA: 2:28:26 (0.36s/it)
2020-02-25 16:09:29,405 [INFO] train: iter:    55, loss min|avg|max: 0.704|0.830| 1.334, batch-p@3: 11.46%, ETA: 2:33:08 (0.37s/it)
2020-02-25 16:09:29,771 [INFO] train: iter:    56, loss min|avg|max: 0.734|0.841| 1.506, batch-p@3: 9.90%, ETA: 2:30:49 (0.36s/it)
2020-02-25 16:09:30,126 [INFO] train: iter:    57, loss min|avg|max: 0.707|0.845| 1.865, batch-p@3: 7.81%, ETA: 2:26:48 (0.35s/it)
2020-02-25 16:09:30,497 [INFO] train: iter:    58, loss min|avg|max: 0.718|0.854| 1.185, batch-p@3: 11.98%, ETA: 2:32:59 (0.37s/it)
2020-02-25 16:09:30,863 [INFO] train: iter:    59, loss min|avg|max: 0.718|0.800| 1.581, batch-p@3: 8.85%, ETA: 2:30:54 (0.36s/it)
2020-02-25 16:09:31,252 [INFO] train: iter:    60, loss min|avg|max: 0.728|0.820| 1.343, batch-p@3: 7.29%, ETA: 2:33:10 (0.37s/it)
2020-02-25 16:09:31,607 [INFO] train: iter:    61, loss min|avg|max: 0.725|0.796| 1.221, batch-p@3: 7.81%, ETA: 2:27:21 (0.35s/it)
2020-02-25 16:09:31,963 [INFO] train: iter:    62, loss min|avg|max: 0.704|0.768| 1.018, batch-p@3: 13.02%, ETA: 2:27:07 (0.35s/it)
2020-02-25 16:09:32,317 [INFO] train: iter:    63, loss min|avg|max: 0.686|0.807| 1.369, batch-p@3: 16.15%, ETA: 2:26:21 (0.35s/it)
2020-02-25 16:09:32,675 [INFO] train: iter:    64, loss min|avg|max: 0.724|0.827| 1.182, batch-p@3: 6.25%, ETA: 2:27:33 (0.36s/it)
2020-02-25 16:09:33,036 [INFO] train: iter:    65, loss min|avg|max: 0.719|0.785| 1.116, batch-p@3: 8.85%, ETA: 2:29:19 (0.36s/it)
2020-02-25 16:09:33,414 [INFO] train: iter:    66, loss min|avg|max: 0.712|0.801| 1.183, batch-p@3: 11.98%, ETA: 2:36:10 (0.38s/it)
2020-02-25 16:09:33,814 [INFO] train: iter:    67, loss min|avg|max: 0.723|0.800| 1.365, batch-p@3: 8.85%, ETA: 2:45:05 (0.40s/it)
2020-02-25 16:09:34,181 [INFO] train: iter:    68, loss min|avg|max: 0.703|0.781| 1.249, batch-p@3: 12.50%, ETA: 2:31:15 (0.36s/it)
2020-02-25 16:09:34,551 [INFO] train: iter:    69, loss min|avg|max: 0.711|0.800| 1.218, batch-p@3: 13.54%, ETA: 2:32:55 (0.37s/it)
2020-02-25 16:09:34,920 [INFO] train: iter:    70, loss min|avg|max: 0.722|0.809| 1.138, batch-p@3: 11.98%, ETA: 2:32:59 (0.37s/it)
2020-02-25 16:09:35,296 [INFO] train: iter:    71, loss min|avg|max: 0.716|0.796| 1.108, batch-p@3: 10.94%, ETA: 2:35:18 (0.37s/it)
2020-02-25 16:09:35,665 [INFO] train: iter:    72, loss min|avg|max: 0.688|0.784| 1.115, batch-p@3: 16.15%, ETA: 2:32:34 (0.37s/it)
2020-02-25 16:09:36,024 [INFO] train: iter:    73, loss min|avg|max: 0.719|0.800| 1.315, batch-p@3: 10.94%, ETA: 2:28:35 (0.36s/it)
2020-02-25 16:09:36,380 [INFO] train: iter:    74, loss min|avg|max: 0.714|0.792| 1.027, batch-p@3: 14.06%, ETA: 2:27:07 (0.35s/it)
2020-02-25 16:09:36,751 [INFO] train: iter:    75, loss min|avg|max: 0.710|0.778| 1.118, batch-p@3: 9.90%, ETA: 2:33:15 (0.37s/it)
2020-02-25 16:09:37,138 [INFO] train: iter:    76, loss min|avg|max: 0.699|0.776| 1.313, batch-p@3: 11.98%, ETA: 2:40:23 (0.39s/it)
2020-02-25 16:09:37,504 [INFO] train: iter:    77, loss min|avg|max: 0.721|0.809| 1.144, batch-p@3: 12.50%, ETA: 2:30:51 (0.36s/it)
2020-02-25 16:09:37,866 [INFO] train: iter:    78, loss min|avg|max: 0.729|0.801| 0.942, batch-p@3: 11.46%, ETA: 2:29:15 (0.36s/it)
2020-02-25 16:09:38,225 [INFO] train: iter:    79, loss min|avg|max: 0.697|0.782| 0.967, batch-p@3: 15.10%, ETA: 2:28:17 (0.36s/it)
2020-02-25 16:09:38,586 [INFO] train: iter:    80, loss min|avg|max: 0.710|0.815| 1.793, batch-p@3: 9.38%, ETA: 2:29:01 (0.36s/it)
2020-02-25 16:09:38,941 [INFO] train: iter:    81, loss min|avg|max: 0.707|0.826| 1.555, batch-p@3: 10.94%, ETA: 2:26:38 (0.35s/it)

And lots of iterations later:

2020-02-25 19:33:32,859 [INFO] train: iter:  8898, loss min|avg|max: 0.415|0.687| 1.047, batch-p@3: 56.25%, ETA: 1:37:59 (0.37s/it)
2020-02-25 19:33:33,222 [INFO] train: iter:  8899, loss min|avg|max: 0.656|0.787| 1.747, batch-p@3: 55.73%, ETA: 1:36:35 (0.36s/it)
2020-02-25 19:33:33,579 [INFO] train: iter:  8900, loss min|avg|max: 0.691|0.737| 1.346, batch-p@3: 54.17%, ETA: 1:34:59 (0.35s/it)
2020-02-25 19:33:33,940 [INFO] train: iter:  8901, loss min|avg|max: 0.691|0.826| 2.345, batch-p@3: 42.19%, ETA: 1:36:04 (0.36s/it)
2020-02-25 19:33:34,302 [INFO] train: iter:  8902, loss min|avg|max: 0.638|0.717| 0.965, batch-p@3: 58.33%, ETA: 1:36:22 (0.36s/it)
2020-02-25 19:33:34,670 [INFO] train: iter:  8903, loss min|avg|max: 0.353|0.677| 0.728, batch-p@3: 53.65%, ETA: 1:37:55 (0.36s/it)
2020-02-25 19:33:35,031 [INFO] train: iter:  8904, loss min|avg|max: 0.691|0.745| 1.200, batch-p@3: 54.69%, ETA: 1:36:09 (0.36s/it)
2020-02-25 19:33:35,397 [INFO] train: iter:  8905, loss min|avg|max: 0.693|0.777| 3.160, batch-p@3: 43.23%, ETA: 1:37:22 (0.36s/it)
2020-02-25 19:33:35,757 [INFO] train: iter:  8906, loss min|avg|max: 0.692|0.925| 4.062, batch-p@3: 37.50%, ETA: 1:36:02 (0.36s/it)
2020-02-25 19:33:36,120 [INFO] train: iter:  8907, loss min|avg|max: 0.692|0.804| 1.711, batch-p@3: 46.88%, ETA: 1:36:18 (0.36s/it)
2020-02-25 19:33:36,483 [INFO] train: iter:  8908, loss min|avg|max: 0.378|0.677| 0.697, batch-p@3: 72.40%, ETA: 1:36:34 (0.36s/it)
2020-02-25 19:33:36,878 [INFO] train: iter:  8909, loss min|avg|max: 0.693|0.744| 3.838, batch-p@3: 57.81%, ETA: 1:45:06 (0.39s/it)
2020-02-25 19:33:37,249 [INFO] train: iter:  8910, loss min|avg|max: 0.689|0.846| 3.782, batch-p@3: 59.38%, ETA: 1:38:25 (0.37s/it)
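As an aside on the plateau value itself: with the default `batch_hard` loss and `margin: soft` (both visible in the parameter dump below), the per-anchor loss is softplus of (hardest-positive distance minus hardest-negative distance). If the embeddings collapse so that those two distances are equal, the loss is ln(1 + e^0) = ln 2 ≈ 0.693, which matches the ~0.69-0.7 floor in the log above. A quick check of the arithmetic (plain Python, only illustrating the soft-margin formula, not the repo's TF code):

```python
import math

def softplus(x):
    """Numerically stable softplus, ln(1 + e^x)."""
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)

# Batch-hard triplet loss with soft margin, for one anchor:
#   softplus(d(anchor, hardest positive) - d(anchor, hardest negative))
# When the embedding collapses, both distances are equal, so the argument is 0.
collapsed_loss = softplus(0.0)
print(f"ln 2 = {collapsed_loss:.4f}")  # ~0.6931, the plateau seen in the log
```

So a loss pinned just under 0.7 is the signature of embeddings that carry no discriminative information yet.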
Pandoro commented 4 years ago

This shouldn't be the case. My best guess is that you did not load a pretrained network. I just checked some of my logs, and the average loss typically doesn't go over 0.7 after a few hundred iterations. Not loading the correct checkpoint is the only thing I can think of right now.
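For reference, the pretrained weights are passed to `train.py` through its `--initial_checkpoint` flag (the flag appears in the parameter dump later in this thread). A hedged sketch of the intended workflow, assuming the TF-slim ImageNet ResNet-50 v1 checkpoint and local paths matching the logs above (verify the download URL against the repository README):

```shell
# Fetch the TF-slim ImageNet checkpoint for resnet_v1_50
# (URL assumed from the TF-slim model zoo; check the README for the exact link).
wget http://download.tensorflow.org/models/resnet_v1_50_2016_08_28.tar.gz
tar -xzf resnet_v1_50_2016_08_28.tar.gz   # yields resnet_v1_50.ckpt

# Start training from those weights; other flags as in the original run.
python train.py \
    --train_set data/market1501/market1501_train.csv \
    --image_root data/market1501 \
    --experiment_root experiments/market_train \
    --initial_checkpoint resnet_v1_50.ckpt \
    --batch_p 16
```

Without `--initial_checkpoint`, the ResNet starts from random weights, and batch-hard triplet training on Market-1501 alone tends to stall at the collapsed-embedding loss floor.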


mazatov commented 4 years ago

Thanks @Pandoro for the prompt reply! 👍

You are correct, I didn't do anything special to load a pretrained network. Did you mean loading the pretrained resnet_v1_50? If it's not loaded automatically, where do I specify the weights and where should I put the checkpoint file?

Or did you mean starting from the checkpoint you provide? In that case I immediately get good results, but I'd like to figure out how to train it from scratch.

2020-02-26 09:39:50,133 [INFO] train: Training using the following parameters:
2020-02-26 09:39:50,133 [INFO] train: batch_k: 4
2020-02-26 09:39:50,134 [INFO] train: batch_p: 16
2020-02-26 09:39:50,134 [INFO] train: checkpoint_frequency: 1000
2020-02-26 09:39:50,134 [INFO] train: crop_augment: False
2020-02-26 09:39:50,135 [INFO] train: decay_start_iteration: 15000
2020-02-26 09:39:50,135 [INFO] train: detailed_logs: False
2020-02-26 09:39:50,135 [INFO] train: embedding_dim: 128
2020-02-26 09:39:50,135 [INFO] train: experiment_root: experiments\official
2020-02-26 09:39:50,135 [INFO] train: flip_augment: False
2020-02-26 09:39:50,136 [INFO] train: head_name: fc1024
2020-02-26 09:39:50,136 [INFO] train: image_root: data\market1501
2020-02-26 09:39:50,136 [INFO] train: initial_checkpoint: checkpoint-25000
2020-02-26 09:39:50,136 [INFO] train: learning_rate: 0.0003
2020-02-26 09:39:50,137 [INFO] train: loading_threads: 8
2020-02-26 09:39:50,137 [INFO] train: loss: batch_hard
2020-02-26 09:39:50,137 [INFO] train: margin: soft
2020-02-26 09:39:50,137 [INFO] train: metric: euclidean
2020-02-26 09:39:50,137 [INFO] train: model_name: resnet_v1_50
2020-02-26 09:39:50,137 [INFO] train: net_input_height: 256
2020-02-26 09:39:50,138 [INFO] train: net_input_width: 128
2020-02-26 09:39:50,138 [INFO] train: pre_crop_height: 288
2020-02-26 09:39:50,138 [INFO] train: pre_crop_width: 144
2020-02-26 09:39:50,138 [INFO] train: resume: True
2020-02-26 09:39:50,138 [INFO] train: train_iterations: 30000
2020-02-26 09:39:50,138 [INFO] train: train_set: data\market1501\market1501_train.csv
2020-02-26 09:39:50,815 [WARNING] tensorflow: From train.py:250: unbatch (from tensorflow.contrib.data.python.ops.batching) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.experimental.unbatch()`.
2020-02-26 09:39:50,887 [INFO] tensorflow: Scale of 0 disables regularizer.
2020-02-26 09:39:50,899 [WARNING] tensorflow: From C:\Users\mazat\Anaconda3\envs\tf_gpu\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
2020-02-26 09:39:53,141 [WARNING] tensorflow: From C:\Users\mazat\Anaconda3\envs\tf_gpu\lib\site-packages\tensorflow\python\ops\math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
2020-02-26 09:39:53,154 [WARNING] tensorflow: From C:\Users\mazat\Anaconda3\envs\tf_gpu\lib\site-packages\tensorflow\python\ops\math_grad.py:102: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
2020-02-26 09:39:58,134 [INFO] train: Restoring from checkpoint: experiments\official\checkpoint-25000
2020-02-26 09:39:58,135 [WARNING] tensorflow: From C:\Users\mazat\Anaconda3\envs\tf_gpu\lib\site-packages\tensorflow\python\training\saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
2020-02-26 09:39:58,140 [INFO] tensorflow: Restoring parameters from experiments\official\checkpoint-25000
2020-02-26 09:40:09,902 [INFO] train: Starting training from iteration 25000.
2020-02-26 09:40:17,434 [INFO] train: iter: 25001, loss min|avg|max: 0.000|0.000| 0.009, batch-p@3: 100.00%, ETA: 10:27:19 (7.53s/it)
2020-02-26 09:40:17,786 [INFO] train: iter: 25002, loss min|avg|max: 0.000|0.002| 0.069, batch-p@3: 100.00%, ETA: 0:28:59 (0.35s/it)
2020-02-26 09:40:18,135 [INFO] train: iter: 25003, loss min|avg|max: 0.000|0.001| 0.027, batch-p@3: 100.00%, ETA: 0:28:49 (0.35s/it)
2020-02-26 09:40:18,484 [INFO] train: iter: 25004, loss min|avg|max: 0.000|0.000| 0.002, batch-p@3: 100.00%, ETA: 0:28:48 (0.35s/it)
2020-02-26 09:40:18,835 [INFO] train: iter: 25005, loss min|avg|max: 0.000|0.001| 0.014, batch-p@3: 100.00%, ETA: 0:29:03 (0.35s/it)
2020-02-26 09:40:19,191 [INFO] train: iter: 25006, loss min|avg|max: 0.000|0.000| 0.017, batch-p@3: 100.00%, ETA: 0:29:23 (0.35s/it)
2020-02-26 09:40:19,544 [INFO] train: iter: 25007, loss min|avg|max: 0.000|0.001| 0.049, batch-p@3: 100.00%, ETA: 0:29:02 (0.35s/it)
2020-02-26 09:40:19,901 [INFO] train: iter: 25008, loss min|avg|max: 0.000|0.000| 0.000, batch-p@3: 100.00%, ETA: 0:29:27 (0.35s/it)
2020-02-26 09:40:20,259 [INFO] train: iter: 25009, loss min|avg|max: 0.000|0.000| 0.001, batch-p@3: 100.00%, ETA: 0:29:32 (0.36s/it)
2020-02-26 09:40:20,611 [INFO] train: iter: 25010, loss min|avg|max: 0.000|0.001| 0.033, batch-p@3: 100.00%, ETA: 0:29:02 (0.35s/it)
2020-02-26 09:40:20,961 [INFO] train: iter: 25011, loss min|avg|max: 0.000|0.000| 0.002, batch-p@3: 100.00%, ETA: 0:28:47 (0.35s/it)
2020-02-26 09:40:21,309 [INFO] train: iter: 25012, loss min|avg|max: 0.000|0.000| 0.002, batch-p@3: 100.00%, ETA: 0:28:41 (0.35s/it)
2020-02-26 09:40:21,666 [INFO] train: iter: 25013, loss min|avg|max: 0.000|0.002| 0.101, batch-p@3: 100.00%, ETA: 0:29:25 (0.35s/it)
2020-02-26 09:40:22,015 [INFO] train: iter: 25014, loss min|avg|max: 0.000|0.001| 0.040, batch-p@3: 100.00%, ETA: 0:28:41 (0.35s/it)
2020-02-26 09:40:22,367 [INFO] train: iter: 25015, loss min|avg|max: 0.000|0.000| 0.015, batch-p@3: 100.00%, ETA: 0:28:59 (0.35s/it)
2020-02-26 09:40:22,716 [INFO] train: iter: 25016, loss min|avg|max: 0.000|0.000| 0.010, batch-p@3: 100.00%, ETA: 0:28:44 (0.35s/it)
2020-02-26 09:40:23,071 [INFO] train: iter: 25017, loss min|avg|max: 0.000|0.000| 0.002, batch-p@3: 100.00%, ETA: 0:29:14 (0.35s/it)
2020-02-26 09:40:23,432 [INFO] train: iter: 25018, loss min|avg|max: 0.000|0.000| 0.005, batch-p@3: 100.00%, ETA: 0:29:43 (0.36s/it)
2020-02-26 09:40:23,810 [INFO] train: iter: 25019, loss min|avg|max: 0.000|0.002| 0.043, batch-p@3: 100.00%, ETA: 0:31:07 (0.38s/it)
2020-02-26 09:40:24,167 [INFO] train: iter: 25020, loss min|avg|max: 0.000|0.019| 1.083, batch-p@3: 99.48%, ETA: 0:29:23 (0.35s/it)
2020-02-26 09:40:24,537 [INFO] train: iter: 25021, loss min|avg|max: 0.000|0.000| 0.011, batch-p@3: 100.00%, ETA: 0:30:24 (0.37s/it)
2020-02-26 09:40:24,887 [INFO] train: iter: 25022, loss min|avg|max: 0.000|0.000| 0.004, batch-p@3: 100.00%, ETA: 0:28:50 (0.35s/it)
2020-02-26 09:40:25,248 [INFO] train: iter: 25023, loss min|avg|max: 0.000|0.000| 0.003, batch-p@3: 100.00%, ETA: 0:29:41 (0.36s/it)
2020-02-26 09:40:25,600 [INFO] train: iter: 25024, loss min|avg|max: 0.000|0.000| 0.006, batch-p@3: 100.00%, ETA: 0:28:57 (0.35s/it)
2020-02-26 09:40:25,974 [INFO] train: iter: 25025, loss min|avg|max: 0.000|0.000| 0.009, batch-p@3: 100.00%, ETA: 0:30:46 (0.37s/it)
2020-02-26 09:40:26,380 [INFO] train: iter: 25026, loss min|avg|max: 0.000|0.262| 3.625, batch-p@3: 94.27%, ETA: 0:33:19 (0.40s/it)
2020-02-26 09:40:26,859 [INFO] train: iter: 25027, loss min|avg|max: 0.000|0.000| 0.009, batch-p@3: 100.00%, ETA: 0:39:31 (0.48s/it)
2020-02-26 09:40:27,258 [INFO] train: iter: 25028, loss min|avg|max: 0.000|0.000| 0.008, batch-p@3: 100.00%, ETA: 0:32:44 (0.40s/it)
2020-02-26 09:40:27,654 [INFO] train: iter: 25029, loss min|avg|max: 0.000|0.002| 0.032, batch-p@3: 100.00%, ETA: 0:32:33 (0.39s/it)
2020-02-26 09:40:28,010 [INFO] train: iter: 25030, loss min|avg|max: 0.000|0.001| 0.020, batch-p@3: 100.00%, ETA: 0:29:14 (0.35s/it)
2020-02-26 09:40:28,362 [INFO] train: iter: 25031, loss min|avg|max: 0.000|0.000| 0.024, batch-p@3: 100.00%, ETA: 0:28:54 (0.35s/it)
2020-02-26 09:40:28,716 [INFO] train: iter: 25032, loss min|avg|max: 0.000|0.002| 0.052, batch-p@3: 100.00%, ETA: 0:29:04 (0.35s/it)
2020-02-26 09:40:29,074 [INFO] train: iter: 25033, loss min|avg|max: 0.000|0.002| 0.065, batch-p@3: 100.00%, ETA: 0:29:28 (0.36s/it)
2020-02-26 09:40:29,488 [INFO] train: iter: 25034, loss min|avg|max: 0.000|0.000| 0.004, batch-p@3: 100.00%, ETA: 0:34:00 (0.41s/it)
2020-02-26 09:40:29,857 [INFO] train: iter: 25035, loss min|avg|max: 0.000|0.000| 0.002, batch-p@3: 100.00%, ETA: 0:30:12 (0.37s/it)
2020-02-26 09:40:30,233 [INFO] train: iter: 25036, loss min|avg|max: 0.000|0.000| 0.003, batch-p@3: 100.00%, ETA: 0:30:56 (0.37s/it)
2020-02-26 09:40:30,585 [INFO] train: iter: 25037, loss min|avg|max: 0.000|0.000| 0.000, batch-p@3: 100.00%, ETA: 0:28:52 (0.35s/it)
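As an aside on the issue title: the plateau value is itself diagnostic. With `margin: soft`, the paper's batch-hard loss is `softplus(hardest_positive - hardest_negative)` per anchor, and `softplus(0) = ln 2 ≈ 0.693`. A loss stuck near 0.7 therefore usually means the embedding has collapsed, so the hardest-positive and hardest-negative distances are (nearly) equal. A minimal NumPy sketch of that computation (a re-derivation for illustration, not the repo's TF code):

```python
import numpy as np

def softplus(x):
    # Numerically stable ln(1 + e^x).
    return np.log1p(np.exp(-np.abs(x))) + np.maximum(x, 0.0)

def batch_hard_soft_margin(embeddings, labels):
    """Batch-hard triplet loss with soft margin:
    per anchor, softplus(hardest_positive_dist - hardest_negative_dist)."""
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dists = np.sqrt((diff ** 2).sum(-1) + 1e-12)            # pairwise Euclidean
    same = labels[:, None] == labels[None, :]
    hardest_pos = np.where(same, dists, -np.inf).max(axis=1)
    hardest_neg = np.where(~same, dists, np.inf).min(axis=1)
    return softplus(hardest_pos - hardest_neg)               # per-anchor losses

# A fully collapsed embedding (all points identical) pins every anchor's
# loss at softplus(0) = ln 2 ~ 0.693 -- the ~0.7 plateau from the title.
collapsed = batch_hard_soft_margin(np.zeros((8, 4)),
                                   np.array([0, 0, 1, 1, 2, 2, 3, 3]))
```

Once the network starts from ImageNet-pretrained weights, the embedding separates the identities and the loss drops well below ln 2, which matches the healthy log above.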
Pandoro commented 4 years ago

Yes, I mean the imagenet pretrained checkpoint.

See the Readme for the instructions: https://github.com/VisualComputingInstitute/triplet-reid/blob/master/README.md#pre-trained-initialization
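For illustration, a hedged sketch of the kind of invocation the README describes: download the ImageNet-pretrained tf-slim `resnet_v1_50` checkpoint and pass it via `--initial_checkpoint` (the flag names below mirror the parameter dump in this thread; the exact paths and checkpoint filename are assumptions, so check the README for the authoritative steps):

```python
# Hypothetical helper that builds the train.py command line for a fresh run
# initialized from ImageNet weights rather than resumed from a re-ID checkpoint.
def build_train_cmd(ckpt="resnet_v1_50.ckpt", root="experiments/market"):
    return [
        "python", "train.py",
        "--train_set", "data/market1501/market1501_train.csv",
        "--image_root", "data/market1501",
        "--experiment_root", root,
        "--initial_checkpoint", ckpt,   # ImageNet weights, not checkpoint-25000
        "--batch_p", "16",              # reduced from the default to fit GPU memory
    ]
```

The key difference from the log above is that `initial_checkpoint` points at the pretrained backbone instead of a previous re-ID checkpoint, and `resume` is not set.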

On Wed, Feb 26, 2020, 13:49 Mike Azatov notifications@github.com wrote:

Thanks @Pandoro for a prompt reply! 👍

You are correct, I didn't do anything special to load a pretrained network. I was wondering, did you mean loading the pretrained resnet_v1_50? If it's not loaded automatically, where would I specify the weights, and where should I put them?

Or did you mean start from the checkpoint you guys provide? In that case, I immediately get good results. I would like to figure out how to train it from scratch though.


mazatov commented 4 years ago

Thanks, I don't know how I missed that 🤦‍♂

(screenshot: training-loss curve)

mazatov commented 4 years ago

I couldn't find your reported mAP scores anywhere. Does this look normal to you for the Market-1501 dataset?

mAP: 65.72% | top-1: 81.71% | top-2: 87.53% | top-5: 93.17% | top-10: 95.43%
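For context on what those metrics measure, here is a toy NumPy sketch of how CMC top-k and mAP are computed from a query-gallery distance matrix. It deliberately ignores the Market-1501 camera-id and junk-image filtering that the repo's evaluation script handles, and it assumes every query has at least one gallery match:

```python
import numpy as np

def evaluate(dist, query_ids, gallery_ids):
    """Toy evaluation: CMC curve and mAP from a (num_query, num_gallery)
    distance matrix. No camera/junk filtering, unlike the full protocol."""
    order = np.argsort(dist, axis=1)                      # gallery ranked per query
    matches = gallery_ids[order] == query_ids[:, None]    # True where identity matches
    # CMC: fraction of queries with a correct match at rank <= k.
    cmc = matches.cumsum(axis=1).clip(max=1).mean(axis=0)
    # Average precision per query: mean of precision at each correct hit.
    aps = []
    for m in matches:
        hit_ranks = np.flatnonzero(m)                     # 0-based ranks of hits
        precision = (np.arange(len(hit_ranks)) + 1) / (hit_ranks + 1)
        aps.append(precision.mean())
    return cmc, float(np.mean(aps))
```

Usage on a tiny example: with two queries where the first is matched at rank 1 and the second at rank 3, `cmc[0]` is 0.5 and mAP is the mean of 1 and 1/3.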