Hi! Thanks for your brilliant work!

I ran into a problem while training with my own dataset. I have successfully built an annotation file and an LMDB, but during training meta.write_number is always 0, even though the total loss has dropped to about 20. Does this mean that not even a single image was trained successfully? Since there is no error message, how can I debug this?
The following is my training log:
I0212 15:30:31.223089 1433 caffe.cpp:217] Using GPUs 1
I0212 15:30:31.272265 1433 caffe.cpp:222] GPU 1: GeForce GTX 750 Ti
I0212 15:30:31.502509 1433 solver.cpp:48] Initializing solver from parameters:
base_lr: 4e-05
display: 5
max_iter: 600000
lr_policy: "step"
gamma: 0.333
momentum: 0.9
weight_decay: 0.0005
stepsize: 136106
snapshot: 2000
snapshot_prefix: "/usr/local/RMPPE/training/model_allan/"
solver_mode: GPU
device_id: 1
net: "pose_train_test.prototxt"
train_state {
level: 0
stage: ""
}
I0212 15:30:31.502662 1433 solver.cpp:91] Creating training net from net file: pose_train_test.prototxt
I0212 15:30:31.505437 1433 net.cpp:58] Initializing net from parameters:
.
.
.
I0213 13:00:19.271257 1442 cpm_data_transformer.cpp:73] dataset: COCO; img_size: [720 x 960]; meta.annolist_index: 7; meta.write_number: 0; meta.total_write_number: 12; meta.epoch: 7883
I0213 13:00:29.130000 1442 cpm_data_transformer.cpp:73] dataset: COCO; img_size: [720 x 960]; meta.annolist_index: 7; meta.write_number: 0; meta.total_write_number: 12; meta.epoch: 7884
I0213 13:00:33.296797 1433 solver.cpp:228] Iteration 23650, loss = 22.878
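
To rule out a problem with the LMDB itself, I plan to dump a few records with the small script below. This is only a minimal sanity-check sketch: it assumes the LMDB was written by genLMDB.py as serialized Caffe Datum protos, and LMDB_PATH is just a placeholder for my own dataset location.

# Minimal sanity check: confirm the LMDB has entries and that each record
# parses as a Caffe Datum with a plausible shape.
# Assumes serialized Datum protos (as I believe genLMDB.py writes them);
# LMDB_PATH is a placeholder, not my real path.
import lmdb
from caffe.proto import caffe_pb2

LMDB_PATH = "/path/to/my_dataset_lmdb"  # placeholder

env = lmdb.open(LMDB_PATH, readonly=True, lock=False)
with env.begin() as txn:
    print("total entries:", txn.stat()["entries"])
    for i, (key, value) in enumerate(txn.cursor()):
        datum = caffe_pb2.Datum()
        datum.ParseFromString(value)
        print(key, "-> channels:", datum.channels,
              "height:", datum.height, "width:", datum.width)
        if i >= 4:  # the first few records are enough for a quick check
            break

If the entries parse and the shapes look reasonable, then I suppose the issue is on the cpm_data_transformer.cpp side (how write_number is read back from the metadata channel) rather than in the LMDB itself, but I am not sure how to confirm that.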
Thanks for your help!