syfafterzy / SVDNet-for-Pedestrian-Retrieval

Code for https://arxiv.org/abs/1703.05693

double free or corruption (out) #13

Open PayneYong opened 6 years ago

PayneYong commented 6 years ago

After running ./train_basemodel.sh I get the output below. When I then run ./train_RRI.sh, many of the models that step needs are missing. I have tried twice and hit the same problem both times. One round of training takes about 50 hours on my machine, so I cannot afford to keep retrying. Has anyone run into this and could help? Many thanks!

    I0516 15:38:58.631556 14802 sgd_solver.cpp:112] Iteration 14900, lr = 0.0001
    I0516 15:47:50.826786 14802 solver.cpp:239] Iteration 14950 (0.0939505 iter/s, 532.195s/50 iters), loss = 3.34904
    I0516 15:47:50.827069 14802 solver.cpp:258] Train net output #0: loss = 3.34904 ( 1 = 3.34904 loss)
    I0516 15:47:50.827105 14802 sgd_solver.cpp:112] Iteration 14950, lr = 0.0001
    I0516 15:48:18.686805 14806 data_layer.cpp:73] Restarting data prefetching from start.
    I0516 15:56:01.867225 14802 solver.cpp:468] Snapshotting to binary proto file SVDNet/caffenet/linear/linear_iter_15000.caffemodel
    I0516 15:56:10.400070 14802 sgd_solver.cpp:280] Snapshotting solver state to binary proto file SVDNet/caffenet/linear/linear_iter_15000.solverstate
    I0516 15:56:15.440939 14802 solver.cpp:331] Iteration 15000, loss = 4.1492
    I0516 15:56:15.441035 14802 solver.cpp:351] Iteration 15000, Testing net (#0)
    I0516 15:57:14.810365 14802 solver.cpp:418] Test net output #0: accuracy = 0.003
    I0516 15:57:14.812616 14802 solver.cpp:418] Test net output #1: loss = 12.7851 ( 1 = 12.7851 loss)
    I0516 15:57:14.812937 14802 solver.cpp:336] Optimization Done.
    I0516 15:57:14.812999 14802 caffe.cpp:250] Optimization Done.
    double free or corruption (out)
    Aborted at 1526457435 (unix time) try "date -d @1526457435" if you are using GNU date
    PC: @ 0x7fd68e24ce97 gsignal
    SIGABRT (@0x3e8000039d2) received by PID 14802 (TID 0x7fd69059ae80) from PID 14802; stack trace:
    @ 0x7fd68e24cf20 (unknown)
    @ 0x7fd68e24ce97 gsignal
    @ 0x7fd68e24e801 abort
    @ 0x7fd68e297897 (unknown)
    @ 0x7fd68e29e90a (unknown)
    @ 0x7fd68e2a5e75 cfree
    @ 0x7fd68ffb66f2 boost::detail::sp_counted_impl_p<>::dispose()
    @ 0x7fd68ffd1b32 boost::detail::sp_counted_impl_p<>::dispose()
    @ 0x55f851fa59c5 (unknown)
    @ 0x7fd68ffc82d2 boost::detail::sp_counted_impl_p<>::dispose()
    @ 0x7fd6900dd489 caffe::SGDSolver<>::~SGDSolver()
    @ 0x55f851f9f6ea (unknown)
    @ 0x55f851f9b000 (unknown)
    @ 0x7fd68e22fb97 __libc_start_main
    @ 0x55f851f9baba (unknown)
    ./train_basemodel.sh: line 13: 14802 Aborted (core dumped) ./build/tools/caffe train -solver SVDNet/caffenet/models/solver_linear.prototxt -weights ${BasemodelPath}

ghost commented 6 years ago

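Check whether your linear_iter_15000.caffemodel was saved.

Also, judging from the loss, your model has not converged.

You can take a look at this: https://github.com/BVLC/caffe/issues/1333

For training the IDE baseline with Caffe, you can refer to this project: https://github.com/zhunzhong07/IDE-baseline-Market-1501. Good luck.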
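The "has not converged" observation can be checked numerically. Below is a minimal Python sketch that pulls the training-loss values out of a Caffe log and compares them with the loss of a classifier that guesses uniformly at random, which is ln(number of classes) for a softmax loss. The function names `parse_losses` and `chance_loss` are hypothetical, and the class count of 751 is an assumption (the number of training identities usually quoted for Market-1501); it is not stated in this thread.

```python
import math
import re

# Matches solver lines such as
# "... solver.cpp:239] Iteration 14950 (...), loss = 3.34904".
# "." does not match a newline, so each match stays within one log line.
LOSS_RE = re.compile(
    r"Iteration (\d+).*?loss = ([0-9]+(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?)"
)


def parse_losses(log_text):
    """Return (iteration, loss) pairs found in a Caffe training log."""
    return [(int(m.group(1)), float(m.group(2)))
            for m in LOSS_RE.finditer(log_text)]


def chance_loss(num_classes):
    """Softmax loss of a classifier that guesses uniformly at random."""
    return math.log(num_classes)


# Three lines copied from the log in the first comment.
SAMPLE_LOG = """\
I0516 15:47:50.826786 14802 solver.cpp:239] Iteration 14950 (0.0939505 iter/s, 532.195s/50 iters), loss = 3.34904
I0516 15:47:50.827105 14802 sgd_solver.cpp:112] Iteration 14950, lr = 0.0001
I0516 15:56:15.440939 14802 solver.cpp:331] Iteration 15000, loss = 4.1492
"""

NUM_CLASSES = 751  # assumption: Market-1501 training identities
losses = parse_losses(SAMPLE_LOG)
baseline = chance_loss(NUM_CLASSES)
```

Under that assumption the baseline is about 6.62, so the reported test loss of 12.7851 is worse than random guessing, and the test accuracy of 0.003 is close to 1/751. Both point toward a data or label problem rather than toward the crash at exit.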

PayneYong commented 6 years ago

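Thanks a lot. I am a beginner, and I appreciate someone answering my question. linear_iter_15000.caffemodel was saved. I used the code from this repository throughout; the only things I changed were the paths and switching GPU to CPU, and the models were all downloaded from the Baidu Cloud link the author provides. If the model has not converged, could the problem be in the images I generated? Generating the lmdb file requires a train.txt, and mine looks like this:

    0010_c3s3_076044_05.jpg
    0010_c5s3_076487_02.jpg
    0011_c6s4_002527_05.jpg
    0022_c6s1_004076_01.jpg
    ...

Is there something wrong with it?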
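One thing stands out in the listing above: Caffe's convert_imageset tool expects each line of the list file to hold an image path followed by an integer label, and these entries have no label. A minimal Python sketch for spotting such lines follows; the helper name `find_unlabeled_lines` is hypothetical and not part of Caffe or this repository.

```python
def find_unlabeled_lines(lines):
    """Return (line_number, text) for entries without a trailing integer label."""
    bad = []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue  # ignore blank lines
        # Split off the last whitespace-separated field, which should be the label.
        parts = text.rsplit(None, 1)
        if len(parts) != 2 or not parts[1].isdigit():
            bad.append((number, text))
    return bad


# First entry is from the listing above, second is in "path label" form.
sample = [
    "0010_c3s3_076044_05.jpg",
    "dataset/bounding_box_train/0810_c4s4_037210_01.jpg 425",
]
unlabeled = find_unlabeled_lines(sample)
```

To check a real file, the same function can be applied to `open("train.txt").readlines()`; an empty result means every non-blank line ends in an integer label.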

Simon4Yan commented 6 years ago

train.txt

Here is my train.txt. Good luck to you!

PayneYong commented 6 years ago

Thank you very much! What does 425 mean? Is it the person label? And what about 0810?

    dataset/bounding_box_train/0810_c4s4_037210_01.jpg 425

Simon4Yan commented 6 years ago

You can see this web page for more details about the data set, including the meaning of the image names. Good luck to you.
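For readers without access to that page, here is a rough Python sketch of how a list file in the `path label` form could be produced. It rests on my reading of the Market-1501 naming convention, in which `0810_c4s4_037210_01.jpg` means person ID 0810, camera 4, sequence 4, frame 037210, bounding box 01. Under that reading, 425 would be a contiguous class label assigned to person 0810, but that mapping is an assumption and is not confirmed in this thread. The names `parse_name` and `build_list` are hypothetical.

```python
import os
import re

# Assumed pattern: <person>_c<camera>s<sequence>_<frame>_<box>.jpg
NAME_RE = re.compile(r"^(\d{4})_c(\d)s(\d)_(\d{6})_(\d{2})\.jpg$")


def parse_name(path):
    """Split an image file name into its assumed fields."""
    match = NAME_RE.match(os.path.basename(path))
    if match is None:
        raise ValueError("unexpected file name: %s" % path)
    person, camera, sequence, frame, box = match.groups()
    return {
        "person": person,          # kept as a string to preserve leading zeros
        "camera": int(camera),
        "sequence": int(sequence),
        "frame": int(frame),
        "box": int(box),
    }


def build_list(paths):
    """Return 'path label' lines with contiguous 0-based labels per person ID."""
    person_ids = sorted({parse_name(p)["person"] for p in paths})
    label_of = {pid: index for index, pid in enumerate(person_ids)}
    return ["%s %d" % (p, label_of[parse_name(p)["person"]]) for p in paths]


info = parse_name("dataset/bounding_box_train/0810_c4s4_037210_01.jpg")
lines = build_list([
    "a/0010_c3s3_076044_05.jpg",
    "a/0810_c4s4_037210_01.jpg",
    "a/0010_c5s3_076487_02.jpg",
])
```

Labels are made contiguous because a softmax classifier needs class indices in the range 0 to N-1, whereas the raw person IDs have gaps. Whether the attached train.txt was generated exactly this way is not something the thread states.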