Closed zhangqiudan closed 8 years ago
@CLT29 Training succeeds when I use CPU mode, but it crashes when I use GPU mode.
This seems to be an issue with caffe. Did you check the solver prototxt and make the snapshot directory?
https://github.com/BVLC/caffe/issues/1394
-chris
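A minimal sketch of the snapshot-directory check suggested above, assuming snapshots are written to the directory part of the solver's snapshot_prefix. The helper name snapshot_dir_is_writable is hypothetical and is not part of Caffe or OpenSALICON.

```python
import os
import tempfile


def snapshot_dir_is_writable(snapshot_prefix):
    """Return True if the directory that snapshot_prefix points into exists
    and a file can be created inside it."""
    # A bare prefix such as "finetuned_salicon" resolves to the current directory.
    directory = os.path.dirname(os.path.abspath(snapshot_prefix))
    if not os.path.isdir(directory):
        return False
    try:
        # Creating (and automatically deleting) a temp file proves write access.
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # Placeholder prefix; substitute the value from your own solver prototxt.
    print(snapshot_dir_is_writable("finetuned_salicon"))
```

This only rules out a missing or read-only directory; it does not explain a failure that appears in GPU mode but not in CPU mode.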
-------- Original message -------- From: zhangqiudan Date:11/21/2016 8:30 PM (GMT-05:00) To: CLT29/OpenSALICON Subject: [CLT29/OpenSALICON] Check failed: proto.SerializeToOstream(&output) (#2)
Hi, everyone. I used salicon-net (finetune-salicon.py) to try to train a new model on my own dataset. However, there is an error when I train my model.
error:
I1122 09:19:42.497810 16306 net.cpp:761] Ignoring source layer sec_pool5
I1122 09:19:42.497845 16306 net.cpp:761] Ignoring source layer custom_interpolation_layer
working on 0 of 600
I1122 09:19:42.906407 16306 solver.cpp:228] Iteration 0, loss = 3359.02
I1122 09:19:42.906457 16306 solver.cpp:244] Train net output #0: loss = 3359.02 (* 1 = 3359.02 loss)
I1122 09:19:42.906472 16306 sgd_solver.cpp:106] Iteration 0, lr = 1e-06
F1122 09:19:43.967656 16306 io.cpp:69] Check failed: proto.SerializeToOstream(&output)
*** Check failure stack trace: ***
Aborted (core dumped)
I'm looking forward to your reply. Thank you so much.
I have checked the solver prototxt and created the snapshot directory, but the error still occurs. I then ran the original prototxt downloaded from GitHub in CPU mode, and it succeeded.
@CLT29
Do I need to change the original prototxt to save the caffemodel?
Can you post your solver prototxt file? -chris
By the way, have you verified that you have enough disk space to save the snapshots (which can be 1GB each)?
-chris
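The disk-space question can be answered with a quick check like the sketch below. It assumes the roughly 1GB-per-snapshot figure mentioned above; the function name and its default are illustrative only.

```python
import shutil


def has_room_for_snapshots(path, num_snapshots, bytes_per_snapshot=1024 ** 3):
    """Return True if the filesystem holding `path` has enough free space
    for `num_snapshots` snapshots of `bytes_per_snapshot` bytes each."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= num_snapshots * bytes_per_snapshot


if __name__ == "__main__":
    # "." stands in for the directory where snapshots will be written.
    print(has_room_for_snapshots(".", num_snapshots=5))
```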
Yes, I have verified that there is enough disk space.
Solver:
train_net: "finetune_salicon.prototxt"
base_lr: 0.000001
lr_policy: "step"
gamma: 0.1
stepsize: 4000
display: 20
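iter_size: 1
momentum: 0.9
weight_decay: 0.0005
# We disable standard caffe solver snapshotting and implement our own snapshot function
snapshot: 0
# We still use the snapshot prefix, though
snapshot_prefix: "finetuned_salicon"
solver_mode: GPU
solver_type: ADADELTA
delta: 1e-6
I did not change the solver prototxt.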
Can you confirm if this still happens if you remove the solver_type: ADADELTA and delta: 1e-6 lines? -chris
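One way to try this without editing the file by hand is sketched below. It assumes the solver prototxt keeps one field per line; drop_solver_fields is a hypothetical helper, not a Caffe function, and it works on plain text rather than parsing protobuf.

```python
def drop_solver_fields(solver_text, fields=("solver_type", "delta")):
    """Return solver_text without the lines that set any of `fields`.

    Assumes one `key: value` pair per line; comment lines are kept as-is.
    """
    kept = []
    for line in solver_text.splitlines():
        # Ignore anything after '#', then take the text before the first ':'.
        key = line.split("#", 1)[0].split(":", 1)[0].strip()
        if key in fields:
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    example = (
        "base_lr: 0.000001\n"
        "solver_mode: GPU\n"
        "solver_type: ADADELTA\n"
        "delta: 1e-6\n"
    )
    print(drop_solver_fields(example))
```

Writing the result to a new file and pointing the training script at it keeps the original solver prototxt available for comparison.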
I just tried removing the solver_type line, and training now runs successfully. Thank you very much for solving the problem for me. Best wishes to you. -qiudan
You may want to try adjusting the learning rate (perhaps increasing it slightly), since you will now be using SGD instead of ADADELTA. If you change the learning rate, monitor the loss to make sure it decreases and does not diverge. You may also want to try a different solver_type, such as ADAM.
Good luck. You may want to report this problem to Caffe; I do not know what causes it in your setup.
-chris
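A rough sketch of the loss monitoring suggested above, assuming the training log contains lines of the form "Iteration N, loss = X" as in the log posted in this thread. The divergence test (a non-finite loss, or a loss more than `factor` times the first recorded loss) is an illustrative heuristic, not something Caffe provides.

```python
import math
import re

# Matches e.g. "solver.cpp:228] Iteration 0, loss = 3359.02"
LOSS_RE = re.compile(r"Iteration (\d+), loss = (\S+)")


def parse_losses(log_text):
    """Return a list of (iteration, loss) pairs found in a Caffe training log."""
    return [(int(m.group(1)), float(m.group(2))) for m in LOSS_RE.finditer(log_text)]


def looks_divergent(losses, factor=10.0):
    """Heuristic: flag a run if any loss is NaN/inf or exceeds
    `factor` times the first recorded loss."""
    if not losses:
        return False
    first = losses[0][1]
    return any(not math.isfinite(v) or v > factor * first for _, v in losses)
```

For example, passing the contents of a saved training log to parse_losses and then to looks_divergent gives a quick yes/no signal after each learning-rate change; plotting the parsed pairs is a more informative follow-up.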
OK, I will do that. Thanks for your advice. I will try to train a model on my own dataset. Best wishes.