yahoo / CaffeOnSpark

Distributed deep learning on Hadoop and Spark clusters.
Apache License 2.0

Where is the caffe output of test net and test loss #211

Open donglinjy opened 7 years ago

donglinjy commented 7 years ago

When we run Caffe alone, we can get the test net output:

I1220 08:31:50.558744 2105 solver.cpp:317] Iteration 1000, loss = 0.0884654
I1220 08:31:50.558794 2105 solver.cpp:337] Iteration 1000, Testing net (#0)
I1220 08:31:50.863924 2105 solver.cpp:391] Test loss: 0.0866651
I1220 08:31:50.863971 2105 solver.cpp:404] Test net output #0: accuracy = 0.972
I1220 08:31:50.863983 2105 solver.cpp:404] Test net output #1: loss = 0.0866651 (* 1 = 0.0866651 loss)

But in CaffeOnSpark, instead of Caffe's testing net output, we get output that CoS generates itself:

I1212 08:58:36.250149 14322 CaffeNet.cpp:36] Iteration 200, Testing net (#0)
I1212 08:58:36.250187 14322 CaffeNet.cpp:55] Test net output #0: accuracy = 0.88
I1212 08:58:36.250206 14322 CaffeNet.cpp:55] Test net output #1: loss = 0.365099 (* 1 = 0.365099 loss)
I1212 08:58:36.384349 14322 solver.cpp:228] Iteration 200, loss = 0.161356
I1212 08:58:36.384402 14322 solver.cpp:244] Train net output #0: loss = 0.161356 (* 1 = 0.161356 loss)
I1212 08:58:36.384413 14322 sgd_solver.cpp:106] Iteration 200, lr = 0.00985258

Does anyone know whether this is by design, i.e. that in CaffeOnSpark Caffe's own test net output is not printed?

mriduljain commented 7 years ago

It looks like you are running CoS in train-with-validation mode, which interleaves test output after every test interval defined in the solver prototxt. If you just want to train, please set the test interval and test iteration to 0 in the solver file.
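For readers who want to try the suggestion above, here is a minimal Python sketch that zeroes the test settings in a solver file's text. It assumes the settings appear as the `test_interval` and `test_iter` fields of Caffe's solver definition, one per line; the function name `disable_testing` and the sample file contents (including the net filename) are made up for illustration and are not part of CaffeOnSpark.

```python
import re

# Matches a "test_interval: N" or "test_iter: N" line, keeping its indentation.
_TEST_FIELD = re.compile(r"^(\s*)(test_interval|test_iter)\s*:\s*\d+", re.MULTILINE)


def disable_testing(solver_text: str) -> str:
    """Return solver prototxt text with test_interval and test_iter set to 0."""
    return _TEST_FIELD.sub(lambda m: f"{m.group(1)}{m.group(2)}: 0", solver_text)


# Hypothetical solver contents, used only to demonstrate the rewrite.
example = """\
net: "lenet_train_test.prototxt"
test_iter: 100
test_interval: 500
base_lr: 0.01
"""

print(disable_testing(example))
```

This is a plain text substitution rather than a protobuf parse, so it leaves every other line untouched; whether a value of 0 fully disables validation depends on the CoS mode in use, as described in the reply.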


donglinjy commented 7 years ago

@mriduljain, thanks. But I do intend to run with test output. The issue is that in that mode I only get the CaffeOnSpark test net output; Caffe's original output is not in the log. Do you know how I can get it?

junshi15 commented 7 years ago

We added a wrapper around some of the Caffe functions to fit in the Spark framework. The output you see from CoS should be equivalent to what you see from Caffe.