snuspl / dolphin

14 stars 2 forks source link

Makes neural network load configuration from a protocol buffer text formatted file. #88

Closed beomyeol closed 9 years ago

beomyeol commented 9 years ago

I've implemented the feature that a neural network model loads the configuration from a file. I assumed that the configuration file is stored in Google protocol buffer text format. The protocol buffer message definition is located in src/main/proto/neural_network.proto. I used the configuration file architecture and variable names that are similar to Caffe's ones.

Here's the example of the neural network configuration that uses all the options which are implemented until now.

batch_size: 10
stepsize: 1e-2
parameter_provider {
  type: "local"
}
layer {
  type: "FullyConnected"
  num_input: 784
  num_output: 50
  fully_connected_param {
    random_seed: 10
    init_weight: 1e-4
    init_bias: 2e-4
    activation_function: "sigmoid"
  }
}
layer {
  type: "FullyConnected"
  num_input: 50
  num_output: 10
  fully_connected_param {
    random_seed: 10
    init_weight: 1e-2
    init_bias: 2e-2
    activation_function: "sigmoid"
  }
}

Closes #83.

dongjoon-hyun commented 9 years ago

I think you designed a mapping from InnerProduct of caffe into fully_connected_param of dolphin right?

https://github.com/BVLC/caffe/blob/master/src/caffe/proto/caffe.proto

It looks good to me.

beomyeol commented 9 years ago

@dongjoon-hyun, Yes it is except one that InnerProduct of Caffe does not have activation function but ours has that.

For your information, I haven't committed the run script with this pull request. Instead, I am gonna push other branch dnn_run_script that includes the run script, sample configuration, and sample dataset (a sample of MNIST database). If you'd like to run, please use that branch :)

dongjoon-hyun commented 9 years ago

Thank you, @beomyeol . I'll do that.

By the way, for the NN testcase, I also proposed #64 (AutoEncoder). How do you think about this instead of MNIST? AutoEncoder does not need any resource files.

dongjoon-hyun commented 9 years ago

@beomyeol , In dnn_run_script branch, the resources file is too verbose. Could you get some lines from the following files?

Hmm, maybe not. What I wanted is to use 'integer' format for values. As you know, mnist data is integer.

beomyeol commented 9 years ago

@dongjoon-hyun, Thank you for letting me know AutoEncoder.

For the resource file in dnn_run_script, I agree that it is too verbose. I followed the format that deeplearning4j uses. If you're okay, I'll change it more concise.

dongjoon-hyun commented 9 years ago

Thank you!

By the way, could you do the following command? When I give 100 as an interation number, the job is terminated in FORCE_CLOSED by signal: SIG_TERMINATE.

./run_nn.sh -local true -maxIter 100 -conf sample_nn_conf -input sample_nn_data
beomyeol commented 9 years ago

@dongjoon-hyun, I think that's because of timeout. Please increase timeout value and try it again. :)

beomyeol commented 9 years ago

@dongjoon-hyun, I change the resource file in dnn_run_script. Please take a look! Thank you for your comment.

I'd like to change the default delimiter of DNN input file from ' '(3 whitespaces) to ,(comma) like the resource file in dnn_run_script. Please let me know if you're okay.

dongjoon-hyun commented 9 years ago

@beomyeol , thank you for timeout. And, for the default delimiter, it's okay. By the way, is it possible to rebase dnn_run_script, too?

dongjoon-hyun commented 9 years ago

Hi, @beomyeol . Could you change maxIter as 100 instead of 10? For algorithm, the accuracy goes about 82% (with iteration 100, stepsize: 1e-3). I think it's okay for now.

Also, Data looks good. It's evenly sampled. (Thank you again.)

$ awk -F, '{print $785}' sample_nn_data | sort | uniq -c
  17 
 109 0
 114 1
 114 2
 113 3
 108 4
  96 5
 119 6
 118 7
 104 8
 105 9

I'm finished my review for this issue! @jsjason , could you review this too?

jsjason commented 9 years ago

Yes, I'd like to do a pass too.

dongjoon-hyun commented 9 years ago

Thank you! @jsjason .

beomyeol commented 9 years ago

@dongjoon-hyun, I've changed the default delimiter to , , maxIter to 100, and stepsize to 1e-3. I've also added timeout option for EXAMPLE USAGE in run_nn.sh. In addition, it is possible to rebase dnn_run_script. @jsjason, I'll rebase it if you're okay. Please let me know.

dongjoon-hyun commented 9 years ago

Thank you, @beomyeol . LGTM!

jsjason commented 9 years ago

@dongjoon-hyun @beomyeol What do you mean by 'rebase dnn_run_script'? This branch doesn't have any merge conflicts with master. Did you mean 'include the commits of dnn_run_script here', perhaps?

beomyeol commented 9 years ago

@jsjason, I addressed some comments that you left. I left comments about the parts I haven't changed them yet.

For 'rebase', what I meant is what you said.

jsjason commented 9 years ago

It seems a few comments from each of us have gone unnoticed.

Implementing an autoencoder seems great, but I'd prefer we add a new PR instead of replacing this one. The issues addressed by this PR has itstheir own good values.

jsjason commented 9 years ago

The branch dnn_run_script looks ready to be merged with this PR, if @beomyeol and @dongjoon-hyun are both okay with that.

dongjoon-hyun commented 9 years ago

Please merge! :)

beomyeol commented 9 years ago

I am also okay for merging dnn_run_script. :)

beomyeol commented 9 years ago

@jsjason, @dongjoon-hyun. I included all the commits in dnn_run_script and addressed the comments.

For you information, maven-protoc-plugin is not available through Maven Central (https://github.com/sergei-ivanov/maven-protoc-plugin/blob/master/README.md). So I added the plugin repository for this.

dongjoon-hyun commented 9 years ago

Sorry, but the side effect is not proper for me. @jsjason , should we use another repository for just that plugin? I think the way of REEF is reasonable.

dongjoon-hyun commented 9 years ago

Moreover, I met the following error message with build error.

[ERROR] Plugin com.google.protobuf.tools:maven-protoc-plugin:0.4.2 or one of its dependencies could not be resolved: Failure to find com.google.protobuf.tools:maven-protoc-plugin:jar:0.4.2 in http://sergei-ivanov.github.com/maven-protoc-plugin/repo/releases/ was cached in the local repository, resolution will not be reattempted until the update interval of protoc-plugin has elapsed or updates are forced -> [Help 1]
dongjoon-hyun commented 9 years ago

Could you move bin/sample_nn_data into src/test/resources/data/? In the master branch, all data resouce files are moved there already.

jsjason commented 9 years ago

@beomyeol, please double-check that the build and tests succeed. I didn't know the plugin wasn't on maven, though I think it wouldn't hurt to have an extra dependency. After all, using maven central is a gigantic dependency by itself. Since @beomyeol is the one who did the work, I think we should ask his opinion too. @beomyeol, do you prefer doing a little more debugging, or simply reverting that last commit? And @dongjoon-hyun, please let us know if you have any more problems other than the build failure.

dongjoon-hyun commented 9 years ago

@jsjason . I told repository, not dependency. It has a huge difference.

Moreover, I hope dolphin has the consistency with REEF in various perspective for developers. In fact, I can not find any benefit to add extra dependency. Why should we be different from REEF?

Sorry, but I cannot agree with the way of adding extra repository just in order to clean-up ant legacy.

beomyeol commented 9 years ago

@dongjoon-hyun. Sorry for your inconvenience. I didn't have build problem that you experienced. I'll find the way to fix it. In addition, I'll move sample_nn_data to src/test/resource/data/ @jsjason. It is not problem to do debugging or reverting the commit. Acutally, I hesitated before using maven-protoc-plugin because of the extra plugin repository. I prefer not using extra dependencies since they are not authorized.

dongjoon-hyun commented 9 years ago

Hi, @beomyeol . When I run mvn protoc:compile, it turns out that it fails to download that that from github.com. Did you clone that github and build locally?

$ mvn protoc:compile
[INFO] Scanning for projects...
[WARNING] 
[WARNING] Some problems were encountered while building the effective model for edu.snu.reef:dolphin:jar:0.1-SNAPSHOT
[WARNING] 'build.plugins.plugin.version' for org.codehaus.mojo:build-helper-maven-plugin is missing. @ line 252, column 15
[WARNING] 
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING] 
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING] 
[WARNING] The POM for com.google.protobuf.tools:maven-protoc-plugin:jar:0.4.2 is missing, no dependency information available
[WARNING] Failed to retrieve plugin descriptor for com.google.protobuf.tools:maven-protoc-plugin:0.4.2: Plugin com.google.protobuf.tools:maven-protoc-plugin:0.4.2 or one of its dependencies could not be resolved: Failure to find com.google.protobuf.tools:maven-protoc-plugin:jar:0.4.2 in http://sergei-ivanov.github.com/maven-protoc-plugin/repo/releases/ was cached in the local repository, resolution will not be reattempted until the update interval of protoc-plugin has elapsed or updates are forced
Downloading: http://sergei-ivanov.github.com/maven-protoc-plugin/repo/releases/org/codehaus/mojo/maven-metadata.xml
Downloading: http://sergei-ivanov.github.com/maven-protoc-plugin/repo/releases/org/apache/maven/plugins/maven-metadata.xml
Downloading: https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-metadata.xml
Downloading: https://repo.maven.apache.org/maven2/org/codehaus/mojo/maven-metadata.xml
Downloaded: https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-metadata.xml (13 KB at 9.5 KB/sec)
Downloaded: https://repo.maven.apache.org/maven2/org/codehaus/mojo/maven-metadata.xml (20 KB at 11.3 KB/sec)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 3.322 s
[INFO] Finished at: 2015-08-29T16:35:58+09:00
[INFO] Final Memory: 11M/245M
[INFO] ------------------------------------------------------------------------
[ERROR] No plugin found for prefix 'protoc' in the current project and in the plugin groups [org.apache.maven.plugins, org.codehaus.mojo] available from the repositories [local (/Users/dongjoon/.m2/repository), protoc-plugin (http://sergei-ivanov.github.com/maven-protoc-plugin/repo/releases/), central (https://repo.maven.apache.org/maven2)] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/NoPluginFoundForPrefixException
beomyeol commented 9 years ago

@dongjoon-hyun. No, I didn't. Could you change the part of pom.xml as following and try it again?

Before

<pluginRepository>
  <id>protoc-plugin</id>
  <url>http://sergei-ivanov.github.com/maven-protoc-plugin/repo/releases/</url>
</pluginRepository>

To

<pluginRepository>
  <id>protoc-plugin</id>
  <url>https://dl.bintray.com/sergei-ivanov/maven/</url>
</pluginRepository>

Sorry for you inconvenience again.

dongjoon-hyun commented 9 years ago

Thank you, @beomyeol . It works now. I also checked in my Mac and in Ubuntu. It's time to decide, @jsjason . I'll follow your decision anyway since it's a project owned by SNU.

jsjason commented 9 years ago

Two out of three prefer using ant over the new plugin, so let's just go with ant. Sorry @beomyeol, but it looks like you'll have to revert the commit 5866d66.

dongjoon-hyun commented 9 years ago

Sorry and thank you, @beomyeol and @jsjason .

bgchun commented 9 years ago

@beomyeol @jsjason @dongjoon-hyun Thanks for resolving the protoc issue. I was about to ask to be consistent with REEF. :)

bgchun commented 9 years ago

Feel free to merge it once pom.xml is changed.

beomyeol commented 9 years ago

@bgchun @dongjoon-hyun @jsjason. I reverted the commit 5866d66 that is the modification of maven-protoc-plugin.

@dongjoon-hyun. I moved sample_nn_data to src/test/resources/data and sample_nn_conf to src/test/resources/configuration. Also, I renamed the run script, data, configuration file to use neuralnet instead of nn. I think neuralnet is more clear. Please take a look at this :)

dongjoon-hyun commented 9 years ago

Thank you, @beomyeol . I finished review. Test passed and everything looks good except the above one line.

beomyeol commented 9 years ago

@dongjoon-hyun. Thank you. I addressed you comment.

dongjoon-hyun commented 9 years ago

@jsjason , could you review and merge this then?

jsjason commented 9 years ago

I didn't get a chance to review the new parts, @dongjoon-hyun.

dongjoon-hyun commented 9 years ago

Sorry, @jsjason . But It's just a roll-back to your reviewed version. I thought you agreeed because you're quiet 15 hours after @bgchun says 'Feel free to merge it once pom.xml is changed.'

Last but not least, I hesitated to text you on sunday. Anyway, you can revert any change if you want. I didn't delete the branch feature-83-dnn_conf.