Closed beomyeol closed 9 years ago
I think you designed a mapping from InnerProduct
of caffe
into fully_connected_param
of dolphin
right?
https://github.com/BVLC/caffe/blob/master/src/caffe/proto/caffe.proto
It looks good to me.
@dongjoon-hyun, Yes it is except one that InnerProduct
of Caffe
does not have activation function but ours has that.
For your information, I haven't committed the run script with this pull request. Instead, I am gonna push other branch dnn_run_script
that includes the run script, sample configuration, and sample dataset (a sample of MNIST database). If you'd like to run, please use that branch :)
Thank you, @beomyeol . I'll do that.
By the way, for the NN testcase, I also proposed #64 (AutoEncoder). How do you think about this instead of MNIST? AutoEncoder does not need any resource files.
@beomyeol , In dnn_run_script
branch, the resources file is too verbose. Could you get some lines from the following files?
mnist_train.csv
https://app.box.com/s/o9onnur8t4dolif43nxyjybxykw5mduqmnist_test.csv
https://app.box.com/s/bcg54o19nvj0qtfnxrwhdaoi8gue4sl7Hmm, maybe not. What I wanted is to use 'integer' format for values. As you know, mnist
data is integer.
@dongjoon-hyun, Thank you for letting me know AutoEncoder.
For the resource file in dnn_run_script
, I agree that it is too verbose. I followed the format that deeplearning4j
uses. If you're okay, I'll change it more concise.
Thank you!
By the way, could you do the following command? When I give 100 as an interation number, the job is terminated in FORCE_CLOSED
by signal: SIG_TERMINATE
.
./run_nn.sh -local true -maxIter 100 -conf sample_nn_conf -input sample_nn_data
@dongjoon-hyun, I think that's because of timeout. Please increase timeout value and try it again. :)
@dongjoon-hyun, I change the resource file in dnn_run_script
. Please take a look!
Thank you for your comment.
I'd like to change the default delimiter of DNN input file from ' '
(3 whitespaces) to ,
(comma) like the resource file in dnn_run_script
. Please let me know if you're okay.
@beomyeol , thank you for timeout
. And, for the default delimiter, it's okay.
By the way, is it possible to rebase dnn_run_script
, too?
Hi, @beomyeol . Could you change maxIter
as 100 instead of 10?
For algorithm, the accuracy goes about 82% (with iteration 100, stepsize: 1e-3). I think it's okay for now.
Also, Data looks good. It's evenly sampled. (Thank you again.)
$ awk -F, '{print $785}' sample_nn_data | sort | uniq -c
17
109 0
114 1
114 2
113 3
108 4
96 5
119 6
118 7
104 8
105 9
I'm finished my review for this issue! @jsjason , could you review this too?
Yes, I'd like to do a pass too.
Thank you! @jsjason .
@dongjoon-hyun, I've changed the default delimiter to ,
, maxIter
to 100, and stepsize
to 1e-3.
I've also added timeout
option for EXAMPLE USAGE
in run_nn.sh
.
In addition, it is possible to rebase dnn_run_script
. @jsjason, I'll rebase it if you're okay. Please let me know.
Thank you, @beomyeol . LGTM!
@dongjoon-hyun @beomyeol What do you mean by 'rebase dnn_run_script
'? This branch doesn't have any merge conflicts with master. Did you mean 'include the commits of dnn_run_script
here', perhaps?
@jsjason, I addressed some comments that you left. I left comments about the parts I haven't changed them yet.
For 'rebase', what I meant is what you said.
It seems a few comments from each of us have gone unnoticed.
Implementing an autoencoder seems great, but I'd prefer we add a new PR instead of replacing this one. The issues addressed by this PR has itstheir own good values.
The branch dnn_run_script
looks ready to be merged with this PR, if @beomyeol and @dongjoon-hyun are both okay with that.
Please merge! :)
I am also okay for merging dnn_run_script
. :)
@jsjason, @dongjoon-hyun. I included all the commits in dnn_run_script
and addressed the comments.
For you information, maven-protoc-plugin
is not available through Maven Central (https://github.com/sergei-ivanov/maven-protoc-plugin/blob/master/README.md). So I added the plugin repository for this.
Sorry, but the side effect is not proper for me. @jsjason , should we use another repository for just that plugin? I think the way of REEF is reasonable.
Moreover, I met the following error message with build error.
[ERROR] Plugin com.google.protobuf.tools:maven-protoc-plugin:0.4.2 or one of its dependencies could not be resolved: Failure to find com.google.protobuf.tools:maven-protoc-plugin:jar:0.4.2 in http://sergei-ivanov.github.com/maven-protoc-plugin/repo/releases/ was cached in the local repository, resolution will not be reattempted until the update interval of protoc-plugin has elapsed or updates are forced -> [Help 1]
Could you move bin/sample_nn_data
into src/test/resources/data/
? In the master branch, all data resouce files are moved there already.
@beomyeol, please double-check that the build and tests succeed. I didn't know the plugin wasn't on maven, though I think it wouldn't hurt to have an extra dependency. After all, using maven central is a gigantic dependency by itself. Since @beomyeol is the one who did the work, I think we should ask his opinion too. @beomyeol, do you prefer doing a little more debugging, or simply reverting that last commit? And @dongjoon-hyun, please let us know if you have any more problems other than the build failure.
@jsjason . I told repository
, not dependency. It has a huge difference.
Moreover, I hope dolphin
has the consistency with REEF in various perspective for developers. In fact, I can not find any benefit to add extra dependency. Why should we be different from REEF?
Sorry, but I cannot agree with the way of adding extra repository just in order to clean-up ant
legacy.
@dongjoon-hyun. Sorry for your inconvenience. I didn't have build problem that you experienced. I'll find the way to fix it. In addition, I'll move sample_nn_data
to src/test/resource/data/
@jsjason. It is not problem to do debugging or reverting the commit. Acutally, I hesitated before using maven-protoc-plugin
because of the extra plugin repository. I prefer not using extra dependencies since they are not authorized.
Hi, @beomyeol . When I run mvn protoc:compile
, it turns out that it fails to download that that from github.com. Did you clone that github and build locally?
$ mvn protoc:compile
[INFO] Scanning for projects...
[WARNING]
[WARNING] Some problems were encountered while building the effective model for edu.snu.reef:dolphin:jar:0.1-SNAPSHOT
[WARNING] 'build.plugins.plugin.version' for org.codehaus.mojo:build-helper-maven-plugin is missing. @ line 252, column 15
[WARNING]
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING]
[WARNING] The POM for com.google.protobuf.tools:maven-protoc-plugin:jar:0.4.2 is missing, no dependency information available
[WARNING] Failed to retrieve plugin descriptor for com.google.protobuf.tools:maven-protoc-plugin:0.4.2: Plugin com.google.protobuf.tools:maven-protoc-plugin:0.4.2 or one of its dependencies could not be resolved: Failure to find com.google.protobuf.tools:maven-protoc-plugin:jar:0.4.2 in http://sergei-ivanov.github.com/maven-protoc-plugin/repo/releases/ was cached in the local repository, resolution will not be reattempted until the update interval of protoc-plugin has elapsed or updates are forced
Downloading: http://sergei-ivanov.github.com/maven-protoc-plugin/repo/releases/org/codehaus/mojo/maven-metadata.xml
Downloading: http://sergei-ivanov.github.com/maven-protoc-plugin/repo/releases/org/apache/maven/plugins/maven-metadata.xml
Downloading: https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-metadata.xml
Downloading: https://repo.maven.apache.org/maven2/org/codehaus/mojo/maven-metadata.xml
Downloaded: https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-metadata.xml (13 KB at 9.5 KB/sec)
Downloaded: https://repo.maven.apache.org/maven2/org/codehaus/mojo/maven-metadata.xml (20 KB at 11.3 KB/sec)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 3.322 s
[INFO] Finished at: 2015-08-29T16:35:58+09:00
[INFO] Final Memory: 11M/245M
[INFO] ------------------------------------------------------------------------
[ERROR] No plugin found for prefix 'protoc' in the current project and in the plugin groups [org.apache.maven.plugins, org.codehaus.mojo] available from the repositories [local (/Users/dongjoon/.m2/repository), protoc-plugin (http://sergei-ivanov.github.com/maven-protoc-plugin/repo/releases/), central (https://repo.maven.apache.org/maven2)] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/NoPluginFoundForPrefixException
@dongjoon-hyun. No, I didn't. Could you change the part of pom.xml
as following and try it again?
Before
<pluginRepository>
<id>protoc-plugin</id>
<url>http://sergei-ivanov.github.com/maven-protoc-plugin/repo/releases/</url>
</pluginRepository>
To
<pluginRepository>
<id>protoc-plugin</id>
<url>https://dl.bintray.com/sergei-ivanov/maven/</url>
</pluginRepository>
Sorry for you inconvenience again.
Thank you, @beomyeol . It works now. I also checked in my Mac and in Ubuntu. It's time to decide, @jsjason . I'll follow your decision anyway since it's a project owned by SNU.
Two out of three prefer using ant over the new plugin, so let's just go with ant. Sorry @beomyeol, but it looks like you'll have to revert the commit 5866d66.
Sorry and thank you, @beomyeol and @jsjason .
@beomyeol @jsjason @dongjoon-hyun Thanks for resolving the protoc issue. I was about to ask to be consistent with REEF. :)
Feel free to merge it once pom.xml
is changed.
@bgchun @dongjoon-hyun @jsjason. I reverted the commit 5866d66 that is the modification of maven-protoc-plugin.
@dongjoon-hyun. I moved sample_nn_data
to src/test/resources/data
and sample_nn_conf
to src/test/resources/configuration
. Also, I renamed the run script, data, configuration file to use neuralnet
instead of nn
. I think neuralnet
is more clear. Please take a look at this :)
Thank you, @beomyeol . I finished review. Test passed and everything looks good except the above one line.
@dongjoon-hyun. Thank you. I addressed you comment.
@jsjason , could you review and merge this then?
I didn't get a chance to review the new parts, @dongjoon-hyun.
Sorry, @jsjason . But It's just a roll-back to your reviewed version. I thought you agreeed because you're quiet 15 hours after @bgchun says 'Feel free to merge it once pom.xml is changed.'
Last but not least, I hesitated to text you on sunday
. Anyway, you can revert any change if you want.
I didn't delete the branch feature-83-dnn_conf
.
I've implemented the feature that a neural network model loads the configuration from a file. I assumed that the configuration file is stored in Google protocol buffer text format. The protocol buffer message definition is located in
src/main/proto/neural_network.proto
. I used the configuration file architecture and variable names that are similar toCaffe
's ones.Here's the example of the neural network configuration that uses all the options which are implemented until now.
Closes #83.