moskewcz / boda

Boda: A C++ Framework for Efficient Experiments in Computer Vision

Boda example #2

Open chaeseok opened 8 years ago

chaeseok commented 8 years ago

Is there an example? And how can I run it?

moskewcz commented 8 years ago

probably the first thing to run is the tests. assuming the octave tests are disabled, to run the remaining tests you'll need the pascal VOC 2007 dataset as well as various pre-trained fully-convolutional models. for the pascal dataset, no particular setup is required other than to set the appropriate path in the config file. see the instructions in:

https://github.com/moskewcz/boda/blob/master/mwm_boda_cfg.xml

in summary, only the following paths in the config file need to be valid for testing (without octave support):

  boda_output_dir="."
  pascal_data_dir="/home/moskewcz/bench/VOCdevkit/VOC2007"
  caffe_dir="/home/moskewcz/git_work/caffe_dev"
  models_dir="/path/to/models"

for the models, one can train from scratch or obtain standard pretrained models for alexnet, nin, googlenet, and so on, and convert them to fully convolutional ones using the 'standard' net-surgery approach in caffe. alternately, boda includes a mode that converts models with FC layers into fully convolutional ones as well. in general, obtaining and preparing such nets is outside the scope of the boda framework, and such nets cannot in general be publicly redistributed. however, i believe all the nets needed for the current boda tests happen to be redistributable, but i can't/shouldn't post a public link to my copies here due to bandwidth limitations. email me privately and we can work something out.

i did commit at least the .prototxt files used by the tests to the repo, as it's obviously difficult to guess which versions of the various nets i'm using -- especially for those that i've modified for testing. many tests don't require 'valid' model files (i.e. those used for performance and comparison testing), so it would be possible to create random model files for these nets and still get reasonable test coverage. of course, demos that require the nets to be valid wouldn't work in that case.

depending on the version of caffe, type of graphics card, versions of nvidia libraries, the particular model files used, and so on, the numerical results of some tests may differ from the results stored in the repo. however, as long as the results are 'okay' in terms of absolute error, this is acceptable. it's just a deficiency/limitation of the testing framework that manual effort is required to validate results in such cases.

also, due to limitations of the framework, currently all the other paths, even if unused by any to-be-run tests, must be set to some value (i.e. they cannot be omitted, but need not point to valid directories).

now, to actually run the tests, after setup is complete, add boda to the path, make a testing directory, and run the test_all mode there (a rough sketch of these steps follows the transcript below). the results should ideally look like this:

moskewcz@maaya:~/git_work/boda/run/tr2$ boda test_all ; date 
WARNING: test_cmds: some modes had test commands that failed to initialize. perhaps these modes aren't enabled?
  FAILING MODES: oct_featpyra oct_resize run_dfc test_oct
TIMERS:  CNT     TOT_DUR      AVG_DUR    TAG  
           4      49.033s      12.258s    test_all_subtest
          32      49.027s       1.532s    test_cmds_cmd
          27      2.584ms      0.095ms    diff_command
          19     56.747ms      2.986ms    read_pascal_image_list_file
           2      4.774ms      2.387ms    read_results_file
           1      0.348ms      0.348ms    score_results_for_class
           4      0.043ms      0.010ms    read_text_file
          13      20.624s       1.586s    nvrtc_compile
        2550     14.235ms      0.005ms    cu_launch_and_sync
         638    637.069ms      0.998ms    caffe_copy_layer_blob_data
           1     62.138ms     62.138ms    caffe_init
          24      16.103s    670.960ms    caffe_create_net
         428    699.259ms      1.633ms    caffe_set_layer_blob_data
         737     41.386ms      0.056ms    img_copy_to
         870    406.546ms      0.467ms    subtract_mean_and_copy_img_to_batch
          20    156.467ms      7.823ms    dense_cnn
         677       3.132s      4.627ms    caffe_fwd_t::run_fwd
         677       2.751s      4.064ms    caffe_copy_output_blob_data
         588       2.066s      3.514ms    sparse_cnn
          60       3.143s     52.399ms    net_upsamp_cnn
          60     83.570ms      1.392ms    upsample_2x
          60    949.701ms     15.828ms    img_upsamp_cnn
          69       3.223s     46.723ms    conv_pipe_fwd_t::run_fwd
Tue Dec 15 18:19:47 PST 2015
moskewcz@maaya:~/git_work/boda/run/tr2$ 
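for reference, the setup steps leading up to a run like the one above might look roughly like the following sketch. the exact location of the built boda binary depends on how you built it, and the scratch directory name is just the one from the transcript; both are illustrative, not prescribed.

# add the directory containing the built boda binary to PATH
# ('bin' here is an assumption -- use wherever your build put the binary)
export PATH="$HOME/git_work/boda/bin:$PATH"
# make an empty scratch directory and run the test_all mode there
mkdir -p ~/git_work/boda/run/tr2
cd ~/git_work/boda/run/tr2
boda test_all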

after running the tests, there are a couple of example demo commands you can try from the file: https://github.com/moskewcz/boda/blob/master/doc/demo_notes.txt

the first and third ones are a good starting place. note that classification demos expect the caffe ilsvrc auxiliary data to have been downloaded and placed in the usual location in the caffe tree (caffe_dir/data/ilsvrc12/synset_words.txt). in general, you should be able to run the similar (but better documented) caffe classification demos before trying to run the boda equivalents -- they have similar requirements in terms of what inputs are needed.
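if you have a standard caffe checkout, that auxiliary data is normally fetched with caffe's own helper script; a minimal sketch, assuming the stock caffe directory layout (substitute your actual caffe_dir):

cd /path/to/caffe_dir/data/ilsvrc12
./get_ilsvrc_aux.sh   # downloads synset_words.txt and the other ilsvrc12 auxiliary files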

by modifying the command line for any test or demo command, you can change what backend is used for computation among caffe, boda-rtc (opencl), and boda-rtc (cuda). there are various test cases that exercise all three backends. in particular, see the list of full-command tests here:

https://github.com/moskewcz/boda/blob/master/test/test_cmds.xml

the tests with names such as 'test_compute_XXX' are good examples of using the various backends and setting the various code-generation/optimization options for the boda-rtc backend.

chaeseok commented 8 years ago

Some tests fail, mainly with error messages like the ones below:

  1. MAD FAILS num_mad_fail=1
  2. syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0) out of memory

I found that the compute capability of your GPU is 5.2, but ours is only 3.0, so I want to upgrade our GPU. Could you tell me which GPU you use?
moskewcz commented 8 years ago

on my dev box i use a GeForce GTX 980 (reported as: Device 0: "GeForce GTX 980"), but a titan-X (desktop) or M40 (server) would be a better choice if possible. however, any nvidia maxwell-based card with 4GB+ of memory should be reasonable.
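for reference, the "Device 0: ..." line is the format printed by the cuda samples' deviceQuery tool, which also reports the compute capability (e.g. 5.2 for maxwell, 3.x for kepler); nvidia-smi gives a quicker look at what's installed:

nvidia-smi   # lists each nvidia card with its name, driver version, and total/used memory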

kepler cards (like yours) might work as well, and tuning kernels for them might be interesting, but i haven't explored that much. while there are a lot of kepler cards out there, pascal is just around the corner, so i'm not sure it makes sense to focus much effort on them -- at least not for me personally. i do occasionally run on kepler cards, such as K40s, but i haven't for a while.

assuming your GPU has <4GB of memory, and that that's the cause of the out-of-memory issues, they might be fixable by decreasing the batch size of the particular tests that fail. or you could perhaps just ignore them.
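the batch size for a particular test may come from its command line in test_cmds.xml or from the model's .prototxt, so check those first. if it is a caffe-style deploy .prototxt, the batch size is the first input_dim line, and an edit along these lines would halve it (the file name and dimension values below are purely illustrative assumptions, not actual boda test files):

# hypothetical sketch: halve the batch dimension in a caffe-style deploy prototxt
# (assumes the original first input_dim line reads "input_dim: 20")
sed -i 's/^input_dim: 20$/input_dim: 10/' /path/to/models/some_net/deploy.prototxt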

for the MAD FAILS, i'd need to see the full output for the particular failing test to say anything. it's possible that the error is just above the tolerance value for that test, but that nothing's really wrong, or it could be a real bug/error. both are more likely since you're running on kepler instead of maxwell. if you're comparing against caffe, it also would depend on the configuration of caffe and which mode/library it's using for computation.