snuspl / dolphin

Define file format for neural network configuration file and implement a parser for it. #83

Closed beomyeol closed 9 years ago

beomyeol commented 9 years ago

Since the neural network configuration is currently hard-coded, we need to provide a way for users to configure their own neural networks. To do this, we need to define a file format for neural network configurations. A user can describe a neural network model and store it in the file format we define. Our DNN system will load and parse this file, and build a neural network according to the configuration. The file format should be human-readable, such as JSON or XML.
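
For instance, a human-readable JSON configuration might look something like the following. This is only a rough sketch; the field names are hypothetical and not yet decided:

{
  "batch_size": 10,
  "layers": [
    {
      "type": "FullyConnected",
      "num_input": 784,
      "num_output": 50,
      "activation_function": "sigmoid"
    }
  ]
}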

dongjoon-hyun commented 9 years ago

For your information, may I briefly introduce the Caffe prototxt file format? It is actually a protocol buffer text format. The following is example code.

net: "models/bvlc_reference_caffenet/train_val.prototxt"
test_iter: 1000
test_interval: 1000
base_lr: 0.01
lr_policy: "step"
gamma: 0.1
stepsize: 100000
display: 20
max_iter: 450000
momentum: 0.9
weight_decay: 0.0005
snapshot: 10000
snapshot_prefix: "models/bvlc_reference_caffenet/caffenet_train"
solver_mode: GPU
name: "CaffeNet"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 227
    mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
  }
  data_param {
    source: "examples/imagenet/ilsvrc12_train_lmdb"
    batch_size: 256
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mirror: false
    crop_size: 227
    mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
  }
  data_param {
    source: "examples/imagenet/ilsvrc12_val_lmdb"
    batch_size: 50
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
...

Of course, you don't need to use it; I'm not pushing you. I think a small JSON file format is enough for SNU. However, please keep in mind that @swlsw and I will have to build a Caffe model converter for Dolphin. If possible, please make the structure and variable names similar.

beomyeol commented 9 years ago

@dongjoon-hyun, thank you for introducing the Caffe prototxt file format. @jsjason and I were considering a small JSON file format. We need to discuss this further, but I think we will make the structure and variable names similar to Caffe's.

beomyeol commented 9 years ago

@dongjoon-hyun, we've decided to use prototxt like Caffe does. We will try to make the structure and variable names similar to Caffe's, but there might be some differences because of the differences between Caffe and our system.

dongjoon-hyun commented 9 years ago

What differences are those? I'm just curious.

beomyeol commented 9 years ago

One example is how activation functions are applied. In Caffe, an activation function is applied by adding a separate activation layer such as TanHLayer or SigmoidLayer. However, in our DNN framework and in DL4J, the activation function is specified on each layer instead of being added as another layer.
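
For comparison, in Caffe a sigmoid after a fully-connected layer is written as its own layer, following Caffe's prototxt conventions (the layer names here are illustrative):

layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "data"
  top: "ip1"
  inner_product_param {
    num_output: 50
  }
}
layer {
  name: "sig1"
  type: "Sigmoid"
  bottom: "ip1"
  top: "ip1"
}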

Here's the neural network configuration example that I am planning to use.

batch_size: 10
step_size: 1e-2
parameter_provider {
  type: "local"
}
layer {
  type: "FullyConnected"
  num_input: 784
  num_output: 50
  fully_connected_param {
    init_weight: 1e-4
    init_bias: 2e-4
    activation_function: "sigmoid"
  }
}
layer {
  type: "FullyConnected"
  num_input: 50
  num_output: 10
  fully_connected_param {
    init_weight: 1e-2
    init_bias: 2e-2
    activation_function: "sigmoid"
  }
}
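
Since prototxt is just the protocol buffer text format, the parser itself can be a thin wrapper around protobuf's TextFormat. Below is a minimal sketch, assuming a hypothetical .proto schema mirroring the fields above and its generated Java classes; all names here are illustrative, not the actual Dolphin code.

// neural_network.proto (hypothetical schema mirroring the example above)
message FullyConnectedParameter {
  optional float init_weight = 1;
  optional float init_bias = 2;
  optional string activation_function = 3;
}

message LayerConfiguration {
  optional string type = 1;
  optional int32 num_input = 2;
  optional int32 num_output = 3;
  optional FullyConnectedParameter fully_connected_param = 4;
}

message ParameterProviderConfiguration {
  optional string type = 1;
}

message NeuralNetworkConfiguration {
  optional int32 batch_size = 1;
  optional float step_size = 2;
  optional ParameterProviderConfiguration parameter_provider = 3;
  repeated LayerConfiguration layer = 4;
}

The parsing code then only reads the file and hands it to the generated builder:

import com.google.protobuf.TextFormat;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical parser sketch: protobuf's TextFormat does the actual
// parsing of the prototxt text into the generated message classes.
public final class NeuralNetworkConfigurationParser {

  public static NeuralNetworkConfiguration parse(final String path) throws IOException {
    final NeuralNetworkConfiguration.Builder builder = NeuralNetworkConfiguration.newBuilder();
    try (FileReader reader = new FileReader(path)) {
      TextFormat.merge(reader, builder);
    }
    return builder.build();
  }

  public static void main(final String[] args) throws IOException {
    final NeuralNetworkConfiguration conf = parse(args[0]);
    System.out.println("batch_size = " + conf.getBatchSize());
    for (final LayerConfiguration layer : conf.getLayerList()) {
      System.out.println(layer.getType() + ": "
          + layer.getNumInput() + " -> " + layer.getNumOutput());
    }
  }
}
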
dongjoon-hyun commented 9 years ago

Yep, I know the difference. It's okay. Your file format is better and more concise.