For your information, may I briefly introduce the Caffe prototxt file format? It is actually a protocol buffer text format. The following is example code.
solver.prototxt
net: "models/bvlc_reference_caffenet/train_val.prototxt"
test_iter: 1000
test_interval: 1000
base_lr: 0.01
lr_policy: "step"
gamma: 0.1
stepsize: 100000
display: 20
max_iter: 450000
momentum: 0.9
weight_decay: 0.0005
snapshot: 10000
snapshot_prefix: "models/bvlc_reference_caffenet/caffenet_train"
solver_mode: GPU
name: "CaffeNet"
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: true
crop_size: 227
mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
}
data_param {
source: "examples/imagenet/ilsvrc12_train_lmdb"
batch_size: 256
backend: LMDB
}
}
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mirror: false
crop_size: 227
mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
}
data_param {
source: "examples/imagenet/ilsvrc12_val_lmdb"
batch_size: 50
backend: LMDB
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
...
Of course, you don't need to use it; I'm not pushing you, and I think a small JSON file format is enough for SNU. However, please keep in mind that @swlsw and I will need to build a Caffe model converter for Dolphin. If possible, please make the structure and variable names similar.
@dongjoon-hyun, thank you for the introduction to the Caffe prototxt file format. @jsjason and I were considering a small JSON file format, so we need more discussion on this. But I think we will make the structure and variable names similar to Caffe's.
@dongjoon-hyun, we've decided to use prototxt like Caffe. We are going to try to make the structure and variable names similar to Caffe's, but there might be some differences because of the differences between Caffe and our system.
What are the differences? I'm just curious.
One example is how activation functions are applied. In Caffe, an activation function is applied by adding a separate activation layer such as TanHLayer or SigmoidLayer. However, in our DNN and in DL4J, the activation function is applied within each layer rather than by adding another layer.
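For instance, a sigmoid in Caffe's prototxt is a layer of its own, wired in between two learned layers (a minimal illustrative snippet; the layer and blob names here are made up, not taken from our code):

layer {
  name: "sig1"
  type: "Sigmoid"
  bottom: "ip1"
  top: "ip1"
}

In our format, the same nonlinearity is just a field inside the layer's own parameters, as you can see below.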
Here's the neural network configuration example that I am planning to use.
batch_size: 10
step_size: 1e-2
parameter_provider {
  type: "local"
}
layer {
  type: "FullyConnected"
  num_input: 784
  num_output: 50
  fully_connected_param {
    init_weight: 1e-4
    init_bias: 2e-4
    activation_function: "sigmoid"
  }
}
layer {
  type: "FullyConnected"
  num_input: 50
  num_output: 10
  fully_connected_param {
    init_weight: 1e-2
    init_bias: 2e-2
    activation_function: "sigmoid"
  }
}
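Since prototxt is just the text serialization of protocol buffers, a format like this implies a .proto schema on our side. Here is a rough sketch of what that schema could look like, derived only from the example above; the message names, field names, and field numbers are my assumptions, not a final definition:

// Hypothetical schema sketch for the configuration example above.
message NeuralNetworkConfiguration {
  optional int32 batch_size = 1;
  optional float step_size = 2;
  optional ParameterProviderConfiguration parameter_provider = 3;
  repeated LayerConfiguration layer = 4;
}

message ParameterProviderConfiguration {
  optional string type = 1;  // e.g. "local"
}

message LayerConfiguration {
  optional string type = 1;  // e.g. "FullyConnected"
  optional int32 num_input = 2;
  optional int32 num_output = 3;
  optional FullyConnectedParameter fully_connected_param = 4;
}

message FullyConnectedParameter {
  optional float init_weight = 1;
  optional float init_bias = 2;
  optional string activation_function = 3;  // e.g. "sigmoid"
}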
Yep, I know the difference. It's okay. Your file format is better and more concise.
Since the neural network configuration is hard-coded now, we need to provide a way for users to configure their own neural networks. To do this, we need to define a file format for a neural network configuration. A user can describe a neural network model and store it in the file format we define; our DNN system will then load and parse this file and build a neural network from the configuration. The file format must be human-readable, such as JSON or XML.
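Once a schema like the sketch above is compiled, loading and parsing such a file is mostly handled by protobuf's text-format parser. A minimal Java sketch, assuming a generated NeuralNetworkConfiguration class (hypothetical; it would come from the schema sketch earlier, not from existing code):

import com.google.protobuf.TextFormat;

import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

public final class NetworkConfigurationLoader {

  private NetworkConfigurationLoader() {
  }

  /**
   * Parses a prototxt file into the (hypothetical) generated
   * NeuralNetworkConfiguration message.
   */
  public static NeuralNetworkConfiguration load(final String path) throws IOException {
    final NeuralNetworkConfiguration.Builder builder = NeuralNetworkConfiguration.newBuilder();
    try (Reader reader = new FileReader(path)) {
      // TextFormat.merge reads the human-readable text format into the builder.
      TextFormat.merge(reader, builder);
    }
    return builder.build();
  }
}

From there, building the network would just be iterating over config.getLayerList() and instantiating each layer from its parameters.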