META-DREAMER opened this issue 6 years ago
I have also been working on implementing tiny-yolo-voc on my DE1-SoC.
@SmartRoof Have you made any progress?
@hammadj we have it running, but it's slower than we expected.
@johnnydept How did you set up layer_config.h? And what did you do for your weights file? I converted the tiny-yolo-voc (cfg/weights) to caffemodel and prototxt files, then merged the batch-norm layers into the conv layers, and finally used the Matlab script to convert the result into a weights.dat file. Did you do the same? Also, how is the performance, and what board are you using?
Here is where I am so far for the layer_config. Does this look okay?
// TINY YOLO CONFIGURATION
unsigned layer_config[][NUM_CONFIG_ITEM] = {
{ // Layer 1
// layer_type (conv = 0, fc = 1)
0,
//data_w, data_h, data_n, weight_w, weight_h, weight_n, weight_m, bias_size
416, 416, 3, 3, 3, 3, 16, 16,
// memrd_src (0-> data_buf, 1-> output_buf)
0,
// conv_x, conv_y, conv_z, conv_stride, conv_padding, conv_split, conv_relu
416, 416, 16, 1, 1, 1, 1,
// pool_on, pool_x, pool_y, pool_z, pool_size, pool_stride,
1, 208, 208, 16, 2, 2,
// lrn control (on = 1, off = 0)
0,
// memwr_dst (0-> data_buf, 1-> output_buf)
1
},
{ // Layer 2
0,
208, 208, 16, 3, 3, 8, 32, 32,
1,
208, 208, 32, 1, 1, 1, 1,
1, 104, 104, 32, 2, 2,
0,
0
},
{ // Layer 3
0,
104, 104, 32, 3, 3, 8, 64, 64,
0,
104, 104, 64, 1, 1, 1, 1,
1, 52, 52, 64, 2, 2,
0,
1
},
{ // Layer 4
0,
52, 52, 64, 3, 3, 8, 128, 128,
1,
52, 52, 128, 1, 1, 1, 1,
1, 26, 26, 128, 2, 2,
0,
0
},
{ // Layer 5
0,
26, 26, 128, 3, 3, 8, 256, 256,
0,
26, 26, 256, 1, 1, 1, 1,
1, 13, 13, 256, 2, 2,
0,
1
},
{ // Layer 6
0,
13, 13, 256, 3, 3, 8, 512, 512,
1,
13, 13, 512, 1, 1, 1, 1,
1, 13, 13, 512, 2, 1,
0,
0
},
{ // Layer 7
0,
13, 13, 512, 3, 3, 8, 1024, 1024,
0,
13, 13, 1024, 1, 1, 1, 1,
0, 13, 13, 1024, 2, 1,
0,
1
},
{ // Layer 8
0,
13, 13, 1024, 3, 3, 8, 1024, 1024,
1,
13, 13, 1024, 1, 1, 1, 1,
0, 13, 13, 1024, 2, 1,
0,
0
},
{ // Layer 9
0,
13, 13, 1024, 1, 1, 8, 125, 125,
0,
13, 13, 125, 1, 0, 1, 0,
0, 13, 13, 125, 2, 1,
0,
1
},
};
signed char precision_config[][3] ={
{8, 0, -4},//Layer-1
{ 8, 0, -2},//Layer-2
{ 8, 0, -1},//Layer-3
{ 8, -1, -1},//Layer-4
{ 8, -1, -1},//Layer-5
{8, -1, 0},//Layer-6
{8, 0, 2},//Layer-7
{8, 2, 2},//Layer-8
{8, 2, 2}//Layer-9
};
unsigned input_config[4] = {416, 416, 3, 1}; //original image size(dim1, dim2, dim3), batch size
unsigned output_config[3] = {13, 13, 125};//Layer-9 Note: only one result is extracted and verified
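As a quick sanity check of the sizes in each layer_config row, the usual conv/pool output-size formulas can be applied by hand; here is a rough sketch for Layer 1, assuming the standard floor-based formulas (not taken from PipeCNN itself):
% Rough sanity check of the Layer 1 sizes above (assumed standard formulas).
data_w = 416; weight_w = 3; conv_stride = 1; conv_padding = 1;
conv_x = floor((data_w + 2*conv_padding - weight_w) / conv_stride) + 1   % expect 416
pool_size = 2; pool_stride = 2;
pool_x = floor((conv_x - pool_size) / pool_stride) + 1                   % expect 208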
I've been getting errors about the pooling on layer 6 (Error: incorrect setting of pooling input/output size for layer-6!!!). If I disable pooling on layer 6 it starts running, but then hangs while Launching kernel MemWr with local size.... I am testing this in sw_emu, btw.
Here's my setup in main.cpp:
#define IMAGE_FILE_SIZE (416*416*3)
#define WEIGHTS_FILE_SIZE 15730592
#define LAYER_NUM 9
#define CONV_NUM 9
const char *weight_file_path = "./data/yolo/weights.dat";
const char *input_file_path = "./data/yolo/dog.dat";
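As a side note, the expected size of weights.dat can be cross-checked against the network shapes; a rough sketch, assuming one int8 byte per weight and per bias value, with layer shapes taken from the tiny-yolo-voc layer_config above:
% Rough cross-check of WEIGHTS_FILE_SIZE (bytes, one per int8 value).
k    = [3 3 3 3 3 3 3 3 1];                    % kernel sizes
cin  = [3 16 32 64 128 256 512 1024 1024];     % input channels
cout = [16 32 64 128 256 512 1024 1024 125];   % output channels (= bias sizes)
total_bytes = sum(k.^2 .* cin .* cout + cout)  % compare with WEIGHTS_FILE_SIZE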
And here is my weights file, the caffe model, the matlab script to generate weights, as well as the input file: tiny-yolo-config.zip
@doonny Do you have any idea where I could be going wrong?
Tiny-yolo uses SAME padding in max pool, meaning the stride-1 pool in layer 6 outputs the same size as its input. For that, I think you have to manually add some padding. https://stackoverflow.com/a/48393040/1558037
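For the stride-1 pool in layer 6 that works out roughly as follows (a quick sketch of the usual SAME-padding arithmetic, not PipeCNN-specific code):
% SAME pooling for layer 6: 13x13 input, 2x2 pool, stride 1 should stay 13x13.
in = 13; k = 2; s = 1;
out_same  = ceil(in / s)                        % 13 (SAME keeps the spatial size)
pad_total = max((out_same - 1)*s + k - in, 0)   % 1 -> one extra row/column of padding
out_valid = floor((in - k) / s) + 1             % 12, which is what you get unpadded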
@johnnydept I'm still having trouble getting it to run. Can you share the layer_config/weights you used?
@hammadj The thing is, I use a floating-point implementation; fixed point (or the COCO dataset) is my next work item. We have it running on the DE1-SoC at 8 s/image.
@johnnydept Do you know what could be causing a hang? It's stuck on the clWaitForEvents call on layer 1, so the padding on layer 6 shouldn't even matter at this point since it's only layer 1.
Debugged a bit more and found the place where it is hanging: it's in the memWrite function in conv_pipe.cl, at the line that says output = read_channel_intel(pool_ch);. It hangs at (x=112, y=61) for some reason.
@aazz44ss Would you have any idea what's wrong here?
@johnnydept @doonny OK, so I finally got tiny-YOLO running. The problem was that the uchar type used in many places can't hold values higher than 255, so I switched those out for ushort and it's running now.
However, the output I get is not as expected. I've attached the result dump here: result_dump.txt.
I feel like it's because I need to set up the precision config properly for tiny-YOLO. Any idea what the proper precision_config should be for tiny-YOLO? Do I need to change anything with how I am converting the weights? This is my Matlab script for converting weights right now:
caffe.set_mode_cpu();
model = './caffe/tiny-yolo-nobn.prototxt';
weights = './caffe/tiny-yolo-nobn.caffemodel';
net = caffe.Net(model, weights, 'test');
netparams = {{net.params('conv1',1).get_data(),net.params('conv1',2).get_data()}, ...
{net.params('conv2',1).get_data(),net.params('conv2',2).get_data()}, ...
{net.params('conv3',1).get_data(),net.params('conv3',2).get_data()}, ...
{net.params('conv4',1).get_data(),net.params('conv4',2).get_data()}, ...
{net.params('conv5',1).get_data(),net.params('conv5',2).get_data()}, ...
{net.params('conv6',1).get_data(),net.params('conv6',2).get_data()}, ...
{net.params('conv7',1).get_data(),net.params('conv7',2).get_data()}, ...
{net.params('conv8',1).get_data(),net.params('conv8',2).get_data()}, ...
{net.params('conv9',1).get_data(),net.params('conv9',2).get_data()}};
WeightWidth = [ 8; 8; 8; 8; 8; 8; 8; 8; 8];
WeightFrac = [ 8; 8; 8; 8; 8; 8; 8; 8; 8];
MathType = fimath('RoundingMethod', 'Nearest', 'OverflowAction', 'Saturate', 'ProductMode', 'FullPrecision', 'SumMode', 'FullPrecision');
for i=1:9
WeightType{i} = numerictype('Signed',1, 'WordLength', WeightWidth(i), 'FractionLength', WeightFrac(i));
weight{i} = fi(netparams{i}{1}, WeightType{i}, MathType);
bias{i} = fi(netparams{i}{2}, WeightType{i}, MathType);
end
fid = fopen('weights.dat', 'w');
for i=1:9
fwrite(fid, storedInteger(weight{i}), 'int8');
fwrite(fid, storedInteger(bias{i}), 'int8');
end
fclose(fid);
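One possible refinement to the script above (just a sketch, not anything taken from PipeCNN): with WordLength 8 and FractionLength 8, the representable range is only about [-0.5, 0.5), so deriving FractionLength per layer from each layer's weight range may quantize large weights with less saturation:
% Sketch: choose FractionLength per layer from that layer's weight range
% (netparams as defined above; 8-bit signed words assumed).
for i = 1:9
    w_max = max(abs(double(netparams{i}{1}(:))));
    int_bits = max(floor(log2(w_max)) + 1, 0);   % bits needed left of the binary point
    WeightFrac(i) = 7 - int_bits;                % 1 sign bit + int_bits + fraction = 8
end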
@hammadj What do you mean by "merged the batch-norm layers into the conv layers"? Did you write a kernel that does BN after convolution? If so, are you willing to share the code? Thank you.
Hey @hammadj @johnnydept @doonny, I am also thinking of implementing YOLOv2 or tiny-YOLO on an FPGA using OpenCL, and I am planning to use PipeCNN as a reference. Which files do I need to change to make this work for YOLO? I need to write the kernel files, layer config file, and host files according to YOLO, right? Is that all I need to change?
It would be a great help if you could help me with this, as I am still new to OpenCL. Thanks in advance!
@hammadj What do you mean by "merged the batch-norm layers into the conv layers"? Did you write a kernel that does BN after convolution? If so, are you willing to share the code? Thank you.
It refers to a fused batch-norm layer. Once training is finished and the graph is frozen, you can calculate the normalization mean, variance, etc., and fuse them into the weights; after that there is no need to do batch norm during inference.
For more information you can read the TF Lite quantization description.
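In formula terms, folding a batch-norm layer (scale gamma, shift beta, running mean and variance) into the preceding convolution looks roughly like this; a generic sketch with illustrative variable names, not code from this project:
% Generic BN-fusion sketch: W is kh-by-kw-by-cin-by-cout, b is cout-by-1.
scale   = bn_gamma ./ sqrt(bn_var + bn_eps);     % per-output-channel scale
W_fused = W .* reshape(scale, 1, 1, 1, []);      % scale each output channel's kernels
b_fused = bn_beta + (b - bn_mean) .* scale;      % fold the mean/shift into the bias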
I am working on implementing tiny-YOLO using PipeCNN and was just looking for some advice and guidance on the best way to do it and the steps I should take.
I'm going to convert the tiny-YOLO weights file from darknet -> caffe and then use MATLAB fixed point toolbox to convert that to fixed-point weights that PipeCNN will use.
How should I go about updating the layer_config? What exactly is its format?
I will also be updating main.cpp to work with webcam feed.
Is there anything else here that I missed? What else will I need to do to get tiny-YOLO running?
We recently developed a CNN accelerator for the Darknet reference model, which could be helpful for implementing tiny-YOLO. We used a DE10-Nano board based on the Intel Cyclone V SoC FPGA for the implementation. You can check out the entire design flow for implementing the accelerator, along with the relevant code, in this repository: Link