caleb221 closed this issue 4 years ago
I think my data wasn't being normalized. I tried applying this function before sending the data into the convolutions, and I think it fixed the problem:
```c
dl_matrix3d_t *normalizeData(dl_matrix3du_t *in, int w, int h)
{
    dl_matrix3d_t *out = dl_matrix3d_alloc(1, w, h, in->c);
    int n = w * h * in->c; // one entry per element, across all channels
    for (int i = 0; i < n; i++) {
        // map uint8 [0, 255] to roughly [-1, 1)
        out->item[i] = (in->item[i] - 128.0f) / 128.0f;
    }
    return out;
}
```
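A quick sanity check of that normalization (my own sketch, not library code): `(x - 128) / 128` maps the uint8 range [0, 255] onto roughly [-1, 1).

```python
import numpy as np

pixels = np.arange(256, dtype=np.float32)    # all possible uint8 values
normalized = (pixels - 128.0) / 128.0

assert normalized.min() == -1.0              # pixel 0 maps to -1.0
assert normalized.max() == (255 - 128) / 128 # pixel 255 maps to 0.9921875
```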
Ahhh, I notice that you are using the global_pooling function, which squeezes the feature map to 1x1xC. Unfortunately, we have not yet provided that general pooling function, since stride in convolution can be used to reduce the feature map instead. We will add it, but if you need it urgently, I'm afraid you would have to write the function yourself.
BTW, in the code you don't need to allocate the output for each layer, since calling dl_matrix3dff_conv_3x3 returns the output. Just declaring the pointer is fine.
Oh, that's good to know. I am in a very time-constrained situation, so I think I will have to implement it myself. If I understand correctly, global pooling is a summation over all values in each channel, outputting a single value per channel? Also, thanks, that will definitely be much easier on the eyes in the future. As a side question, is it valid to split a convolution into two parts and join them back together at the end in order to avoid a watchdog timeout?
using something similar to this method https://en.wikipedia.org/wiki/Overlap%E2%80%93add_method
A very simple understanding of global pooling is to perform (1, w, h, c) --> (1, 1, 1, c).
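A minimal numpy sketch of that shape transformation (my own illustration, assuming NHWC layout as used by dl_matrix3d_t; both average and max variants shown):

```python
import numpy as np

def global_avg_pool(x):
    """Global average pooling: (1, h, w, c) -> (1, 1, 1, c)."""
    return x.mean(axis=(1, 2), keepdims=True)

def global_max_pool(x):
    """Global max pooling: (1, h, w, c) -> (1, 1, 1, c)."""
    return x.max(axis=(1, 2), keepdims=True)

x = np.arange(24, dtype=np.float32).reshape(1, 2, 3, 4)  # (1, h=2, w=3, c=4)
print(global_avg_pool(x).shape)  # (1, 1, 1, 4)
```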
> As a side question, is it valid to split a convolution into two parts and join them back together at the end in order to avoid a watchdog timeout?
To avoid long execution times, you can split your kernel into several parts, i.e. original kernel: (n, w, h, c), split into e.g. 4: (1st n/4, w, h, c), (2nd n/4, w, h, c), etc. You will get outputs like (1, w', h', 1st n/4), (1, w', h', 2nd n/4), etc. Then concat the outputs, or just write each result into the corresponding position of the final output, which is (1, w', h', n).
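A hedged sketch of why this works (using a 1x1 convolution expressed as a matrix product, not the actual dl_matrix3dff_conv_3x3 API): splitting the kernel along its output-filter axis, convolving each piece, and concatenating the partial results along the channel axis reproduces the full output.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 8, 8, 16)).astype(np.float32)  # input (1, h, w, c)
k = rng.standard_normal((32, 16)).astype(np.float32)       # 1x1 kernel: (n=32, c=16)

def conv1x1(x, k):
    # (1, h, w, c) @ (c, n) -> (1, h, w, n)
    return x @ k.T

full = conv1x1(x, k)

# Split the kernel into 4 groups of n/4 filters, convolve each group
# separately, then concatenate the outputs along the channel axis.
parts = [conv1x1(x, k[i * 8:(i + 1) * 8]) for i in range(4)]
joined = np.concatenate(parts, axis=-1)

assert np.allclose(full, joined)
```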
As a sanity check, a global pool should take something like the following form, correct? i.e. a maxpool implemented as nested for loops over (n, w, h, c)?
Sorry, GitHub messed up the format.
If it's unclear what I mean, here is code for the initial guess and one adapted from Caffe's pool.h (actually, both use the same equations; I am mostly asking about the order in which the data should be laid out). `in` is the input matrix (dl_matrix3d_t) and `out` is defined as dl_matrix3d_alloc(1, 1, 1, in->c). Initial guess:
Caffe adaptation:
This seems to be permuting the whole filter as desired, but I'm unsure if it's completely correct. For some reason I also had to add abs() around in->stride, w, and h (they were negative in the input matrix; not sure if that's a bug or means I'm traversing the matrix upside down).
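On the data-order question: assuming a C-contiguous NHWC buffer (which is what the shapes in this thread suggest dl_matrix3d_t uses), the flat index of element (n, h, w, c) is ((n*H + h)*W + w)*C + c. A small numpy check of that formula:

```python
import numpy as np

N, H, W, C = 1, 2, 3, 4
x = np.arange(N * H * W * C).reshape(N, H, W, C)  # C-contiguous NHWC
flat = x.ravel()

def nhwc_index(n, h, w, c):
    # Flat offset of element (n, h, w, c) in a C-contiguous NHWC buffer.
    return ((n * H + h) * W + w) * C + c

for n in range(N):
    for h in range(H):
        for w in range(W):
            for c in range(C):
                assert flat[nhwc_index(n, h, w, c)] == x[n, h, w, c]
```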
Hi @caleb221, please update this repo; the updated function may help you. https://github.com/espressif/esp-face/blob/b94b0ab005c253937001327cceb9d83d41c49a88/lib/include/dl_lib_matrix3d.h#L315-L333
Oh great, thank you! By the way, does this library have a reshape function?
I have a matrix of shape (1,64,192,4) being returned by the dl_matrix3d_concat_4() function (I need to load the weights 1/4 at a time to avoid stack overflow/watchdog timers), but I need it in the shape (1,64,768,1) for a fully connected layer.
An example of working code (compiles and runs within an O-Net implementation; getonet_cutX_fc_0() is a function that fills the input matrix->items with the desired weights):
```c
dl_matrix3d_t *fc1 = dl_matrix3d_alloc(1, 64, 192, 1);
dl_matrix3d_t *fc2 = dl_matrix3d_alloc(1, 64, 192, 1);
dl_matrix3d_t *fc3 = dl_matrix3d_alloc(1, 64, 192, 1);
dl_matrix3d_t *fc4 = dl_matrix3d_alloc(1, 64, 192, 1);

// load each quarter of the fully connected weights
getonet_cut0_fc_0(fc1);
getonet_cut1_fc_0(fc2);
getonet_cut2_fc_0(fc3);
getonet_cut3_fc_0(fc4);

dl_matrix3d_t *fc_in; // = dl_matrix3d_alloc(1, 64, 768, 1);
fc_in = dl_matrix3d_concat_4(fc1, fc2, fc3, fc4);

dl_matrix3d_free(fc1);
dl_matrix3d_free(fc2);
dl_matrix3d_free(fc3);
dl_matrix3d_free(fc4);
```
Output: FC_IN -> N: 1, C: 4, W: 64, H: 192, i.e. (1,64,192,4). I am going to guess that I can reassign the indices to (1,64,768,1), but I am unsure whether the order will stay correct.
Yes, if your data for the fc layer is 'nhwc', there is no difference in memory layout between the shapes (1, 64, 192, 4) and (1, 64, 768, 1).
Hello! I would like to know how numbers are handled in the deep learning softmax function. After performing 3 convolutions I have output numbers for the first 1000 spaces (printed out). The shape of this convolution input is (32,3,3,16), then prelu and maxpool. The following operation is another convolution of shape (2,1,1,32) to obtain the scores. The output of this matrix is mostly NaN, plus some numbers that make general sense for a score (0-1). So my question is: where are these NaNs coming from? Are they expected? Should I write a small function to obtain the scores at each stride instead of all scores for all convolutions? Could it be that my data is not normalized? As a side note, I have a hunch it could be something to do with weight ordering: the ESP lib is (N H W C), while Caffe's output is (N C H W).
in -> (N C H W)

I have translated these formats using the method below with numpy:
- first step: np.moveaxis(in, 1, 3), out -> (N W H C)
- second step: np.moveaxis(in, 1, 2), out -> (N H W C)
- flatten the array using np.reshape(in, -1)
- load the array into dl_matrix3d_t using a for loop

(using out = in.transpose(0,2,3,1) also provides the same result)

As an example (this is the input to the bounding box layer): original shape from Caffe (4, 32, 1, 1), new shape for the ESP lib (4, 1, 1, 32), then flatten and load.
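For what it's worth, a quick numpy check of the NCHW -> NHWC conversion (my own sketch, not from the thread): a single np.moveaxis(x, 1, 3) call already produces (N, H, W, C) and agrees element-for-element with x.transpose(0, 2, 3, 1), so it may be worth printing the intermediate shapes to rule out an axis-order mix-up as the NaN source.

```python
import numpy as np

x = np.arange(4 * 32 * 1 * 1, dtype=np.float32).reshape(4, 32, 1, 1)  # Caffe NCHW

a = np.moveaxis(x, 1, 3)     # move the channel axis to the end
b = x.transpose(0, 2, 3, 1)  # equivalent permutation

assert a.shape == (4, 1, 1, 32)
assert np.array_equal(a, b)

flat = a.reshape(-1)  # flatten in NHWC order, ready to load into dl_matrix3d_t
```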
OUTPUT FROM MODEL:
Here is the code for the neural layers of the network (I have erased the free() calls and some printout methods to save space, but will attach the whole code if context is needed):
```c
getpnet_conv1_0(filt_c1);  // pnetVals.h
getpnet_conv1_1(bias1);    // pnetVals.h
getpnet_prelu1_0(prelu1);  // pnetVals.h
// 3x3 convolution 1
out1 = dl_matrix3dff_conv_3x3(in, filt_c1, bias1, 1, 1, PADDING_SAME);

score_out = dl_matrix3dff_conv_3x3(out3, score_filter, score_bias, 1, 1, PADDING_SAME);
bbox_out = dl_matrix3dff_conv_3x3(out3, bbox_filter, bbox_bias, 1, 1, PADDING_SAME);
dl_matrix3d_free(out3);
dl_matrix3d_free(bbox_filter);
dl_matrix3d_free(bbox_bias);
printf("\n\nALLOCATING OUTPUT...\n\n\n");

//=========================================
// SET MEMORY
//
// dl_matrix3d_free(score_out);

printf("\n\n");
int i;
printf("\n\nFROM PNET LOOP:\nscore: ");
for (i = 0; i < 1500; i++) {
    printf("---:%f :---", score_out->item[i]);
}
printf("\n\n");

xSemaphoreGive(skeletonKey);
vTaskDelay(1000 / portTICK_PERIOD_MS);
vTaskDelete(NULL);
```