Yingxiu-Chang opened this issue 4 years ago
conv → *scale → +bias
scale = K*alpha
also there is +bias
@AlexeyAB Got it. So there is +bias after ⊙α, and finally batch_normalization is applied, right?
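If I read the reply above correctly, the per-filter post-processing after the convolution is a multiply by scale = K*alpha followed by +bias. A minimal scalar sketch of that ordering (my reading, not darknet's actual code; the names are illustrative):

```c
/* Sketch only: apply scale = K*alpha, then +bias, to one conv output.
 * Order follows the reply above: conv -> *scale -> +bias. */
float scale_bias(float conv_out, float K, float alpha, float bias)
{
    float scale = K * alpha;        /* scale = K*alpha */
    return conv_out * scale + bias; /* *scale, then +bias */
}
```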
@AlexeyAB Hello Sir, when I was reading your darknet code, especially convolutional_kernels.cu, I found this code between lines 592 and 598:
```c
if (l.batch_normalize) {
    forward_batchnorm_layer_gpu(l, state);
}
else {
    add_bias_gpu(l.output_gpu, l.biases_gpu, l.batch, l.n, l.out_w*l.out_h);
}
#endif
```
which means that we can only use BN or bias, but not both at once.
I use BN for all conv-layers (except conv with activation=linear)
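For readers following along: the `else` branch above calls add_bias_gpu(output, biases, batch, n, out_w*out_h), i.e. one bias per filter broadcast over all spatial positions. A scalar sketch of that behavior (my reading of the call signature, not darknet's GPU kernel):

```c
#include <stddef.h>

/* Scalar sketch of what add_bias_gpu appears to do:
 * for each image in the batch and each filter, add that filter's
 * single bias to every spatial position (out_w*out_h values). */
void add_bias_cpu(float *output, const float *biases,
                  size_t batch, size_t nfilters, size_t spatial)
{
    for (size_t b = 0; b < batch; ++b)
        for (size_t f = 0; f < nfilters; ++f)
            for (size_t s = 0; s < spatial; ++s)
                output[(b * nfilters + f) * spatial + s] += biases[f];
}
```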
@AlexeyAB Hello Sir, I am confused about the function float_to_bit in gemm.c, especially lines 1789 to 1811, shown below.
```c
void float_to_bit(float *src, unsigned char *dst, size_t size)
{
    size_t dst_size = size / 8 + 1;
    memset(dst, 0, dst_size);

    size_t i;
    //__m256i all256_sing1 = _mm256_set_epi32(0x80000000, 0x80000000, 0x80000000, 0x80000000, 0x80000000, 0x80000000, 0x80000000, 0x80000000);
    __m256 float_zero256 = _mm256_set1_ps(0.0);

    for (i = 0; i < size; i += 8)
    {
        //__m256i src256 = _mm256_loadu_si256((__m256i *)(&src[i]));
        //__m256i result256 = _mm256_and_si256(src256, all256_sing1); // check sign in 8 x 32-bit floats
        //uint32_t mask = _mm256_movemask_ps(_mm256_castsi256_ps(result256)); // (val >= 0) ? 0 : 1
        ////mask = ~mask; // inverse mask, (val >= 0) ? 1 : 0

        __m256 src256 = _mm256_loadu_ps((float *)(&src[i]));
        __m256 result256 = _mm256_cmp_ps(src256, float_zero256, _CMP_GT_OS);
        uint32_t mask = _mm256_movemask_ps(result256); // (val > 0) ? 1 : 0

        dst[i / 8] = mask;
    }
}
```
The part I don't get is __m256 src256 = _mm256_loadu_ps((float *)(&src[i]));
Each src[i] is a 32-bit floating-point element, but src256 is a 256-bit value. I don't understand how a 32-bit element becomes a 256-bit element.
Thank you so much for your answer.
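A note for other readers hitting the same question: _mm256_loadu_ps does not widen one float. It loads 8 consecutive 32-bit floats starting at &src[i] (8 × 32 = 256 bits), and _mm256_movemask_ps then packs the 8 comparison results into one byte. A scalar sketch equivalent to the AVX loop above (assuming size is a multiple of 8, as the loop does):

```c
#include <stddef.h>
#include <string.h>

/* Scalar equivalent of float_to_bit: each output byte packs the
 * (src[i] > 0) results of 8 consecutive floats; element i lands
 * in bit (i % 8), matching _mm256_movemask_ps lane order. */
void float_to_bit_scalar(const float *src, unsigned char *dst, size_t size)
{
    memset(dst, 0, size / 8 + 1);
    for (size_t i = 0; i < size; ++i)
        if (src[i] > 0.0f)
            dst[i / 8] |= (unsigned char)(1u << (i % 8));
}
```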
@AlexeyAB Hello Sir, I'm working on your yolov3-tiny_xnor model and have a few points of confusion.
I∗W ≈ (sign(I) ⊛ sign(W)) ⊙ K ⊙ α
There is also bin_output=1, and I'm not sure where you use it. Just like the formula in question 1, did you put it before ⊙K or after ⊙α?
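For context on the formula above: in the general XNOR-Net idea (a sketch of the technique, not this repo's actual kernels), sign(I) and sign(W) are packed into bit vectors, their dot product is computed with XNOR plus popcount, and the result is rescaled by α (and elementwise by K). With n bits, matches minus mismatches equals 2·popcount(xnor) − n:

```c
/* Sketch of the XNOR-Net binary dot product (illustrative, not
 * darknet's implementation): bit = 1 means the sign is positive.
 * dot(sign(a), sign(b)) = matches - mismatches = 2*popcount(~(a^b)) - n
 * over the low n bits, then rescaled by alpha.
 * Uses the GCC/Clang builtin __builtin_popcount. */
float xnor_dot(unsigned a_bits, unsigned b_bits, int n, float alpha)
{
    unsigned mask = (n >= 32) ? ~0u : ((1u << n) - 1u);
    unsigned xnor = ~(a_bits ^ b_bits) & mask; /* 1 where signs agree */
    int matches = __builtin_popcount(xnor);
    return alpha * (float)(2 * matches - n);
}
```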