Open mtamburrano opened 3 years ago
From version 20200916, fp16 storage and arithmetic are enabled by default on supported platforms. You can disable them via
net.opt.use_fp16_storage = false;
net.opt.use_fp16_arithmetic = false;
before loading your model param file.
Thank you @nihui, I tried that but unfortunately I get the same results.
Anyway, as I said, it is the commit https://github.com/Tencent/ncnn/commit/d04c05651f1dda38d0330fc6b7334b8e47cd1246 that breaks things, and I cannot see anything there related to fp16 storage or arithmetic.
That commit changes something in the interp layer, which, as far as I know, is also used by the resize functions.
I have probably identified the source of the issue, and it is in my preprocessing code, not in the forward pass. More precisely, my code looks like:
ncnn::Mat input;
input.create(24, 24, image.c * 5, image.elemsize);
...
// channels is meant to be a view into a slice of input's channels
ncnn::Mat channels = input.channel_range(image.c * i, image.c);
// resize pad to 24x24, writing the result into channels (and thus into input)
ncnn::resize_bilinear(pad, channels, 24, 24);
...
Here we do some work to fill the input Mat.
With the old release, input was filled correctly, but with the new version this no longer seems to be the case: input is left untouched.
I'm still investigating why...
I can confirm that the issue is what I reported. This patch:
ncnn::Mat channels = input.channel_range(image.c * i, image.c);
ncnn::resize_bilinear(pad, channels, 24, 24);
// copy the resized data back into input, element by element
for (int c = 0; c < channels.c; ++c) {
    float* ptr = input.channel(image.c * i + c);
    for (int w = 0; w < 24; ++w) {
        for (int h = 0; h < 24; ++h) {
            *(ptr + (24 * w) + h) = channels.channel(c).row(w)[h];
        }
    }
}
fixes everything (I'm sorry, I'm sure there is a more idiomatic "ncnn" way to fill the Mat, but I'm pretty new to it and I'm not sure how to do it properly).
So, for some reason, resize_bilinear fills the Mat channels, but channels is no longer treated as a reference to a portion of input, as it was with the old library.
Here: https://github.com/Tencent/ncnn/blob/34bd5ef16151417da32e1c258f5c55cd8501434e/src/layer/interp.cpp#L447
the data pointer of the Mat is being reassigned; that's why, if we pass this function a reference to another Mat, the first Mat is not updated.
I also see a memory-leak risk here.
Is this the expected behavior? It didn't work like this in the past.
I have been using an MTCNN network for a while to extract faces and landmarks from images. The original model was converted from Caffe without issues using the caffe2ncnn tool.
Now I have updated ncnn and I have 2 issues:
So I tried to figure out what is happening, and I managed to find a working (for my case) release. I found that release 20200727 returns the values I was used to getting (so my tests are green again). From release 20200916 onwards I get this different behavior. So I built the library against almost every commit between the two releases and found that the "offending" commit is this one: https://github.com/Tencent/ncnn/commit/d04c05651f1dda38d0330fc6b7334b8e47cd1246
It looks like it is something related to the interp layer. The interesting part is that there is no such layer in my converted model, at least I cannot find it in the ncnn.param file.
So, do you have any idea what is happening here? Thank you in advance.