Closed tomoaki0705 closed 3 years ago
Yes, the result of a register comparison is expected to be an integer register, so reinterpreting the comparison result as an integer could also help. By the way, 0xffffffff is NaN (to be precise, -NaN), not -Inf. Is it true that there's an implicit conversion between different NaNs in calculations?
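The bit patterns under discussion can be checked with a small portable program. This is only a sketch: bitsToFloat and classify are hypothetical helper names, not part of OpenCV or NEON, and it assumes float is IEEE 754 binary32.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit pattern as a float (assumes IEEE 754 binary32).
static float bitsToFloat(std::uint32_t bits)
{
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Hypothetical helper: name the kind of value a bit pattern encodes.
static const char* classify(std::uint32_t bits)
{
    const float f = bitsToFloat(bits);
    const bool negative = (bits >> 31) != 0;  // sign bit is bit 31
    if (std::isnan(f)) return negative ? "-NaN" : "NaN";
    if (std::isinf(f)) return negative ? "-Inf" : "Inf";
    return "finite";
}
```

All exponent bits set with a non-zero mantissa is a NaN, so classify(0xffffffff) reports -NaN, while -Inf is the pattern 0xff800000 (all exponent bits set, zero mantissa).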
Thanks for the review @savuor
0xffffffff is -NaN, not -Inf. Thanks for pointing that out.
NaN ends up in 0x7fc00000
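#include <arm_neon.h>
#include <cstdint>
#include <iostream>

int main()
{
    // Raw bit patterns: +0.0 in every lane, and four different NaN encodings.
    uint32_t inputZero[] = {0, 0, 0, 0};
    uint32_t inputNaN[] = {0xffffffff, 0x7fc00000, 0x7fc00001, 0x7fffffff};
    uint32_t resultSIMD[4];
    // Load as integers and reinterpret, instead of casting pointers to float*.
    float32x4_t a = vreinterpretq_f32_u32(vld1q_u32(inputZero));
    float32x4_t b = vreinterpretq_f32_u32(vld1q_u32(inputNaN));
    vst1q_u32(resultSIMD, vreinterpretq_u32_f32(vaddq_f32(a, b)));
    for (int i = 0; i < 4; i++)
    {
        std::cout << std::hex << "0x" << resultSIMD[i] << std::endl;
    }
}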
On Arm 32bit, the output is
0x7fc00000
0x7fc00000
0x7fc00000
0x7fc00000
The same code produces the following on 64bit
0xffffffff
0x7fc00000
0x7fc00001
0x7fffffff
This behavior is exactly the same with or without NEON.
I think IEEE 754 has a rule for this NaN + 0 case, so I feel one of the two platforms is breaking the standard, but I didn't dig any deeper.
System information (version)
Detailed description
The test opencv_test_rgbd fails with Segmentation fault.
I could see that this was NOT happening on Aarch64 platforms, but only on Arm 32bit platforms.
Tracing with GDB, the access violation was happening here
https://github.com/opencv/opencv_contrib/blob/0def4736191800fd7ab67550b7126dc2ca5871ef/modules/rgbd/src/tsdf.cpp#L267-L268
The index was sometimes a negative value and sometimes larger than the size of volData.
Tracing back, the index is determined by the coordinates ix, iy and iz
https://github.com/opencv/opencv_contrib/blob/0def4736191800fd7ab67550b7126dc2ca5871ef/modules/rgbd/src/tsdf.cpp#L256
I could confirm that sometimes one of them was near the boundary. This function checks the difference between neighbors, so the boundary has to be checked strictly. That check is done at the beginning of the function
https://github.com/opencv/opencv_contrib/blob/0def4736191800fd7ab67550b7126dc2ca5871ef/modules/rgbd/src/tsdf.cpp#L234-L238
The comparison returns 0xffffffff for true or 0 for false. If an index points outside the boundary, 0xffffffff and 0 are added. The type is v_float32x4, so the addition happens in floating-point arithmetic, which is -Inf + 0. This results in 0xffffffff on Aarch64 platforms, so v_check_any, which checks the sign bit of each element, correctly detects whether the comparison was true or not.
On Arm 32bit, the result of -Inf + 0 becomes NaN, which is 0x7fc00000. Taking v_check_any of this always ends up in 0, regardless of the result of the comparison. As far as I can see, this seems to be standard (expected) behavior on the Arm 32bit platform.
Changing the check to take the or of each v_check_any, rather than v_check_any of the add, lets the test pass.
Tested on Raspberry Pi after the modification.
I'll send a patch shortly
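The difference between the two checks can be illustrated with a scalar, single-lane model. This is a sketch under stated assumptions, not the OpenCV code or the actual patch: addDefaultNaN imitates the observed Arm 32bit result by replacing any NaN sum with 0x7fc00000, signBitSet stands in for a per-lane sign-bit test, and all function names are hypothetical.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret between a 32-bit pattern and a float (assumes IEEE 754 binary32).
static float bitsToFloat(std::uint32_t bits)
{
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

static std::uint32_t floatToBits(float f)
{
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Assumption: model the observed Arm 32bit behavior by returning the
// pattern 0x7fc00000 whenever the floating-point sum is a NaN.
static std::uint32_t addDefaultNaN(std::uint32_t a, std::uint32_t b)
{
    const float sum = bitsToFloat(a) + bitsToFloat(b);
    return std::isnan(sum) ? 0x7fc00000u : floatToBits(sum);
}

// Hypothetical stand-in for a sign-bit test on one lane.
static bool signBitSet(std::uint32_t lane)
{
    return (lane >> 31) != 0;
}

// Failing variant: sign-bit test on the floating-point sum of the masks.
static bool anyViaAdd(std::uint32_t maskA, std::uint32_t maskB)
{
    return signBitSet(addDefaultNaN(maskA, maskB));
}

// Variant described above: test each mask separately, then OR the results.
static bool anyViaOr(std::uint32_t maskA, std::uint32_t maskB)
{
    return signBitSet(maskA) || signBitSet(maskB);
}
```

With one mask set to 0xffffffff and the other to 0, anyViaAdd loses the sign bit because the sum is replaced by 0x7fc00000, while anyViaOr never passes the masks through floating-point arithmetic and still reports the out-of-bounds lane.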
Steps to reproduce
Run opencv_test_rgbd on an Arm 32bit platform.
Issue submission checklist