Closed fengyuentau closed 1 week ago
Even weirder: if I unroll by a factor of 2 instead of 4, the test passes. Could this be related to the compiler?
#if CV_ENABLE_UNROLLED
    for( ; j <= m - 2; j += 2 )
    {
        WT t0 = d_buf[j] + WT(b_data[j])*al;
        WT t1 = d_buf[j+1] + WT(b_data[j+1])*al;
        d_buf[j] = t0;
        d_buf[j+1] = t1;
        // t0 = d_buf[j+2] + WT(b_data[j+2])*al;
        // t1 = d_buf[j+3] + WT(b_data[j+3])*al;
        // d_buf[j+2] = t0;
        // d_buf[j+3] = t1;
    }
#endif
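To narrow down whether the unrolling factor itself changes the result, the loop can be pulled out into a stand-alone check. The following is only a minimal sketch, not the OpenCV code: `accumulate` and `unrollFactorsAgree` are hypothetical names, `std::vector` stands in for the AutoBuffer, and the input values are small integers chosen so that every product and sum is exactly representable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-alone version of the loop: d_buf[j] += WT(b_data[j]) * al.
// `unroll` selects the manual unrolling factor (4, 2, or anything else for none).
template <typename T, typename WT>
void accumulate(std::vector<WT>& d_buf, const std::vector<T>& b_data, WT al, int unroll)
{
    assert(d_buf.size() == b_data.size());
    const std::ptrdiff_t m = static_cast<std::ptrdiff_t>(d_buf.size());
    std::ptrdiff_t j = 0;
    if (unroll == 4)
    {
        for (; j <= m - 4; j += 4)
        {
            WT t0 = d_buf[j] + WT(b_data[j]) * al;
            WT t1 = d_buf[j + 1] + WT(b_data[j + 1]) * al;
            d_buf[j] = t0;
            d_buf[j + 1] = t1;
            t0 = d_buf[j + 2] + WT(b_data[j + 2]) * al;
            t1 = d_buf[j + 3] + WT(b_data[j + 3]) * al;
            d_buf[j + 2] = t0;
            d_buf[j + 3] = t1;
        }
    }
    else if (unroll == 2)
    {
        for (; j <= m - 2; j += 2)
        {
            WT t0 = d_buf[j] + WT(b_data[j]) * al;
            WT t1 = d_buf[j + 1] + WT(b_data[j + 1]) * al;
            d_buf[j] = t0;
            d_buf[j + 1] = t1;
        }
    }
    // Scalar tail; this is also the whole loop when no unrolling is requested.
    for (; j < m; j++)
        d_buf[j] += WT(b_data[j]) * al;
}

// Runs the three variants on the same input (b[j] = j + 1, zeroed destination)
// and reports whether they produce identical output.
inline bool unrollFactorsAgree(std::size_t m, double al)
{
    std::vector<float> b(m);
    for (std::size_t j = 0; j < m; j++)
        b[j] = static_cast<float>(j + 1);
    std::vector<double> d1(m, 0.0), d2(m, 0.0), d4(m, 0.0);
    accumulate(d1, b, al, 1);
    accumulate(d2, b, al, 2);
    accumulate(d4, b, al, 4);
    return d1 == d2 && d1 == d4;
}
```

If the variants disagree on the failing platform with the real test data, the compiler's code generation for the 4x body is the more likely suspect; if they agree, the difference probably comes from somewhere outside this loop.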
Are there some aliased pointers on the same memory for reading and writing? (inplace processing)
Probably not.

d_buf is an AutoBuffer allocated in the scope outside the loop. Every element is set to zero just before entering the problematic loop.

b_data is a bit more complicated. It is a parameter of the function, and for test case 14 it is a buffer copied from the original source B mat.

Test is now green with this patch merged: https://github.com/opencv/ci-gha-workflow/actions/runs/8750864630/job/24015266117?pr=171
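The aliasing question above could also be checked at run time rather than by reading the code. A minimal sketch, assuming the two buffers are available as raw pointers with known byte sizes; `rangesOverlap` is a hypothetical helper, not an OpenCV function:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if the byte ranges [a, a + aBytes) and
// [b, b + bBytes) share at least one byte. Empty ranges never overlap.
inline bool rangesOverlap(const void* a, std::size_t aBytes,
                          const void* b, std::size_t bBytes)
{
    if (aBytes == 0 || bBytes == 0)
        return false;
    // Compare addresses as integers; relational comparison of unrelated
    // raw pointers is not reliable in C++.
    const std::uintptr_t a0 = reinterpret_cast<std::uintptr_t>(a);
    const std::uintptr_t b0 = reinterpret_cast<std::uintptr_t>(b);
    return a0 < b0 + bBytes && b0 < a0 + aBytes;
}
```

Called with the d_buf and b_data pointers and their sizes right before the loop, a `false` result would back up the "probably not" answer for that test case.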
Resolves https://github.com/opencv/opencv/issues/25302
Reproducer: https://github.com/opencv/ci-gha-workflow/actions/runs/8747714722/job/24006610667?pr=171#step:12:1041
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request