orzzzjq / Parallel-Banding-Algorithm-plus

Compute the exact Euclidean Distance Transform and Voronoi Diagram for 2D and 3D binary images using the GPU.
https://www.comp.nus.edu.sg/~tants/pba.html
MIT License
73 stars 8 forks

The result for the first row is incorrect #7

Closed: y123l closed this issue 2 months ago

y123l commented 2 months ago

The sample image is 64 x 64, and the pixels are set as follows: `for (int i = 0; i < pixels; i++) { if (i % 7) pSrcHost[i] = 0x01; else pSrcHost[i] = 0x00; }`

image

y123l commented 2 months ago

Continuing from the previous comment: pixels with a value of 0 are set as sites. The left side of the image is the correct Voronoi result, and the right side is the PBA result. The data is stored as short values. For example, counting from 0, the correct Voronoi data for the 3rd pixel is (5, 2) [squared Euclidean distance 8], but the PBA result is (0, 0) [squared Euclidean distance 9]. What is the cause of this error? Looking forward to your reply, thanks!
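The expected value quoted above can be reproduced with a brute-force scan on the CPU. The following is only a minimal sketch for cross-checking, not part of PBA: the names `Nearest`, `makeSampleImage` and `nearestSite` are hypothetical, and the image is assumed to be a tightly packed row-major 8-bit buffer filled with the `i % 7` pattern from the first comment.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical result type: nearest site coordinates and squared distance.
struct Nearest { int x, y, dist2; };

// Rebuilds the sample pattern from the report: every 7th pixel is 0 (a site).
std::vector<unsigned char> makeSampleImage(int width, int height) {
    std::vector<unsigned char> img(static_cast<std::size_t>(width) * height);
    for (std::size_t i = 0; i < img.size(); ++i)
        img[i] = (i % 7) ? 0x01 : 0x00;
    return img;
}

// Brute-force reference: scan every site and keep the closest one.
// O(width * height) per query, so only suitable for small test images.
Nearest nearestSite(const std::vector<unsigned char>& img,
                    int width, int height, int px, int py) {
    assert(img.size() == static_cast<std::size_t>(width) * height);
    Nearest best{-1, -1, INT_MAX};
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (img[static_cast<std::size_t>(y) * width + x] != 0)
                continue;  // not a site
            int dx = x - px, dy = y - py;
            int d2 = dx * dx + dy * dy;
            if (d2 < best.dist2)
                best = {x, y, d2};
        }
    }
    return best;
}
```

Calling `nearestSite(makeSampleImage(64, 64), 64, 64, 3, 0)` gives site (5, 2) with squared distance 8, which matches the "correct" value in the report; the site (0, 0) is at squared distance 9.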

orzzzjq commented 2 months ago

I cannot really understand the picture. A size of 64 x 64 is probably too small for PBA. Could you let me know the parameters you used? https://github.com/orzzzjq/Parallel-Banding-Algorithm-plus/blob/2f58720302c3478d4367b21268dfbb263b46c4d2/pba-plus-2D/main.cpp#L49-L55

y123l commented 2 months ago

Thanks! The above problem is solved, but I ran into a new one. When I set the width and height to 2048, with phase1Band = phase2Band = 32 and phase3Band = 2, the program gets stuck in an infinite loop in the function `kernelColor`:

    while (last2.y >= 0) {
        dx = last1.x - tx; dy = last2.y - ty;
        dist = __mul24(dx, dx) + __mul24(dy, dy);

        if (dist > best)
            break;

        best = dist; lasty = last2.y; last2 = last1;

        if (last2.y >= 0)
            last1 = input[TOID(tx, last2.y, size)];
    }

orzzzjq commented 2 months ago

2048 is the size used in the default example, which should produce a correct answer. Please let me know how you initialize your input: https://github.com/orzzzjq/Parallel-Banding-Algorithm-plus/blob/2f58720302c3478d4367b21268dfbb263b46c4d2/pba-plus-2D/main.cpp#L103-L104

y123l commented 2 months ago

    // Writes (x, y) for site pixels and (MARKER, MARKER) for all other pixels.
    // Note: nSrcStep is used as an element stride here, not a byte pitch.
    template <typename T>
    __global__ void transformKernel(T* pSrc, int nSrcStep, T nMinSiteValue, T nMaxSiteValue,
                                    short2* input, int width, int height)
    {
        int x = blockIdx.x * blockDim.x + threadIdx.x;
        int y = blockIdx.y * blockDim.y + threadIdx.y;

        if (x < width && y < height) {
            T value = pSrc[y * nSrcStep + x];
            short2 markerPair = make_short2(MARKER, MARKER);

            if (value >= nMinSiteValue && value <= nMaxSiteValue) {
                input[y * width + x] = make_short2(x, y);
            } else {
                input[y * width + x] = markerPair;
            }
        }
    }

    // Launches transformKernel with one thread per pixel on the given stream.
    template <typename T>
    void transform2Dtexture(T* pSrc, int nSrcStep, T nMinSiteValue, T nMaxSiteValue,
                            short2* input, int width, int height, cudaStream_t stream)
    {
        dim3 blockSize(16, 16);
        dim3 gridSize((width + blockSize.x - 1) / blockSize.x,
                      (height + blockSize.y - 1) / blockSize.y);

        transformKernel<<<gridSize, blockSize, 0, stream>>>(pSrc, nSrcStep, nMinSiteValue,
                                                            nMaxSiteValue, input, width, height);
    }

y123l commented 2 months ago

To be more specific, the program gets stuck in an infinite loop when width >= 256, and it runs normally when width <= 128 (phase1Band = phase2Band = width/64; phase3Band = 2).

orzzzjq commented 2 months ago

Can you try initializing your input on the CPU to see if there is something wrong with the input? It is difficult for me to understand the code without any comments.
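One way to follow this suggestion is to build the input on the host with the same site test as the posted kernel, then check it before handing it to PBA. This is a rough sketch under assumptions, not the repository's code: `buildInputOnCpu`, `inputLooksValid` and `kMarker` are hypothetical names, the marker value -32768 is assumed and should be checked against the `MARKER` definition in your copy of the PBA headers, and the input is assumed to be two interleaved shorts (x, y) per pixel, which is the memory layout of a `short2` array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed sentinel for "no site here"; verify against MARKER in the PBA headers.
constexpr short kMarker = -32768;

// Fills the input on the CPU: (x, y) for site pixels, (kMarker, kMarker) otherwise.
// srcStep is an element stride, as in the posted kernel.
template <typename T>
std::vector<short> buildInputOnCpu(const T* src, int srcStep, T minSite, T maxSite,
                                   int width, int height) {
    std::vector<short> input(2 * static_cast<std::size_t>(width) * height);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            T value = src[static_cast<std::size_t>(y) * srcStep + x];
            bool isSite = value >= minSite && value <= maxSite;
            std::size_t id = static_cast<std::size_t>(y) * width + x;
            input[2 * id]     = isSite ? static_cast<short>(x) : kMarker;
            input[2 * id + 1] = isSite ? static_cast<short>(y) : kMarker;
        }
    }
    return input;
}

// Sanity check: each entry must be either its own coordinates or the marker pair.
// Anything else (for example, uninitialized memory) is reported as invalid.
bool inputLooksValid(const std::vector<short>& input, int width, int height) {
    if (input.size() != 2 * static_cast<std::size_t>(width) * height)
        return false;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            std::size_t id = static_cast<std::size_t>(y) * width + x;
            short ix = input[2 * id], iy = input[2 * id + 1];
            bool isSelf = (ix == x && iy == y);
            bool isMarker = (ix == kMarker && iy == kMarker);
            if (!isSelf && !isMarker)
                return false;
        }
    }
    return true;
}
```

The same `inputLooksValid` check can also be run on a buffer copied back from the GPU after `transformKernel`, which helps separate a bad input from a problem inside PBA itself.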

y123l commented 2 months ago

solved! many thanks!

orzzzjq commented 2 months ago

Welcome :)