kunzmi / ImageStackAlignator

Implementation of Google's Handheld Multi-Frame Super-Resolution algorithm (from Pixel 3 and Pixel 4 camera)
GNU General Public License v3.0

How does the super resolution work? #29

Open iKaHibi opened 2 years ago

iKaHibi commented 2 years ago

Hello Michael. I have been reading your code recently and am confused about how the super-resolution works. In DeBayerKernels.cu, lines 398 to 402 read:

float posX = ((float)x + 0.5f + dimX / 2) / 2.0f / dimX;
float posY = ((float)y + 0.5f + dimY / 2) / 2.0f / dimY;

float4 kernel = tex2D<float4>(kernelParam, posX, posY);// *(((const float3*)((const char*)kernelParam + (y / 2 + dimY / 4) * strideKernelParam)) + (x / 2 + dimX / 4));
float2 shift = tex2D<float2>(shifts, posX, posY);// *(((const float2*)((const char*)shifts + (y / 2 + dimY / 4) * strideShift)) + (x / 2 + dimX / 4));

I think this will result in pixels within a 2×2 neighbourhood of the final high-resolution image getting the same kernel and shift.
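To make that concrete (my own arithmetic, assuming the kernel/shift texture has width dimX, which seems to match the commented-out indexing x / 2 + dimX / 4): with dimX = 8, x = 0 gives posX = (0.5 + 4) / 16 = 0.28125 and x = 1 gives posX = (1.5 + 4) / 16 = 0.34375. Multiplied by the texture width these are texel coordinates 2.25 and 2.75, which both fall inside texel 2, so with point sampling both output pixels would fetch the same texel.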

Similarly, lines 414 to 423 read:

int ppsx = x + px + sx + dimX / 2;
int ppsy = y + py + sy + dimY / 2;
int ppx = x + px + dimX / 2;
int ppy = y + py + dimY / 2;

ppsx = min(max(ppsx/2, 0 + dimX / 4), dimX/2 - 1 + dimX / 4);
ppsy = min(max(ppsy/2, 0 + dimY / 4), dimY/2 - 1 + dimY / 4);

ppx = min(max(ppx / 2, 0 + dimX / 4), dimX / 2 - 1 + dimX / 4);
ppy = min(max(ppy / 2, 0 + dimY / 4), dimY / 2 - 1 + dimY / 4);

This makes four pairs of (x, y) produce the same (ppsx, ppsy) and (ppx, ppy). I think this will result in four pixels of the final image pointing to the same pixel in the input low-resolution image.

Since four pixels in a 2×2 neighbourhood get the same kernel, shift, uncertainty mask, and raw image data, I think they will end up with the same value in the final image.

What am I missing in your code that achieves the higher resolution?

Besides, it seems the values of the 4th channel of the variable _structureTensor4, which is instantiated at line 1839 of ImageStackAlignatorController.cs, are always zero. What is the difference between _structureTensor4 and _structureTensor?

kunzmi commented 2 years ago

Hi,

the point is that we use GPU texture interpolation to obtain interpolated values for kernel and shift, so they are not the same values for a 2×2 pixel area.
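For illustration, here is a minimal, self-contained CUDA sketch (my own demo code, not code from this repository): the texture size, the ramp contents and the sampleParams kernel are made up for the demonstration, only the posX/posY formula is copied from the quoted lines. With cudaFilterModeLinear and normalized coordinates, adjacent output pixels receive different interpolated values even though they fall between the same texels.

#include <cstdio>
#include <cuda_runtime.h>

// Sample a parameter texture for every pixel of the full-resolution output,
// using the same normalized-coordinate formula as the quoted DeBayerKernels.cu lines.
__global__ void sampleParams(cudaTextureObject_t tex, int dimX, int dimY, float* out)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= dimX || y >= dimY) return;

    float posX = ((float)x + 0.5f + dimX / 2) / 2.0f / dimX;
    float posY = ((float)y + 0.5f + dimY / 2) / 2.0f / dimY;

    // With linear filtering the hardware blends the surrounding texels,
    // so adjacent (x, y) produce different values.
    out[y * dimX + x] = tex2D<float>(tex, posX, posY);
}

int main()
{
    const int dimX = 8, dimY = 8;          // full-resolution output size (assumption)
    const int texW = dimX, texH = dimY;    // parameter texture size (assumption)

    // Fill the parameter texture with a simple ramp so interpolation is visible.
    float h_tex[texW * texH];
    for (int i = 0; i < texW * texH; ++i) h_tex[i] = (float)(i % texW);

    cudaArray_t arr;
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaMallocArray(&arr, &desc, texW, texH);
    cudaMemcpy2DToArray(arr, 0, 0, h_tex, texW * sizeof(float),
                        texW * sizeof(float), texH, cudaMemcpyHostToDevice);

    cudaResourceDesc resDesc = {};
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = arr;

    cudaTextureDesc texDesc = {};
    texDesc.addressMode[0] = cudaAddressModeClamp;
    texDesc.addressMode[1] = cudaAddressModeClamp;
    texDesc.filterMode = cudaFilterModeLinear;   // hardware interpolation
    texDesc.readMode = cudaReadModeElementType;
    texDesc.normalizedCoords = 1;                // posX/posY are in [0, 1]

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr);

    float* d_out;
    cudaMalloc(&d_out, dimX * dimY * sizeof(float));
    dim3 block(8, 8), grid(1, 1);
    sampleParams<<<grid, block>>>(tex, dimX, dimY, d_out);

    float h_out[dimX * dimY];
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    for (int x = 0; x < 4; ++x)
        printf("out(%d,0) = %.3f\n", x, h_out[x]); // values differ from pixel to pixel

    cudaDestroyTextureObject(tex);
    cudaFreeArray(arr);
    cudaFree(d_out);
    return 0;
}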

This is also the reason for using _structureTensor4 in the super-resolution kernel variant: texture interpolation is not possible for the float3 datatype, so we use a float4 and ignore the fourth component.
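As a sketch of what that means in CUDA (assumed function and variable names, not the project's exact code): texture fetches only exist for 1-, 2- and 4-component vector types, so the three structure-tensor channels are packed into a float4 texture and the fourth channel is simply never read.

// Hedged sketch (assumed names): reading the structure tensor from a float4
// texture. tex2D<float3> does not exist, so the data is stored as float4 and
// the .w channel (always zero here) is ignored.
__device__ void fetchStructureTensor(cudaTextureObject_t structureTensor4,
                                     float posX, float posY,
                                     float& ixx, float& ixy, float& iyy)
{
    float4 t = tex2D<float4>(structureTensor4, posX, posY); // interpolated fetch
    ixx = t.x;  // assumed layout: x/y/z carry the tensor entries
    ixy = t.y;
    iyy = t.z;  // t.w is unused
}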

Cheers, Michael