openpaperwork / libpillowfight

Small library containing various image processing algorithms (+ Python 3 bindings) that has almost no dependencies -- Moved to Gnome's Gitlab
https://gitlab.gnome.org/World/OpenPaperwork/libpillowfight
62 stars 13 forks source link

Test failures on alternate architectures #11

Open QuLogic opened 6 years ago

QuLogic commented 6 years ago

As noted in #2, tests fail on alternate arches. Though #10 has fixed some of them, there are still a few problems which I don't really know how to fix as I'm not really familiar with the algorithms. Below are the results and diffs that might be clearer to someone with more idea of what the algorithms are doing to figure out what's wrong.

On i686, test_swt2 fails: i686-swt2 with this diff (might need to zoom in to see): i686-swt2-failed-diff

On ppc64le, test_swt fails: ppc64le-swt with this diff: ppc64le-swt-failed-diff

On ppc64le, test_swt2 fails: ppc64le-swt2 with this diff: ppc64le-swt2-failed-diff

On ppc64le, test_canny fails: ppc64le-canny with this diff: ppc64le-canny-failed-diff

QuLogic commented 6 years ago

Since swt has some debug infrastructure, I toggled OUTPUT_INTERMEDIATE_IMGS flag. On x86_64 (which passes), the output is:

SWT> 358084 rays found
SWT> 6410 possible letters found
SWT> Filtering 1: 6402 letters found
SWT> Filtering 2: 6349 letters found
SWT> 8898 valid pairs found
SWT> 2670 chains after merges
SWT> 9350 letters rendered

but on 32-bit, the output is:

SWT> 358084 rays found
SWT> 6410 possible letters found
SWT> Filtering 1: 6402 letters found
SWT> Filtering 2: 6349 letters found
SWT> 8898 valid pairs found
SWT> 2670 chains after merges
SWT> 9352 letters rendered

The swt_0003_gaussian, swt_0004_sobel_intensity, and swt_0005_sobel_direction differ, though the first two are barely visible. The diff on the last one is: swt_0005_sobel_direction-failed-diff But then the remaining intermediate images somehow match.

QuLogic commented 6 years ago

On ppc64le, the output for test_swt2 is:

SWT> 346931 rays found
SWT> 8777 possible letters found
SWT> Filtering 1: 8761 letters found
SWT> Filtering 2: 8677 letters found
SWT> 9489 valid pairs found
SWT> 2991 chains after merges
SWT> 9916 letters rendered

which is quite a bit different from either x86 attempt. This one fails starting from the swt_0002_canny file, but I guess that makes sense since test_canny also fails. swt_0002_canny-failed-diff but then barely differs after the Gaussian: swt_0003_gaussian-failed-diff or sobel intensity: swt_0004_sobel_intensity-failed-diff but is quite different after sobel direction: swt_0005_sobel_direction-failed-diff and swt contains many extra rays: swt_0006_swt-failed-diff and after the median too: swt_0007_ray_median-failed-diff The rest are basically a combination of the canny diff and the extra rays.

QuLogic commented 6 years ago

So it seems like the problems on ppc64le stem all the way back to the Gaussian convolution. For example, one point in an area that's all 253 becomes 252.9999999999999716 after the x convolution, then 252.9999999999999432 after the y convolution. Then going through the Sobel filter, some results are 0.0000000000000568 and some are -0.0000000000001705 (instead of 0) because those previous values were not quite right. Then the direction image produces 3pi/4 instead of 0 because of the slightly negative x value input to atan2 (which is where those large swaths of differences in that image arise).

I don't really understand why this only affects ppc64le that way, especially because there's also a ppc64 (big-endian) which works fine.

jflesch commented 6 years ago

It seems to be related to the use (or not) of SSE instructions: https://stackoverflow.com/questions/14749929/c-float-operations-have-different-results-on-i386-and-arm-but-why It did a quick test: Enabling explicitly SSE2 instructions makes the results consistent between i386 and amd64.

I assume those instructions do not exist at all on ppc64, so we will always have slight differences compared to i386/amd64. Therefore I'm not sure if it makes sense to run such strict tests on this kind of platform .. :/

jflesch commented 6 years ago

Fix for i386 : ed348171aaa9c4829bd9cd2de92c1fd9fa5cc90a

BTW, thanks for your investigation. Without it, it would have taken me a while to realize that the difference between i386 and amd64 came from floating point numbers.

QuLogic commented 6 years ago

PPC64 does not have SSE2; it has AltiVec. Still doesn't really explain why little-endian fails but not big-endian.

Unfortunately, Fedora supports i686 for 32-bit systems, which means no SSE2 support; I suppose I'll have to skip that test.