Closed: JunShern closed this issue 6 years ago.
I had the same issue @JunShern .
The idea is that the CompareOutputs(some params) function in include/puzzler/puzzles/gaussian_blur.hpp contains a std::abs(r-g) > 2 check, and this is the correctness check used by bin/compare_puzzle_output.
I think that as the scale increases, this check becomes less and less useful, since the exponential causes coeff() to decrease a lot, so the outputs always pass the std::abs check above.
How do we know if our solution is correct, @m8pple, if such radical changes do not affect correctness?
I also changed the correctness measure from std::abs(r-g) > 2 to std::abs(r-g) > 0 (although I am aware we are not allowed to modify anything in the include folder; this was just for checking purposes), and I still noticed that incorrect code passed the output correctness check.
I don't get this problem (nor do I expect it to happen).
If I add acc=0
as suggested, then I get:
$ make serenity_now_gaussian_blur
mkdir -p w
bin/run_puzzle gaussian_blur 100 2
LogLevel = 2 -> 2
[run_puzzle], 1511029481.94, 2, Created log.
[run_puzzle], 1511029481.94, 2, Creating random input
[run_puzzle], 1511029481.94, 2, Executing puzzle
[run_puzzle], 1511029482.13, 2, Executing reference
[run_puzzle], 1511029484.76, 2, Checking output
[run_puzzle], 1511029484.76, 0, Output is not correct.
make: *** [makefile:30: serenity_now_gaussian_blur] Error 1
Taking it back out, it is correct again.
I ran it a few times in both modes to get different test images, and it behaves the same way in both cases.
If I do a scale of 100 and print out (x,y) -> acc, sum(coeff), then I get things looking like:
[execute_puzzle], 1511029920.66, 2, (9,91) -> 35.477, 0.999972
[execute_puzzle], 1511029920.66, 2, (9,92) -> 32.0441, 0.999826
[execute_puzzle], 1511029920.66, 2, (9,93) -> 28.5868, 0.999042
[execute_puzzle], 1511029920.66, 2, (9,94) -> 24.9651, 0.995683
[execute_puzzle], 1511029920.66, 2, (9,95) -> 21.0596, 0.98418
[execute_puzzle], 1511029920.66, 2, (9,96) -> 16.8982, 0.952686
[execute_puzzle], 1511029920.66, 2, (9,97) -> 12.6992, 0.883751
[execute_puzzle], 1511029920.66, 2, (9,98) -> 8.80583, 0.763122
[execute_puzzle], 1511029920.66, 2, (9,99) -> 5.55, 0.594367
[execute_puzzle], 1511029920.66, 2, (10,0) -> 35.1927, 0.594369
[execute_puzzle], 1511029920.66, 2, (10,1) -> 45.4544, 0.763124
[execute_puzzle], 1511029920.66, 2, (10,2) -> 52.851, 0.883753
[execute_puzzle], 1511029920.66, 2, (10,3) -> 57.0297, 0.952688
[execute_puzzle], 1511029920.66, 2, (10,4) -> 58.7645, 0.984183
[execute_puzzle], 1511029920.67, 2, (10,5) -> 59.0889, 0.995686
[execute_puzzle], 1511029920.67, 2, (10,6) -> 58.6992, 0.999045
[execute_puzzle], 1511029920.67, 2, (10,7) -> 57.8574, 0.999829
[execute_puzzle], 1511029920.67, 2, (10,8) -> 56.5617, 0.999975
[execute_puzzle], 1511029920.67, 2, (10,9) -> 54.7454, 0.999997
[execute_puzzle], 1511029920.67, 2, (10,10) -> 52.4031, 0.999999
[execute_puzzle], 1511029920.67, 2, (10,11) -> 49.6283, 1
[execute_puzzle], 1511029920.67, 2, (10,12) -> 46.5782, 1
[execute_puzzle], 1511029920.67, 2, (10,13) -> 43.4142, 1
[execute_puzzle], 1511029920.67, 2, (10,14) -> 40.2494, 1
[execute_puzzle], 1511029920.67, 2, (10,15) -> 37.0975, 1
[execute_puzzle], 1511029920.67, 2, (10,16) -> 33.8535, 1
[execute_puzzle], 1511029920.67, 2, (10,17) -> 30.3817, 1
[execute_puzzle], 1511029920.67, 2, (10,18) -> 26.6544, 1
[execute_puzzle], 1511029920.67, 2, (10,19) -> 22.7754, 1
[execute_puzzle], 1511029920.67, 2, (10,20) -> 18.8785, 1
[execute_puzzle], 1511029920.67, 2, (10,21) -> 15.0712, 1
[execute_puzzle], 1511029920.67, 2, (10,22) -> 11.4817, 1
[execute_puzzle], 1511029920.67, 2, (10,23) -> 8.2857, 1
[execute_puzzle], 1511029920.67, 2, (10,24) -> 5.64445, 1
[execute_puzzle], 1511029920.67, 2, (10,25) -> 3.63062, 1
[execute_puzzle], 1511029920.67, 2, (10,26) -> 2.21258, 1
[execute_puzzle], 1511029920.67, 2, (10,27) -> 1.28635, 1
So setting acc=0
will definitely fail.
A few things to check:
Are you definitely synced with the master spec? The coefficients in the original
release would decrease exponentially to zero (which would affect things at
large scales), but the fix from last weekend should stop that. The coefficients
will sum to 1 in the interior of the image, so setting acc=0
should not result in a pass.
Does it keep happening if you try running it at different times? The generated image is time-dependent (to within a second), so one run might generate values close to zero while the next would not (though I thought it was designed to include noise).
Have you tried printing the values of the pixels?
If you force the input image to have large values (e.g. by changing t=std::max(0.0f, std::min(1.0f, t)); to t=255.0;), do you still get the same effect?
Ohh yes it was an oversight on my part, I missed the latest few changes you made on the main branch. In particular, it seems that the commit fixing #10 solves the problem. Sorry for the trouble and thanks for the help!
@malharjajoo try merging in the commit from here, that should solve the problem.
So I am using
to run and test the correctness of my modified version of the gaussian_blur execute function. For this, I have created an ExecuteV2 function which is selected through an environment variable.
As a sanity check, I set the acc variable to zero right before the output line, expecting to see it fail the comparison test. To my surprise, it passes the test!
I tried setting acc to a different value, for example 1234, which DOES fail the test as expected. Is there a mistake somewhere, or did I accidentally stumble upon the secret trick to this puzzle?