StochasticAnalytics / emClarity

GNU Lesser General Public License v3.0

Tests and updates #112

Closed thomasfrosio closed 4 years ago

thomasfrosio commented 4 years ago

Testing framework

Almost every EMC function now comes with a unit test, and some also have test fixtures. As explained below, some unit tests have very specific and robust checks, which is usually enough to release a function. On the other hand, some functions are complicated to check (e.g. checking that the taper was correctly applied to an image), like EMC_resize or EMC_coordGrids, and these really benefit from fixture tests. The idea is to progressively add more tests, which will come naturally while developing the functions.

| Function | Unit-test | Fixture | Status:Priority |
| --- | --- | --- | --- |
| masking/EMC_resize.m | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| masking/EMC_maskReference.m | :x: | :x: | :x: 1 |
| masking/EMC_maskShape.m | :heavy_check_mark: | :x: | :heavy_check_mark: |
| masking/EMC_limits.m | :heavy_check_mark: | :x: | :heavy_check_mark: |
| masking/EMC_taper.m | :heavy_check_mark: | :x: | :heavy_check_mark: |
| masking/EMC_getBandpass.m | :heavy_check_mark: | :x: | :heavy_check_mark: |
| masking/EMC_applyBandpass.m | :heavy_check_mark: | :x: | :heavy_check_mark: |
| masking/EMC_gaussianKernel.m | :heavy_check_mark: | :x: | :heavy_check_mark: |
| masking/EMC_centerOfMass.m | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| coordinates/EMC_coordVectors.m | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| coordinates/EMC_coordGrids.m | :heavy_check_mark: | :x: | :heavy_check_mark: |
| coordinates/EMC_coordTransform.m | :x: | :x: | :x: 4 |
| testScripts/EMC_getClass.m | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| testScripts/EMC_getOption.m | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| testScripts/EMC_is3d.m | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| testScripts/EMC_isOnGpu.m | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| testScripts/EMC_setMethod.m | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| testScripts/EMC_setPrecision.m | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| testScripts/EMC_shareMethod.m | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| testScripts/EMC_sharePrecision.m | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| testScripts/EMC_convn.m | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| testScripts/EMC_rfftn.m | :x: | :x: | :x: 2 |
| testScripts/EMC_irfftn.m | :x: | :x: | :x: 3 |

The main idea:

This framework is a fast and robust way to implement tests for EMC functions. It saves test inputs, outputs and results into a mat file (one per test function). It also allows debugging (a test run can pause when a test fails) and is integrated into the MATLAB testing framework. This pull request comes with 3 types of files:


The workflow

A. Create a test script in testScripts/EMC_test, with a name starting with test, containing the following function:

% This will grab every function starting with 'test' in this file and run them.
% It also makes this script an 'official' test for MATLAB, so it can be run with
% the 'runtests' function.
function tests = test_<nameOfTheFunctionToTest>
    tests = functiontests(localfunctions);
end

B. (Optional) Create a function called setupOnce. This function is run only once. It is usually where common parameters are initialized.

% This is the setup for every test in this file. This can be defined within the
% testing function, but as every test (usually) uses the same functionToTest and
% evaluation, it is useful to define common parameters beforehand.
function setupOnce(testCase)
    % example of function
    testCase.TestData.functionToTest = @EMC_gaussianKernel;
    % to check the validity of the output
    testCase.TestData.evaluateOutput = @evaluateOutput;
end
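
For readers less familiar with MATLAB's function-based tests, steps A and B can be sketched as one complete file. This is only an illustration and not part of the pull request: the file name test_example, the test name test_sumIsOne and the TestData field tolerance are made up, and the check uses the built-in verifyEqual rather than EMC_runTest.

```matlab
% Hypothetical file test_example.m (all names here are illustrative).
function tests = test_example
    % Collect the local test, setup and teardown functions of this file.
    tests = functiontests(localfunctions);
end

function setupOnce(testCase)
    % Runs once, before any test of this file.
    testCase.TestData.tolerance = 1e-5;
end

function test_sumIsOne(testCase)
    % A normalized 3x3 box kernel: the weights should sum to 1.
    kernel = ones(3) / 9;
    verifyEqual(testCase, sum(kernel, 'all'), 1, ...
                'AbsTol', testCase.TestData.tolerance);
end
```

Saved as test_example.m on the MATLAB path, it can be run with results = runtests('test_example');.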

C. Create the evaluation function. This function checks the validity of the output and is only run if the functionToTest did NOT raise an error. As explained in more detail later, when the functionToTest gives an output, EMC_runTest needs to know whether or not this output is correct. As such, it will send the output(s) to the evaluation function, which will tell whether or not the test is successful. Long story short: testCase.TestData.evaluateOutput should contain the handle to the output evaluation function.

function [result, message] = evaluateOutput(input1, ..., outputCellToCheck, extra)
    % You can define any check you want here... The only thing that matters is
    % that 'result' and 'message' are defined.
    result = 'passed';
    message = '';

    % result (str):             Whether or not the output passes the test.
    % message (str):            For logging purposes. This message is displayed
    %                           if result is not 'passed'.
    % outputCellToCheck (cell): 1xn cell array containing the output to check.
    % extra (anything):         This variable is a placeholder to send anything
    %                           you want (fixture, expected results, etc.) from
    %                           the test function to the evaluation.
    %                           This allows more freedom when designing tests.
end

Here is an example for the function EMC_gaussianKernel:

% KERNEL = EMC_gaussianKernel(SIZE, SIGMA, OPTION)
function [result, message] = evaluateOutput(SIZE, SIGMA, OPTION, ...
                                            outputCellToCheck, extra)
    result = 'failed';

    % check that the function returns only one output
    if length(outputCellToCheck) ~= 1
        message = 'only one output is expected';
        return
    else
        kernel = outputCellToCheck{1};
    end

    % check that the size of the output kernel
    % is equal to the desired size
    if ~isequal(SIZE, size(kernel))
        message = 'kernel size is not equal to desired size';
        return
    end

    % check the precision and method of the kernel,
    % which can be defined via the OPTION cell
    [actualMethod, actualPrecision] = help_getClass(kernel);
    if help_isOptionDefined(OPTION, 'precision')
        expectedPrecision = help_getOptionParam(OPTION, 'precision');
    else
        expectedPrecision = 'single';  % default
    end
    if help_isOptionDefined(OPTION, 'method')
        expectedMethod = help_getOptionParam(OPTION, 'method');
    else
        expectedMethod = 'cpu';  % default
    end
    if ~strcmp(actualMethod, expectedMethod)
        message = sprintf('expected output method = %s, got %s', ...
                          expectedMethod, actualMethod);
        return
    elseif ~strcmp(actualPrecision, expectedPrecision)
        message = sprintf('expected output precision = %s, got %s', ...
                          expectedPrecision, actualPrecision);
        return
    end

    % the kernel is normalized so that the sum of the weights is 1.
    if abs(sum(kernel, 'all') - 1) > 1e-5
        message = sprintf('expected sum = 1, got %g', sum(kernel, 'all'));
        return
    end

    % finally, if you know exactly what the output kernel should be,
    % you can pass it via 'extra' and check that the output is equal to it
    if ~isequal(extra, false)
        if any(abs(kernel - extra) > 1e-5, 'all')
            message = 'output kernel is not equal to expected kernel';
            return
        end
    end

    % if the output satisfies your checks then:
    result = 'passed';  % can be 'passed', 'warning', 'failed'
    message = '';
end

D. Now it is time to actually write a test. In the test script, create one function (or as many as you want) called test_<nameOfTheTest>. This function has to define the function to test (or rely on setupOnce, as in the example above) and the inputs that you want to test. All this information should be stored in testCase.TestData (see example below). Finally, the test function should call EMC_runTest(testCase). EMC_runTest will then sequentially run every set of inputs you defined. For each set of inputs:

% extra (anything): See above for more details.
%                   This is sent to the evaluation function.

% dimensions: cell(nTests, (nInputs + 2))
% format: {input1A, input1B, input1C, ..., expectedError, extra;
%          input2A, input2B, input2C, ..., expectedError, extra;
%          ...
%          inputNA, inputNB, inputNC, ..., expectedError, extra};
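
As a concrete reading of this layout, here is a hand-written cell for a function with 3 inputs (SIZE, SIGMA, OPTION). It is a sketch only: the input values, the {'precision', 'double'} form of OPTION and the error identifier 'EMC:SIGMA' are assumptions made for illustration, not values taken from the pull request.

```matlab
% Each row is one test: {SIZE, SIGMA, OPTION, expectedError, extra}.
% expectedError = false means "no error expected"; extra = false means "nothing extra".
% NOTE: the values and the identifier 'EMC:SIGMA' are hypothetical.
toTest = {[5, 5], 1.0,    {},                      false,       false; ...
          [7, 9], [1, 2], {'precision', 'double'}, false,       false; ...
          [5, 5], -1,     {},                      'EMC:SIGMA', false};

% The cell has one row per test and nInputs + 2 columns.
nInputs = 3;
[nTests, nColumns] = size(toTest);
fprintf('%d tests, %d columns (nInputs + 2 = %d)\n', nTests, nColumns, nInputs + 2);
```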


E. The help_* functions are here to help. If you have a function with 3 inputs, plus one OPTION cell with 3 default parameters, testing all the possible combinations is not something you wish to do manually. help_getBatch and help_getBatchOption can do this job for you by creating a cell with the same format as testCase.TestData.toTest, containing every single combination of inputs. Let's take a simple example:

```matlab
% One example of a test function for EMC_gaussianKernel.
function test_2d(testCase)
    % EMC_gaussianKernel has the following inputs (SIZE, SIGMA and OPTION).
    % SIZE [x,y]:
    sizes = help_getRandomSizes(5, [3, 20], '2d');
    % 5x1 cell with random 2d sizes between 3 and 20

    % SIGMA [float | [x,y]]:
    sigmas = [help_getRandomSizes(2, [1,5], '2d'); ...
              help_getRandomSizes(2, [1,5], '1d')];

    % OPTION:
    % This generates every possible combination (9) of optional parameters.
    option = help_getBatchOption({'precision', {'single'; 'double'}; ...
                                  'method', {'cpu'; 'gpu'}});

    % Combine all the inputs, creating every combination of the parameters.
    inputs = help_getBatch(sizes, ...    % 5 combinations
                           sigmas, ...   % 4 combinations
                           option, ...   % 9 combinations: total of 180 combinations
                           {false}, ...  % expectedError
                           {false});     % extra

    % Send these inputs to EMC_runTest for evaluation.
    testCase.TestData.toTest = inputs;
    EMC_runTest(testCase);  % test every set of inputs.
end
```
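
To make the bookkeeping concrete: the combinations that help_getBatch is described as producing amount to a Cartesian product of the input lists. The following is a minimal sketch using only the built-in ndgrid. It is not the implementation of help_getBatch, the row ordering is an assumption, and the input values are made up.

```matlab
% Hypothetical input lists (one cell element per value to try).
sizes   = {[5, 5]; [7, 9]};
sigmas  = {1; [1, 2]; 3};
options = {{}; {'precision', 'double'}};

% One index grid per input list; element k of each grid picks one combination.
[iOpt, iSig, iSize] = ndgrid(1:numel(options), 1:numel(sigmas), 1:numel(sizes));
nTests = numel(iSize);  % 2 * 3 * 2 = 12 combinations

% Fill one row per combination: {SIZE, SIGMA, OPTION, expectedError, extra}.
toTest = cell(nTests, 5);
for k = 1:nTests
    toTest(k, :) = {sizes{iSize(k)}, sigmas{iSig(k)}, options{iOpt(k)}, false, false};
end
```

The number of rows grows as the product of the list lengths, which is why generating the cell programmatically quickly becomes preferable to writing it by hand.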

F. What if the functionToTest takes an image or something that takes a lot of memory? Do you have to preallocate the input image for every test beforehand? No, you don't. EMC_runTest has a fixture feature (see help EMC_runTest) which can create fixtures on the fly. This is very useful when you define thousands of tests and you cannot preallocate every example beforehand. Here is the basic syntax to tell EMC_runTest that a parameter is a fixture.

function test_fixture(testCase)
    % let's say we want to test EMC_resize, which takes an IMAGE, LIMITS and OPTION.
    % if you want to try different images, with different sizes, different precision and methods,
    % you can use the fixture feature:

    testCase.TestData.fixtureImg = @help_getInputRand;  % this can be any function
    % help_getInputRand takes 3 inputs (method, precision and size) and returns a random image.
    methods = {'cpu'; 'gpu'};
    precisions = {'single'; 'double'};
    sizes = help_getRandomSizes(3, [100, 200], '2d');

    % EMC_runTest will call the function in testCase.TestData.fixtureImg using these
    % inputs and use the output as input for the test. Once the test is done, it deletes it.
    IMAGE = help_getBatch({'fixtureImg', methods, precisions, sizes});  % 12 examples of IMAGE

    LIMITS = {[0,0,0,0]; [10,10,10,10]; [-10,10,0,10]};
    OPTION = {{}};  % default options

    % 12x3=36 combinations
    testCase.TestData.toTest = help_getBatch(IMAGE, LIMITS, OPTION, {false}, {false});
    EMC_runTest(testCase);
end
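
The idea behind on-the-fly fixtures can be sketched independently of EMC_runTest: store a function handle together with its arguments, and only build the (potentially large) input right before it is needed. This is a simplified illustration, not how EMC_runTest is implemented; makeImage is a hypothetical stand-in for help_getInputRand and ignores the cpu/gpu method.

```matlab
% A fixture spec is a function handle plus the arguments to call it with.
% makeImage is hypothetical: it returns a random image of a given class and size.
makeImage = @(precision, sz) rand(sz, precision);
specs = {makeImage, 'single', [128, 96]; ...
         makeImage, 'double', [64, 64]};

classes = cell(size(specs, 1), 1);
for k = 1:size(specs, 1)
    img = feval(specs{k, 1}, specs{k, 2:end});  % the image is created only now
    classes{k} = class(img);                    % stand-in for running one test
    clear img                                   % released before the next spec
end
```

Only one image exists at any time, so memory use stays bounded no matter how many specs are listed.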

In practice, EMC_resize has ~10000 tests in total.

G. If you know that a given combination of inputs will result in an error, you can use the expectedError parameter to let EMC_runTest know that you expect an error. See D for more detail. These are called assumption tests.

% img = EMC_resize(img, limits, option);
testCase.TestData.toTest = help_getBatch(rand(128,128), [0,0], {}, {'EMC:limits'}, {false});
% limits should have 4 elements for a 2d image, therefore EMC_resize will raise an error here.
% By setting expectedError to 'EMC:limits', we let EMC_runTest know that this test should
% raise an error with the id: 'EMC:limits'.
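
Checking for an expected error identifier can be sketched with a plain try/catch. This is an illustration of the principle under stated assumptions, not the code of EMC_runTest: checkLimits is a hypothetical stand-in for the function under test, and only the identifier 'EMC:limits' comes from the example above.

```matlab
% checkLimits is hypothetical: it raises an error with the identifier
% 'EMC:limits' unless LIMITS has 4 elements.
checkLimits = @(limits) assert(numel(limits) == 4, 'EMC:limits', ...
                               'LIMITS should have 4 elements, got %d', numel(limits));
expectedError = 'EMC:limits';

try
    checkLimits([0, 0]);
    % Reaching this line means no error was raised although one was expected.
    result = 'failed';
    message = 'expected an error, but the function returned normally';
catch ME
    if strcmp(ME.identifier, expectedError)
        result = 'passed';
        message = '';
    else
        result = 'failed';
        message = sprintf('expected error %s, got %s', expectedError, ME.identifier);
    end
end
```

Inside a function-based test, MATLAB's built-in verifyError(testCase, @() checkLimits([0, 0]), 'EMC:limits') performs the same kind of check.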

H. Note:


Updates: