daphne-eu / daphne

DAPHNE: An Open and Extensible System Infrastructure for Integrated Data Analysis Pipelines
Apache License 2.0
67 stars 62 forks source link

Numerical precision of DaphneDSL matrix calculations #824

Open divjakm opened 2 months ago

divjakm commented 2 months ago

I have a problem with numerical precision of calculations on matrices in f64 data type. I'm trying to calculate eigenvalues of a matrix with the inbuilt eigen function, but the function complains that the input matrix is not symmetric. However, my input matrix is actually a square (auto)correlation matrix, which should always be symmetric. I tested this in Matlab and Python and I always get perfectly symmetric result.

I looked into why the matrix is not symmetric and I see that indeed there are some values which should be the same in the upper/lower triangle, but are actually different. The differences are very small, in the range of 10^-12 to 10^-20, but this is enough that the eigen function doesn't work.

I tested extensively why this difference happens on random square matrices and I see some unusual behavior when I change the size of the input matrix. For most matrices larger than 100 x 100 elements, the result is not completely symmetric. But then sometimes it is perfectly symmetric, and this seems to be related to the matrix size (N), not to the actual matrix values (which are randomly generated). For example:

N = 1000: is not symmetric, residual = 2.84217e-13 N = 2000: is symmetric, residual = 0 N = 3000: is not symmetric, residual = 7.95808e-13 N = 4000: is symmetric, residual = 0 N = 5000: is not symmetric, residual = 1.13687e-12 N = 6000: is symmetric, residual = 0 N = 7000: is not symmetric, residual = 1.36424e-12 N = 8000: is symmetric, residual = 0 N = 9000: is not symmetric, residual = 2.27374e-12 N = 10000: is symmetric, residual = 0

For some N I always get perfectly symmetric results, even if I repeat the test many times. This seems weird. Does anybody see any explanation for this behavior? My test script is this:

// Test for numerical accuracy and stability of DAPHNE operations (correlation matrix).
def testSymmetry (N: si64) -> f64
{
    Y1 = as.matrix<f64>(rand (N, N, 0.0, 1.0, 1, -1)); // input matrix with random values between 0 and 1

    // calculate autocorrelation matrix, should be symmetrical
    Ry = Y1 @ transpose (Y1);

    // test for symmetry: subtract transposed matrix, should result in all zeros
    TMP = abs (Ry - transpose(Ry));

    return aggMax(TMP); // return maximal matrix element
}

// MAIN:
for (n in 1000 : 10000 : 1000)
{
    r = testSymmetry (n);
    if (r < 1e-20)
    {
        print ("N = " + as.scalar(n) + ": is symmetric, residual = ", false);
        print (r);      // print r separately to see the full precision
    }
    else
    {
        print ("N = " + as.scalar(n) + ": is not symmetric, residual = ", false);
        print (r);      // print r separately to see the full precision
    }
}

print ("Done.");