desihub / gpu_specter

Scratch work for porting spectroperfectionism extractions to GPUs
BSD 3-Clause "New" or "Revised" License
2 stars 3 forks source link

Add unit test for extract_frame #45

Closed dmargala closed 4 years ago

dmargala commented 4 years ago

This PR adds a unit test that compares the output of gpu_specter.core.extract_frame and specter.extract.ex2d using a small fraction of a full input image/psf. The unit test checks that the 95% of the values in the pull distribution are less than 0.01:

pull = abs(f1 - f2) / sqrt(var1 + var2)
assert average(pull < 0.01) >= 0.95

The PR also adds a helper script for comparing spex and specter that is a bit more flexible than the hardcoded unittest. I also updated the testing doc.

Potential issues:

dmargala commented 4 years ago

Full frame comparison of spex/cpu with desi_extract_spectra:

 > python bin/compare-frame -a $SCRATCH/spex_haswell_mpi32_gpu0.fits -b $SCRATCH/desi_haswell_mpi32_gpu0.fits
wave (allclose): True
(f_a, f_b):
   isclose:  70.17%
(f_a - f_b)/sqrt(var_a + var_b):
     1e-05:  60.38%
    0.0001:  73.73%
     0.001:  84.80%
      0.01:  94.74%
       0.1:  99.88%
       1.0: 100.00%
(ivar_a, ivar_b):
   isclose:  81.98%
(sigma_a - sigma_b)/sqrt(var_a + var_b):
     1e-05:  84.95%
    0.0001:  92.98%
     0.001:  99.14%
      0.01:  99.99%
       0.1:  99.99%
       1.0: 100.00%
(resolution_a, resolution_b):
   isclose:  81.34%

Full frame comparison of spex/cpu with spex/gpu:

> python bin/compare-frame -a $SCRATCH/spex_haswell_mpi32_gpu0.fits -b $SCRATCH/spex_skylake_mpi4_gpu2_mps.fits
wave (allclose): True
(f_a, f_b):
   isclose:  99.96%
(f_a - f_b)/sqrt(var_a + var_b):
     1e-05:  99.96%
    0.0001: 100.00%
     0.001: 100.00%
      0.01: 100.00%
       0.1: 100.00%
       1.0: 100.00%
(ivar_a, ivar_b):
   isclose:  99.97%
(sigma_a - sigma_b)/sqrt(var_a + var_b):
     1e-05: 100.00%
    0.0001: 100.00%
     0.001: 100.00%
      0.01: 100.00%
       0.1: 100.00%
       1.0: 100.00%
(resolution_a, resolution_b):
   isclose:  98.46%

Note that the instructions for generating the files mentioned here are in the gpu_specter top-level README

sbailey commented 4 years ago

The default np.isclose options are approximately checking if they agree to single-precision, but I think we expect the gpu_specter CPU and GPU paths to agree closer to double-precision, even if differences in boundary regions and regularization result in larger disagreements with current specter. Let's make sure we understand what leads to the CPU/GPU disagreement all the way down to double-precision machine precision.

dmargala commented 4 years ago

I can update unit tests that compare cpu/gpu code paths using np.isclose/np.allclose with something like:

eps_double = np.finfo(np.float64).eps
np.allclose(result_cpu, result_gpu, rtol=alpha*eps_double, atol=0)

Ideally alpha=1 but I doubt the tests will all pass that strict test currently. I can find an order of magnitude alpha needed for each test to pass and then we can evaluate where we want to dig more on disagreement once we know those alphas.

@sbailey does that sound good?

sbailey commented 4 years ago

@dmargala sounds good, thanks. Not sure about atol=0 though. Main point is to flush out any algorithmic / configuration differences in the CPU vs. GPU path, and whether there are any gotchas like a CPU library using double precision but a GPU library using single precision without us knowing.

dmargala commented 4 years ago

atol=0 inspired by the default value for numpy.testing.assert_allclose and also this article

FWIW, it looks like numpy uses rtol=1e-11 rather than rtol=eps_double # approx 2e-19 in the numpy.linalg.eigh unit tests for double precision input array. See here and here

dmargala commented 4 years ago

So far the largest discrepancies I've seen between the cpu and gpu version stem from the eigenvalue decomposition. Here is a simplified snippet for demonstration:

np.random.seed(123)
eps_double = np.finfo(np.float64).eps
n = 250

# cpu / numpy
A = np.eye(n) + 1e-12*np.random.randn(n, n)
w, v = np.linalg.eigh(A)
Q = (v * np.sqrt(w)).dot(v.T)
s = np.sum(Q, axis=1)
R = Q/s[:, np.newaxis]

# gpu / cupy
A_gpu = cp.asarray(A)
w_gpu, v_gpu = cp.linalg.eigh(A_gpu)
Q_gpu = (v_gpu * cp.sqrt(w_gpu)).dot(v_gpu.T)
s_gpu = cp.sum(Q_gpu, axis=1)
R_gpu = Q_gpu/s_gpu[:, cp.newaxis]

# np.testing.assert_allclose(cp.asnumpy(Q_gpu), Q, rtol=eps_double)
np.testing.assert_allclose(cp.asnumpy(R_gpu), R, rtol=eps_double)

The assertion fails with:

AssertionError: 
Not equal to tolerance rtol=2.22045e-16, atol=0

Mismatched elements: 62487 / 62500 (100%)
Max absolute difference: 1.02140518e-14
Max relative difference: 23.87078597
 x: array([[ 1.000000e+00,  9.313377e-13,  3.760174e-13, ..., -3.391993e-13,
         4.335626e-13,  2.473715e-13],
       [ 9.313388e-13,  1.000000e+00,  3.521327e-14, ...,  4.320984e-13,...
 y: array([[ 1.000000e+00,  9.313270e-13,  3.760022e-13, ..., -3.393037e-13,
         4.334932e-13,  2.472232e-13],