SyneRBI / SIRF-SuperBuild

SIRF CMake SuperBuild
http://www.ccpsynerbi.ac.uk
Apache License 2.0

re-enable some CIL tests depending on SIRF allocate() #846

Closed KrisThielemans closed 7 months ago

KrisThielemans commented 10 months ago

Now that https://github.com/SyneRBI/SIRF/issues/1212 is fixed, we can re-enable the relevant tests.

Reverts https://github.com/SyneRBI/SIRF-SuperBuild/pull/838/commits/160d5e5ac8396efb29322fe354083a7e5b5d8ad8

casperdcl commented 10 months ago

Based on the failing CI, it doesn't look like it's fixed yet /CC @evgueni-ovtchinnikov :sweat_smile:

KrisThielemans commented 10 months ago

Not so clear if there's still a SIRF problem. Only the DEVEL_BUILD=ON jobs would work at the moment, which creates complications of its own...

Looks like only one of your CI jobs tests with DEVEL_BUILD=ON: https://github.com/SyneRBI/SIRF-SuperBuild/blob/42ce8cbe1e5c18a96e4436fc2d79e87e7accdc73/.github/workflows/c-cpp.yml#L58-L63

Let's hope, therefore, that https://github.com/SyneRBI/SIRF-SuperBuild/actions/runs/6956888225/job/18928662469?pr=846 works, and the docker devel build too (though that will likely get aborted first if one of the others fails).

In fact, we really should run the SIRF_CIL tests as part of the SIRF CI, but they aren't run there. I suppose we're not building CIL there. Oh well.

PS: it's a pain to figure out which job is which. We should do something smarter with naming.

evgueni-ovtchinnikov commented 10 months ago

I did not expect the use of allocate(None) (it looks bizarre to me; see the sketch below); a quick fix for backward compatibility has just been pushed.

...to the wrong branch

pushed to master now
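
For reference, a minimal sketch of the allocate(None) semantics the backward-compatibility fix needs to support, assuming the CIL convention that allocate(None) allocates storage without initialising it. This is not the actual SIRF code; ToyImageData is a hypothetical stand-in for a SIRF ImageData-like container.

```python
# Hypothetical stand-in for a SIRF ImageData-like container; illustrates the
# CIL-style allocate() semantics only, not the real SIRF implementation.
import numpy as np

class ToyImageData:
    def __init__(self, shape=(2, 3)):
        self.shape = shape
        self._array = None

    def allocate(self, value=0):
        out = ToyImageData(self.shape)
        if value is None:
            # CIL convention: allocate storage but leave it uninitialised
            out._array = np.empty(self.shape, dtype=np.float32)
        else:
            # fill with the requested value (0 by default)
            out._array = np.full(self.shape, value, dtype=np.float32)
        return out

template = ToyImageData()
zeros = template.allocate(0)      # zero-filled container
blank = template.allocate(None)   # allocated but not initialised
```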

KrisThielemans commented 10 months ago

Unrelated failure in [DEVEL_BUILD=ON](https://github.com/SyneRBI/SIRF-SuperBuild/actions/runs/6956888225/job/18928662469?pr=846#step:11:1095) and the docker devel-service build: https://github.com/TomographicImaging/CIL/issues/1587

KrisThielemans commented 10 months ago

By the way, I guess there's no reason we run the docker CIL tests only in the service images.

KrisThielemans commented 8 months ago

This possibly needs a version_config.cmake update to set SIRF_TAG to the current version, but I'm not sure whether that's still the problem.

KrisThielemans commented 7 months ago

Just rebased this. After all the recent changes, maybe it'll work...

KrisThielemans commented 7 months ago

Fails with

connection to gadgetron server failed, trying again

We do try to run it (https://github.com/SyneRBI/SIRF-SuperBuild/blob/8ca50676b4b3140beb960dc76f2f6ef5d2468ba4/.github/workflows/test_cil.sh#L12), and it is executed (https://github.com/SyneRBI/SIRF-SuperBuild/actions/runs/8092243834/job/22112637500#step:10:41), but we don't see the log.

Possibly due to #722
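
The "trying again" message suggests a connect-and-retry loop waiting for the gadgetron server to come up. A hedged Python sketch of that kind of loop follows; it is not the actual SIRF or CI code, and the host, port (9002) and timeouts are assumptions.

```python
# Rough illustration of waiting for a gadgetron server to accept connections.
# Not the actual CI code; the port (9002) and timeouts are assumptions.
import socket
import time

def wait_for_gadgetron(host="localhost", port=9002, timeout=60, interval=2):
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True  # server is up
        except OSError:
            print("connection to gadgetron server failed, trying again")
            time.sleep(interval)
    return False  # gave up after `timeout` seconds
```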

KrisThielemans commented 7 months ago

@casperdcl to edit .github/workflows/test_cil.sh: remove explicit paths, and we possibly need to source env_sirf.sh first.

casperdcl commented 7 months ago

One test failing in https://github.com/TomographicImaging/CIL/blob/v23.1.0/Wrappers/Python/test/test_SIRF.py:

test_TVdenoisingMR (test_SIRF.TestGradientMR_2D) ...
WARNING:root:C backend is only for arrays of datatype float32 - defaulting to `numpy` backend
...
WARNING:root:C backend is only for arrays of datatype float32 - defaulting to `numpy` backend
FAIL
...
======================================================================
FAIL: test_TVdenoisingMR (test_SIRF.TestGradientMR_2D)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jovyan/cil_sirf_test/test_SIRF.py", line 259, in test_TVdenoisingMR
    np.testing.assert_array_almost_equal(res1.as_array(), res2.as_array(), decimal=3)
Iterations stopped at 100 with the residual 0.005245 
Iterations stopped at 100 with the residual 0.000000 
Iterations stopped at 100 with the residual 0.000500 
  File "/opt/conda/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/opt/conda/lib/python3.10/site-packages/numpy/testing/_private/utils.py", line 1099, in assert_array_almost_equal
    assert_array_compare(compare, x, y, err_msg=err_msg, verbose=verbose,
  File "/opt/conda/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/opt/conda/lib/python3.10/site-packages/numpy/testing/_private/utils.py", line 862, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 3 decimals

Mismatched elements: 74722 / 131072 (57%)
Max absolute difference: 0.14627695
Max relative difference: 0.3511309
 x: array([[[0.198+0.j, 0.195+0.j, 0.191+0.j, ..., 0.203+0.j, 0.202+0.j,
         0.199+0.j],
        [0.198+0.j, 0.196+0.j, 0.193+0.j, ..., 0.204+0.j, 0.202+0.j,...
 y: array([[[0.196+0.j, 0.195+0.j, 0.193+0.j, ..., 0.204+0.j, 0.202+0.j,
         0.201+0.j],
        [0.196+0.j, 0.195+0.j, 0.194+0.j, ..., 0.203+0.j, 0.202+0.j,...

KrisThielemans commented 7 months ago

Well, the good thing is that the GradientPET and BlockDataContainer tests work!

The failing test is https://github.com/TomographicImaging/CIL/blob/7e0291ea455a4aa037900bf248817a539a7b9d67/Wrappers/Python/test/test_SIRF.py#L249-L258

    def test_TVdenoisingMR(self):

        # compare inplace proximal method of TV
        alpha = 0.5
        TV = alpha * TotalVariation(max_iteration=10, warm_start=False)
        res1 = TV.proximal(self.image1, tau=1.0)

        res2 = self.image1*0.
        TV.proximal(self.image1, tau=1.0, out=res2)   
        np.testing.assert_array_almost_equal(res1.as_array(), res2.as_array(), decimal=3)

Checking the error message, it appears that res1 and res2 are almost equal, just not to the required precision. However, that's pretty weird of course. Why would assigning the result be different from storing it in out, unless the iterative process didn't converge yet?

@MargaretDuff @paskino @epapoutsellis is this different due to the warm_start=False in the first call, but not in the second call? If so, it seems a bug in the test.

MargaretDuff commented 7 months ago

Why would assigning the result be different from storing it in out, unless the iterative process didn't converge yet?

With 10 inner iterations, it may not have converged... But then I don't know where the randomness would come from, as the dual variable in the proximal calculation is allocated with zeros.

@MargaretDuff @paskino @epapoutsellis is this different due to the warm_start=False in the first call, but not in the second call? If so, it seems a bug in the test.

The warm_start=False argument is used when TV is defined, not when prox is called, so this should be fine.

TV.proximal is not safe to use in-place at the moment, so TV.proximal(x, tau, out=x) is incorrect, but that should be OK here because the test uses TV.proximal(x, tau, out=x*0)?
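
To illustrate the in-place concern with a toy example (plain numpy, not CIL code): an operator that reads its input while writing to out gives wrong results when out aliases the input, but is fine when out is a separate buffer such as x*0.

```python
import numpy as np

def toy_op(x, out=None):
    """Toy operator that reads x[i-1] while writing out[i]; unsafe if out is x."""
    if out is None:
        out = np.empty_like(x)
    out[0] = x[0]
    for i in range(1, len(x)):
        # reads x[i-1], which has already been overwritten when out aliases x
        out[i] = 0.5 * (x[i] + x[i - 1])
    return out

x = np.arange(5, dtype=float)
res1 = toy_op(x)                     # fresh output buffer
res2 = np.zeros_like(x)
toy_op(x, out=res2)                  # separate buffer: matches res1
np.testing.assert_array_almost_equal(res1, res2)

x_alias = np.arange(5, dtype=float)
toy_op(x_alias, out=x_alias)         # aliased output: diverges from res1
print(np.allclose(res1, x_alias))    # False
```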

casperdcl commented 7 months ago

The tests were fixed in CIL but not yet released. I've pushed a temporary fix for now, to be reverted after https://github.com/TomographicImaging/CIL/issues/1732.

I suggest we merge this & open a follow-up issue to track https://github.com/TomographicImaging/CIL/issues/1732.