Closed casperdcl closed 11 months ago
@paskino I'm not sure this is actually required, since we now have RUN_CTEST
https://github.com/SyneRBI/SIRF-SuperBuild/blob/c7b69730486e2c9e26d3654a5964631df3c5fb3d/docker/user_sirf-ubuntu.sh#L37
I think it's very good. It picks the version from
version_config.cmake
and uses that.
but we're getting CIL via conda...
Error is
++ docker run --rm synerbi/sirf:service python -c 'import cil; print(cil.version.commit_hash[1:])'
error: exec: "python": executable file not found in $PATH
+ DEFAULT_CIL_TAG='Creating sirfuser:1000:1000
Adding user `sirfuser'\'' to group `users'\'' ...
Adding user sirfuser to group users
Done.
Updating file ownership for /home/sirfuser
Switching to sirfuser and executing python -c import cil; print(cil.version.commit_hash[1:])'
So, aside from not finding python
, the docker run
adds lots of other things to the output. I guess it could be avoided by doing a dummy docker run
first, such that sirfuser
is already created. Sigh
btw how urgent is fixing this?
I was thinking of reworking/simplifing the docker stuff... but I expect it would take some time to get a more robust solution working.
We would like to merge this in and then shortly release SIRF 3.5.0 with relevant tested docker images.
Or we don't merge it in, and test the docker images manually.
first problem: cil.version
is borked. Hack for now:
docker run --rm -it synerbi/sirf:service /opt/conda/bin/python
>>> import cil
>>> cil.version.version
'23.0.1'
>>> cil.version.commit_hash # borked
'-1'
>>> f'v{cil.version.version}' if cil.version.commit_hash == '-1' else cil.version.commit_hash
'v23.0.1'
second problem: sirf
isn't found
docker run --rm -v ./cil_sirf_test:/cil_sirf_test synerbi/sirf:service /opt/conda/bin/python -m unittest discover -v /cil_sirf_test
...
----------------------------------------------------------------------
TEST SYSTEM CONFIGURATION
CIL version: 23.0.1
{'has_astra': True,
'has_ccpi_regularisation': False,
'has_cvxpy': False,
'has_ipp': True,
'has_numba': True,
'has_nvidia': False,
'has_tigre': True,
'has_tomophantom': False}
----------------------------------------------------------------------
test_Gradient (test_SIRF.TestGradientMR_2D) ... skipped 'Skipping as SIRF is not available'
test_TVdenoisingMR (test_SIRF.TestGradientMR_2D) ... skipped 'Has SIRF'
test_Gradient (test_SIRF.TestGradientPET_2D) ... skipped 'Skipping as SIRF is not available'
test_Gradient (test_SIRF.TestGradientPET_3D) ... skipped 'Skipping as SIRF is not available'
test_BlockDataContainer_with_SIRF_DataContainer_add (test_SIRF.TestSIRFCILIntegration) ... skipped 'Has SIRF'
test_BlockDataContainer_with_SIRF_DataContainer_divide (test_SIRF.TestSIRFCILIntegration) ... skipped 'Has SIRF'
test_BlockDataContainer_with_SIRF_DataContainer_multiply (test_SIRF.TestSIRFCILIntegration) ... skipped 'Has SIRF'
test_BlockDataContainer_with_SIRF_DataContainer_subtract (test_SIRF.TestSIRFCILIntegration) ... skipped 'Has SIRF'
----------------------------------------------------------------------
Ran 8 tests in 0.000s
OK (skipped=8)
third problem: PYTHON_INSTALL_DIR
should be replaced with /opt/conda
but for some reason isn't
docker run --rm synerbi/sirf:service cat .bashrc | grep activate
[ -f PYTHON_INSTALL_DIR/bin/activate ] && . PYTHON_INSTALL_DIR/bin/activate
hack for now:
docker run --rm -v ./cil_sirf_test:/cil_sirf_test synerbi/sirf:service bash --login -c '/opt/conda/bin/python -m unittest discover -v /cil_sirf_test'
fourth problem: gadgetron isn't running
Started reading acquisitions from /opt/SIRF-SuperBuild/INSTALL/share/SIRF-3.5/data/examples/MR/simulated_MR_2D_cartesian.h5
0%..10%..20%..30%..40%..50%..60%..70%..80%..90%..100%..
Finished reading acquisitions from /opt/SIRF-SuperBuild/INSTALL/share/SIRF-3.5/data/examples/MR/simulated_MR_2D_cartesian.h5
connection to gadgetron server failed, trying again...
...
connection to gadgetron server failed
ERROR
...
----------------------------------------------------------------------
Ran 8 tests in 7.271s
FAILED (errors=6)
hack for now:
#file: ./cil_sirf_test/run.sh
this=$(dirname "${BASH_SOURCE[0]}")
$SIRF_PATH/../../INSTALL/bin/gadgetron >& ~/gadgetron.log&
/opt/conda/bin/python -m unittest discover -v $this
for i in $(jobs -p); do kill -n 15 $i; done 2>/dev/null
docker run --rm -v ./cil_sirf_test:/cil_sirf_test synerbi/sirf:service bash --login /cil_sirf_test/run.sh
fifth problem: AttributeError: 'ImageData' object has no attribute 'get_item'
Started reading acquisitions from /opt/SIRF-SuperBuild/INSTALL/share/SIRF-3.5/data/examples/MR/simulated_MR_2D_cartesian.h5
0%..10%..20%..30%..40%..50%..60%..70%..80%..90%..100%..
Finished reading acquisitions from /opt/SIRF-SuperBuild/INSTALL/share/SIRF-3.5/data/examples/MR/simulated_MR_2D_cartesian.h5
Message received with ID: 5
Input stream has terminated
Message received with ID: 5
Input stream has terminated
WARNING:root:C backend is only for arrays of datatype float32 - defaulting to `numpy` backend
ok
test_TVdenoisingMR (test_SIRF.TestGradientMR_2D) ... Started reading acquisitions from /opt/SIRF-SuperBuild/INSTALL/share/SIRF-3.5/data/examples/MR/simulated_MR_2D_cartesian.h5
0%..10%..20%..30%..40%..50%..60%..70%..80%..90%..100%..
Finished reading acquisitions from /opt/SIRF-SuperBuild/INSTALL/share/SIRF-3.5/data/examples/MR/simulated_MR_2D_cartesian.h5
Message received with ID: 5
Input stream has terminated
Message received with ID: 5
Input stream has terminated
WARNING:root:C backend is only for arrays of datatype float32 - defaulting to `numpy` backend
...
WARNING:root:C backend is only for arrays of datatype float32 - defaulting to `numpy` backend
ok
test_Gradient (test_SIRF.TestGradientPET_2D) ... ok
test_Gradient (test_SIRF.TestGradientPET_3D) ... ok
test_BlockDataContainer_with_SIRF_DataContainer_add (test_SIRF.TestSIRFCILIntegration) ... ERROR
test_BlockDataContainer_with_SIRF_DataContainer_divide (test_SIRF.TestSIRFCILIntegration) ... ERROR
test_BlockDataContainer_with_SIRF_DataContainer_multiply (test_SIRF.TestSIRFCILIntegration) ... ERROR
test_BlockDataContainer_with_SIRF_DataContainer_subtract (test_SIRF.TestSIRFCILIntegration) ... ERROR
======================================================================
ERROR: test_BlockDataContainer_with_SIRF_DataContainer_add (test_SIRF.TestSIRFCILIntegration)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/cil_sirf_test/test_SIRF.py", line 343, in test_BlockDataContainer_with_SIRF_DataContainer_add
self.assertBlockDataContainerEqual(bdc , bdc1)
File "/cil_sirf_test/testclass.py", line 40, in assertBlockDataContainerEqual
self.assertBlockDataContainerEqual(container1.get_item(col),container2.get_item(col))
File "/cil_sirf_test/testclass.py", line 34, in assertBlockDataContainerEqual
if issubclass(container1.get_item(col).__class__, DataContainer):
AttributeError: 'ImageData' object has no attribute 'get_item'
======================================================================
ERROR: test_BlockDataContainer_with_SIRF_DataContainer_divide (test_SIRF.TestSIRFCILIntegration)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/cil_sirf_test/test_SIRF.py", line 298, in test_BlockDataContainer_with_SIRF_DataContainer_divide
self.assertBlockDataContainerEqual(bdc , bdc1)
File "/cil_sirf_test/testclass.py", line 40, in assertBlockDataContainerEqual
self.assertBlockDataContainerEqual(container1.get_item(col),container2.get_item(col))
File "/cil_sirf_test/testclass.py", line 34, in assertBlockDataContainerEqual
if issubclass(container1.get_item(col).__class__, DataContainer):
AttributeError: 'ImageData' object has no attribute 'get_item'
======================================================================
ERROR: test_BlockDataContainer_with_SIRF_DataContainer_multiply (test_SIRF.TestSIRFCILIntegration)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/cil_sirf_test/test_SIRF.py", line 319, in test_BlockDataContainer_with_SIRF_DataContainer_multiply
self.assertBlockDataContainerEqual(bdc , bdc1)
File "/cil_sirf_test/testclass.py", line 40, in assertBlockDataContainerEqual
self.assertBlockDataContainerEqual(container1.get_item(col),container2.get_item(col))
File "/cil_sirf_test/testclass.py", line 34, in assertBlockDataContainerEqual
if issubclass(container1.get_item(col).__class__, DataContainer):
AttributeError: 'ImageData' object has no attribute 'get_item'
======================================================================
ERROR: test_BlockDataContainer_with_SIRF_DataContainer_subtract (test_SIRF.TestSIRFCILIntegration)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/cil_sirf_test/test_SIRF.py", line 363, in test_BlockDataContainer_with_SIRF_DataContainer_subtract
self.assertBlockDataContainerEqual(bdc , bdc1)
File "/cil_sirf_test/testclass.py", line 40, in assertBlockDataContainerEqual
self.assertBlockDataContainerEqual(container1.get_item(col),container2.get_item(col))
File "/cil_sirf_test/testclass.py", line 34, in assertBlockDataContainerEqual
if issubclass(container1.get_item(col).__class__, DataContainer):
AttributeError: 'ImageData' object has no attribute 'get_item'
----------------------------------------------------------------------
Ran 8 tests in 35.843s
FAILED (errors=4)
KMN
I had to look that abbrev up (although I guessed correctly)!
some of these seem to tie up with the experience of https://github.com/SyneRBI/SIRF-SuperBuild/issues/827#issuecomment-1636794782 and other difficulties experienced there.
No idea about the get_item
problem
The get_item
problem has been fixed as of https://github.com/TomographicImaging/CIL/pull/1481 or better https://github.com/TomographicImaging/CIL/pull/1502 which means that the minimum commit of CIL that you can use is 0ba2f8e2935db66ec3dfe8c88e407e8d0eb5609f
which is exactly the hash that is in
https://github.com/SyneRBI/SIRF-SuperBuild/blob/092f912fd76d36cd3995671809528db4c7cc1f33/version_config.cmake#L151
Notice that the bug is in test_SIRF.py
rather than in CIL itself. This mess will disappear once we tag CIL.
You mean the CIL version installed (v23.0.1
) has nothing to do with the version specified in version_config.cmake
(0ba2f8e2
)? Then what's the point of the latter?
Right, that'll be then because we get the conda
version of CIL on docker
. So, a few choices:
BUILD_CIL=ON
and somehow prevent to install CIL via conda on top (i.e. don't use SIRF-Exercises/environment.yml
at all, or edit it first.). This is an interesting problem in itself: if you build a package yourself, how do you prevent conda
messing it up.which one do you prefer?
You mean the CIL version installed (
v23.0.1
) has nothing to do with the version specified inversion_config.cmake
(0ba2f8e2
)? Then what's the point of the latter?
see 2nd bullet above. version_config.cmake
is used when using the SB with BUILD_CIL=ON
. Not when doing conda install cil
- give up on this test for 3.5
- set
BUILD_CIL=ON
and somehow prevent to install CIL via conda on top (i.e. don't useSIRF-Exercises/environment.yml
at all, or edit it first.). This is an interesting problem in itself: if you build a package yourself, how do you preventconda
messing it up.- make a CIL-dev conda version
This exists, the install command is about conda install cil -c ccpi/label/dev
- release a new version of CIL
A new version of CIL should be out by the 27/9/23.
IMHO all this mess will go away once we tag and release CIL, so I suggest to wait until then.
ok. Of course, we will have the same problem later. I think what this is saying (no surprise): either use conda
or use the SuperBuild
, but don't try to mix.
long (cough, mid)-term solution would be pyproject.toml
+ https://github.com/scikit-build/scikit-build... no need for users to know about SuperBuild, CMake, or conda at all^1^3 😈 and no rebuilding already-existing packages :)
I have no idea really, but looks interesting.
We started with the SB because lots of things were not available on various systems (Gadgetron, ISMRMD, STIR, ...). That is less and less the case (although most of these are on conda only). Another reason was to have tight version control. I'm not sure how scikit-build would resolve external dependencies.
I'm happy to go with whatever makes our life easier (but serves the needs).
This needs a serious discussion of course.
fyi this PR is blocked on https://github.com/TomographicImaging/CIL/issues/1509 :)
that' s a long checklist!
for easy reference
ERROR: test_TVdenoisingMR (test_SIRF.TestGradientMR_2D)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/sirfuser/cil_sirf_test/test_SIRF.py", line 255, in test_TVdenoisingMR
res1 = TV.proximal(self.image1, tau=1.0)
File "/opt/conda/lib/python3.10/site-packages/cil/optimisation/functions/TotalVariation.py", line 271, in proximal
solution = self._fista_on_dual_rof(x, tau)
File "/opt/conda/lib/python3.10/site-packages/cil/optimisation/functions/TotalVariation.py", line 303, in _fista_on_dual_rof
p1 = self.gradient.range_geometry().allocate(None)
File "/opt/conda/lib/python3.10/site-packages/cil/framework/BlockGeometry.py", line 56, in allocate
containers = [geom.allocate(value, **kwargs) for geom in self.geometries]
File "/opt/conda/lib/python3.10/site-packages/cil/framework/BlockGeometry.py", line 56, in <listcomp>
containers = [geom.allocate(value, **kwargs) for geom in self.geometries]
File "/opt/SIRF-SuperBuild/INSTALL/python/sirf/Gadgetron.py", line 526, in allocate
tmp = value * numpy.ones(out.as_array().shape)
TypeError: unsupported operand type(s) for *: 'NoneType' and 'float'
Relevant lines are https://github.com/SyneRBI/SIRF/blob/d3956345ec6e56421017a6aca34c73747dbae664/src/xGadgetron/pGadgetron/Gadgetron.py#L508-L527
Indeed, the argument None
is not supported by Gadgetron.ImageData.allocate
. I've created https://github.com/SyneRBI/SIRF/issues/1212.
Do we need this for AcquisitionData
as well?
This seems to need a SIRF update then... (which means branching off from 3.5 to release 3.5.1).
This seems to need a SIRF update then... (which means branching off from 3.5 to release 3.5.1).
Let's then proceed with tagging the SB with 3.5 without this PR and tackle this later.
BTW, we are writing a list of residual incompatibilities between SIRF and CIL. We added to CIL a number of hacks, which we would like to remove, to overcome these incompatibilities, https://github.com/TomographicImaging/CIL/discussions/1516
Let's then proceed with tagging the SB with 3.5 without this PR and tackle this later.
I don't know... I guess you'd want the freshly released CIL in our version_config.cmake
. That'd presumably make GHA fail (or at least, I'd hope it would. TBH, I don't understand why it didn't, unless the current CIL tag didn't include that test).
We could leave the CIL version as-is and then release 3.5.1 soon after which would presumably
BUILD_CIL=ON
in docker, and remove using the SIRF- exercises environment.yml
.In any case, these steps seems minor (aside from probably the last one...) and need to be completed ASAP. If they are indeed minor, then releasing 3.5.0 as-is might be double work. Up to you.
Of course, there are 2 issues on our milestone list that do need to be resolved. https://github.com/SyneRBI/SIRF-SuperBuild/milestone/14
ok so
should be gewd for SIRF 3.5.0?
diff: https://github.com/SyneRBI/SIRF-SuperBuild/pull/838/files?w=1