Open NikEfth opened 4 months ago
If this commit breaks many tests, I would consider it expected.
tests fail with
ERROR: Non-TOF data with inconsistent Time-of-Flight bin number - Aborted operation.
We can turn this to a warning, for now. Or, drop the old nonTOF altogether. But it will break people reading old headers.
A few more comments. Please merge
master
as CI is currently not running due to a conflict.
I hadn't noticed that ...
ah well. ctests are still failing. I cannot look at this now though.
ah well. ctests are still failing. I cannot look at this now though.
I bet it has do to with the new TOF behavior.
The new utility still needs about a 2 hour to finish the conversion for the demo scanner. It is an improvement.
I need to get some sleep. Now, we can load the fanProjdata from the disk. I also added some preliminary support for the LM cache loading.
My test_time_of_flights pass successfully. I made some small changes hopefully the CI is happy now.
Sadly ./run_test_listmode_recon.sh
is failing on the TOF data. Not sure yet why.
error is
INFO: Computing sensitivity
INFO: Calculating sensitivity for subset 0
ERROR: TOF code for sensitivity needs work
terminate called after throwing an instance of 'std::runtime_error'
what(): TOF code for sensitivity needs work
Now, I think old nonTOF Scanner templates will not work anymore, as the coincidence window defaults to something non-sensical.
This is a TOF test using the supplied TOF .hroot.
I just ran it and worked fine.
=== Generate attenuation image
WARNING: image duration not set, so time frame definitions will not be initialised
INFO: Processing next shape...
=== Calculate ACFs
=== Creating my_test_lm_frame.fdef (time frame definitions)
=============== Using frame_nonTOF definition =============
=== Unlist listmode data (for comparison)
---- Executable ran ok
=== Create additive sino my_additive_sinogram_frame_nonTOF.hs
---- Executable ran ok
=== Reconstruct listmode data without cache
---- Executable ran ok
=== Reconstruct listmode data with cache and store it on disk
---- Executable ran ok
=== Compare reconstructed images with and without caching LM file
---- This test seems to be ok !
=== Reconstruct listmode data with cache loaded from the disk
---- Executable ran ok
=== Compare reconstructed images without caching LM file and with loading cache from disk
---- This test seems to be ok !
=== Reconstruct projection data for comparison
---- Executable ran ok
=== Compare sensitivity images
---- This test seems to be ok !
=== Compare reconstructed images
---- This test seems to be ok !
=============== Using counts_nonTOF definition =============
=== Unlist listmode data (for comparison)
---- Executable ran ok
=== Create additive sino my_additive_sinogram_counts_nonTOF.hs
---- Executable ran ok
=== Reconstruct listmode data without cache
---- Executable ran ok
=== Reconstruct listmode data with cache and store it on disk
---- Executable ran ok
=== Compare reconstructed images with and without caching LM file
---- This test seems to be ok !
=== Reconstruct listmode data with cache loaded from the disk
---- Executable ran ok
=== Compare reconstructed images without caching LM file and with loading cache from disk
---- This test seems to be ok !
=== Reconstruct projection data for comparison
---- Executable ran ok
=== Compare sensitivity images
---- This test seems to be ok !
=== Compare reconstructed images
---- This test seems to be ok !
============= TOF tests using root_header.hroot ==========
=== Generate attenuation image
WARNING: FactoryRegistry:: overwriting previous value of key in registry.
key: None
WARNING: image duration not set, so time frame definitions will not be initialised
INFO: Processing next shape...
=== Calculate ACFs
=== Creating my_test_lm_frame.fdef (time frame definitions)
=============== Using frame_TOF definition =============
=== Unlist listmode data (for comparison)
---- Executable ran ok
=== Create additive sino my_additive_sinogram_frame_TOF.hs
---- Executable ran ok
=== Reconstruct listmode data without cache
---- Executable ran ok
=== Reconstruct listmode data with cache and store it on disk
---- Executable ran ok
=== Compare reconstructed images with and without caching LM file
---- This test seems to be ok !
=== Reconstruct listmode data with cache loaded from the disk
---- Executable ran ok
=== Compare reconstructed images without caching LM file and with loading cache from disk
---- This test seems to be ok !
=== Reconstruct projection data for comparison
---- Executable ran ok
=== Compare sensitivity images
---- This test seems to be ok !
=== Compare reconstructed images
---- This test seems to be ok !
=============== Using counts_TOF definition =============
=== Unlist listmode data (for comparison)
---- Executable ran ok
=== Create additive sino my_additive_sinogram_counts_TOF.hs
---- Executable ran ok
=== Reconstruct listmode data without cache
---- Executable ran ok
=== Reconstruct listmode data with cache and store it on disk
---- Executable ran ok
=== Compare reconstructed images with and without caching LM file
---- This test seems to be ok !
=== Reconstruct listmode data with cache loaded from the disk
---- Executable ran ok
=== Compare reconstructed images without caching LM file and with loading cache from disk
---- This test seems to be ok !
=== Reconstruct projection data for comparison
---- Executable ran ok
=== Compare sensitivity images
---- This test seems to be ok !
=== Compare reconstructed images
---- This test seems to be ok !
--------------- End of tests -------------
Everything seems to be fine !
You could remove all generated files using "rm -f my_* *.log"
I just ran it and worked fine.
not on GitHub Actions, and not for me. Debugging now.
Can you try adding TOF mashing factor:= 1
in the root_header. Maybe it will help
I was a bit behind. It works now! Great.
GHA now has a segfault in find_ML_normfactors3D
test.
@NikEfth I've pushed 2 parallelisations using reduction, and also some fixes with do_block
. I think for reading with apply, we need to write all 1s for the blocks, so I do that now (but don't calculate anything I hope).
Note that the Array::sum()
parallelisatoins are straightforward, but not necessarily great. It's going to parallelise at every "dimensions" of the array, and therefore the number of threads could quickly explode.
@NikEfth CI seems to be going well. I've pushed a few more parallelisations. Hopefully they help. Somewhat surprisingly, with g++-11 I seem to be able to do far more omp atomic
than before. This could be a lot faster than a critical section (hardware lock normally), and can be used in many more places that are otherwise hard. Question is if you need it of course, so timing will be crucial to check.
Unfortunately, the apply
currently uses make_fan_data
and set_fan_data
. I guess you're saying that these are very slow. (I don't quite understand why though). Possibly here atomic
could help.
Ok, the sum is no longer a problem. One trick fits all .... :D
One of our actions earlier, to make the test pass, broke the new conversion. Now TOF single bin is never activated.
One of our actions earlier, to make the test pass, broke the new conversion. Now TOF single bin is never activated.
This seems to be working fine for me. I've done the following experiment:
I'll have to think very hard if this is what I expected, but it certainly shows that it gets into the relevant code path, so that's great.
good catch. Another case for getting rid of this special handling relying on the generic one...
@markus-jehl you're probably not able to, but in case you can, this PR should speed up the ML_norm code a fair amount, so you could potentially try that. It'd be good to have some reassurance on real data, as opposed to our simple test.
Ubuntu clang job fails with
usr/bin/ld: CMakeFiles/test_BSplinesRegularGrid.dir/test_BSplinesRegularGrid.cxx.o: in function `.omp_outlined.':
test_BSplinesRegularGrid.cxx:(.text+0x39bf): undefined reference to `__atomic_load'
/usr/bin/ld: test_BSplinesRegularGrid.cxx:(.text+0x39f3): undefined reference to `__atomic_compare_exchange'
clang: error: linker command failed with exit code 1 (use -v to see invocation)usr/bin/ld: CMakeFiles/test_BSplinesRegularGrid.dir/test_BSplinesRegularGrid.cxx.o: in function `.omp_outlined.':
test_BSplinesRegularGrid.cxx:(.text+0x39bf): undefined reference to `__atomic_load'
/usr/bin/ld: test_BSplinesRegularGrid.cxx:(.text+0x39f3): undefined reference to `__atomic_compare_exchange'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Potentially related to https://gitlab.kitware.com/cmake/cmake/-/issues/23021 but it's weird tst_BSplines
links ok.
test_ML_norm
segfaults in the MacOS Debug job
Potentially related to https://gitlab.kitware.com/cmake/cmake/-/issues/23021 but it's weird
tst_BSplines
links ok.
Linking problem was fixed in #1449 which has taken the sum/find_min/max parallelisation commits, and is now merged to master, so I've merged that back on here.
I suggest we split this PR in 2: one for the sensitivity calculation, and one for the parallelisation of the norm code. We can do that again by cherry-picking relevant commits. Maybe we can resolve things here first though, as you prefer.
Changes in this pull request
Testing performed
Related issues
Checklist before requesting a review
documentation/release_XXX.md
has been updated with any functionality change (if applicable)