Modifications for psmr2024

NikEfth commented 4 months ago

Changes in this pull request

Testing performed

Related issues

Checklist before requesting a review

[] I have performed a self-review of my code
[] I have added docstrings/doxygen in line with the guidance in the developer guide
[] I have implemented unit tests that cover any new or modified functionality (if applicable)
[] The code builds and runs on my machine
[] documentation/release_XXX.md has been updated with any functionality change (if applicable)

NikEfth commented 4 months ago

If this commit breaks many tests, I would consider it expected.

KrisThielemans commented 4 months ago

tests fail with

ERROR: Non-TOF data with inconsistent Time-of-Flight bin number - Aborted operation.

NikEfth commented 4 months ago

We can turn this to a warning, for now. Or, drop the old nonTOF altogether. But it will break people reading old headers.

NikEfth commented 4 months ago

A few more comments. Please merge master as CI is currently not running due to a conflict.

I hadn't noticed that ...

KrisThielemans commented 4 months ago

ah well. ctests are still failing. I cannot look at this now though.

NikEfth commented 4 months ago

ah well. ctests are still failing. I cannot look at this now though.

I bet it has do to with the new TOF behavior.

NikEfth commented 4 months ago

The new utility still needs about a 2 hour to finish the conversion for the demo scanner. It is an improvement.

NikEfth commented 4 months ago

I need to get some sleep. Now, we can load the fanProjdata from the disk. I also added some preliminary support for the LM cache loading.

NikEfth commented 4 months ago

My test_time_of_flights pass successfully. I made some small changes hopefully the CI is happy now.

KrisThielemans commented 4 months ago

Sadly ./run_test_listmode_recon.sh is failing on the TOF data. Not sure yet why.

KrisThielemans commented 4 months ago

error is


INFO: Computing sensitivity

INFO: Calculating sensitivity for subset 0

ERROR: TOF code for sensitivity needs work
terminate called after throwing an instance of 'std::runtime_error'
  what():  TOF code for sensitivity needs work

NikEfth commented 4 months ago

Now, I think old nonTOF Scanner templates will not work anymore, as the coincidence window defaults to something non-sensical.

KrisThielemans commented 4 months ago

This is a TOF test using the supplied TOF .hroot.

NikEfth commented 4 months ago

I just ran it and worked fine.


=== Generate attenuation image

WARNING: image duration not set, so time frame definitions will not be initialised

INFO: Processing next shape...
=== Calculate ACFs
=== Creating my_test_lm_frame.fdef (time frame definitions)
=============== Using frame_nonTOF definition =============
=== Unlist listmode data (for comparison)
---- Executable ran ok
=== Create additive sino my_additive_sinogram_frame_nonTOF.hs
---- Executable ran ok
=== Reconstruct listmode data without cache
---- Executable ran ok
=== Reconstruct listmode data with cache and store it on disk
---- Executable ran ok
=== Compare reconstructed images with and without caching LM file
---- This test seems to be ok !
=== Reconstruct listmode data with cache loaded from the disk
---- Executable ran ok
=== Compare reconstructed images without caching LM file and with loading cache from disk
---- This test seems to be ok !
=== Reconstruct projection data for comparison
---- Executable ran ok
=== Compare sensitivity images
---- This test seems to be ok !
=== Compare reconstructed images
---- This test seems to be ok !
=============== Using counts_nonTOF definition =============
=== Unlist listmode data (for comparison)
---- Executable ran ok
=== Create additive sino my_additive_sinogram_counts_nonTOF.hs
---- Executable ran ok
=== Reconstruct listmode data without cache
---- Executable ran ok
=== Reconstruct listmode data with cache and store it on disk
---- Executable ran ok
=== Compare reconstructed images with and without caching LM file
---- This test seems to be ok !
=== Reconstruct listmode data with cache loaded from the disk
---- Executable ran ok
=== Compare reconstructed images without caching LM file and with loading cache from disk
---- This test seems to be ok !
=== Reconstruct projection data for comparison
---- Executable ran ok
=== Compare sensitivity images
---- This test seems to be ok !
=== Compare reconstructed images
---- This test seems to be ok !

============= TOF tests using root_header.hroot ==========

=== Generate attenuation image

WARNING: FactoryRegistry:: overwriting previous value of key in registry.
     key: None

WARNING: image duration not set, so time frame definitions will not be initialised

INFO: Processing next shape...
=== Calculate ACFs
=== Creating my_test_lm_frame.fdef (time frame definitions)
=============== Using frame_TOF definition =============
=== Unlist listmode data (for comparison)
---- Executable ran ok
=== Create additive sino my_additive_sinogram_frame_TOF.hs
---- Executable ran ok
=== Reconstruct listmode data without cache
---- Executable ran ok
=== Reconstruct listmode data with cache and store it on disk
---- Executable ran ok
=== Compare reconstructed images with and without caching LM file
---- This test seems to be ok !
=== Reconstruct listmode data with cache loaded from the disk
---- Executable ran ok
=== Compare reconstructed images without caching LM file and with loading cache from disk
---- This test seems to be ok !
=== Reconstruct projection data for comparison
---- Executable ran ok
=== Compare sensitivity images
---- This test seems to be ok !
=== Compare reconstructed images
---- This test seems to be ok !
=============== Using counts_TOF definition =============
=== Unlist listmode data (for comparison)
---- Executable ran ok
=== Create additive sino my_additive_sinogram_counts_TOF.hs
---- Executable ran ok
=== Reconstruct listmode data without cache
---- Executable ran ok
=== Reconstruct listmode data with cache and store it on disk
---- Executable ran ok
=== Compare reconstructed images with and without caching LM file
---- This test seems to be ok !
=== Reconstruct listmode data with cache loaded from the disk
---- Executable ran ok
=== Compare reconstructed images without caching LM file and with loading cache from disk
---- This test seems to be ok !
=== Reconstruct projection data for comparison
---- Executable ran ok
=== Compare sensitivity images
---- This test seems to be ok !
=== Compare reconstructed images
---- This test seems to be ok !

--------------- End of tests -------------

Everything seems to be fine !
You could remove all generated files using "rm -f my_* *.log"

KrisThielemans commented 4 months ago

I just ran it and worked fine.

not on GitHub Actions, and not for me. Debugging now.

NikEfth commented 4 months ago

Can you try adding TOF mashing factor:= 1 in the root_header. Maybe it will help

KrisThielemans commented 4 months ago

I was a bit behind. It works now! Great.

GHA now has a segfault in find_ML_normfactors3D test.

KrisThielemans commented 4 months ago

@NikEfth I've pushed 2 parallelisations using reduction, and also some fixes with do_block. I think for reading with apply, we need to write all 1s for the blocks, so I do that now (but don't calculate anything I hope).

Note that the Array::sum() parallelisatoins are straightforward, but not necessarily great. It's going to parallelise at every "dimensions" of the array, and therefore the number of threads could quickly explode.

KrisThielemans commented 4 months ago

@NikEfth CI seems to be going well. I've pushed a few more parallelisations. Hopefully they help. Somewhat surprisingly, with g++-11 I seem to be able to do far more omp atomic than before. This could be a lot faster than a critical section (hardware lock normally), and can be used in many more places that are otherwise hard. Question is if you need it of course, so timing will be crucial to check.

Unfortunately, the apply currently uses make_fan_data and set_fan_data. I guess you're saying that these are very slow. (I don't quite understand why though). Possibly here atomic could help.

NikEfth commented 4 months ago

Ok, the sum is no longer a problem. One trick fits all .... :D

NikEfth commented 4 months ago

One of our actions earlier, to make the test pass, broke the new conversion. Now TOF single bin is never activated.

KrisThielemans commented 4 months ago

One of our actions earlier, to make the test pass, broke the new conversion. Now TOF single bin is never activated.

This seems to be working fine for me. I've done the following experiment:

edit https://github.com/UCL/STIR/blob/6b882344b4d5ebf5da8c78dfc44f47244cc02b2c/recon_test_pack/root_header.hroot#L15-L16 with 3 settings for size of TOF bin/timing res
- 820/400 (original)
- 120/400
- 120/40
run OSMAPOSL with LM objfun to output sensitivity files

I'll have to think very hard if this is what I expected, but it certainly shows that it gets into the relevant code path, so that's great.

KrisThielemans commented 4 months ago

good catch. Another case for getting rid of this special handling relying on the generic one...

KrisThielemans commented 4 months ago

@markus-jehl you're probably not able to, but in case you can, this PR should speed up the ML_norm code a fair amount, so you could potentially try that. It'd be good to have some reassurance on real data, as opposed to our simple test.

KrisThielemans commented 4 months ago

Ubuntu clang job fails with

usr/bin/ld: CMakeFiles/test_BSplinesRegularGrid.dir/test_BSplinesRegularGrid.cxx.o: in function `.omp_outlined.':
test_BSplinesRegularGrid.cxx:(.text+0x39bf): undefined reference to `__atomic_load'
/usr/bin/ld: test_BSplinesRegularGrid.cxx:(.text+0x39f3): undefined reference to `__atomic_compare_exchange'
clang: error: linker command failed with exit code 1 (use -v to see invocation)usr/bin/ld: CMakeFiles/test_BSplinesRegularGrid.dir/test_BSplinesRegularGrid.cxx.o: in function `.omp_outlined.':
test_BSplinesRegularGrid.cxx:(.text+0x39bf): undefined reference to `__atomic_load'
/usr/bin/ld: test_BSplinesRegularGrid.cxx:(.text+0x39f3): undefined reference to `__atomic_compare_exchange'
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Potentially related to https://gitlab.kitware.com/cmake/cmake/-/issues/23021 but it's weird tst_BSplines links ok.

KrisThielemans commented 4 months ago

test_ML_norm segfaults in the MacOS Debug job

KrisThielemans commented 3 months ago

Potentially related to https://gitlab.kitware.com/cmake/cmake/-/issues/23021 but it's weird tst_BSplines links ok.

Now https://gitlab.kitware.com/cmake/cmake/-/issues/26037

KrisThielemans commented 3 months ago

Linking problem was fixed in #1449 which has taken the sum/find_min/max parallelisation commits, and is now merged to master, so I've merged that back on here.

I suggest we split this PR in 2: one for the sensitivity calculation, and one for the parallelisation of the norm code. We can do that again by cherry-picking relevant commits. Maybe we can resolve things here first though, as you prefer.

UCL / STIR