UCL / STIR

Software for Tomographic Image Reconstruction
http://stir.sourceforge.net/

Scatter estimation depends on scale of normalisation #1112

Closed markus-jehl closed 1 year ago

markus-jehl commented 1 year ago

When scatter estimation is done with normalisation projection data with values between 0 and 100, it converges smoothly, as shown in this plot (left: axial sum of a line profile for the first 7 iterations of the scatter estimation; right: resulting scatter estimate for a cylindrical phantom with air insert):

[image: converging scatter estimates]

When the same data is processed again, but this time with the normalisation sinogram scaled by a fixed factor of 1e4, the scatter estimation oscillates between two very different states:

[image: oscillating scatter estimates]

robbietuk commented 1 year ago

This seems to be a "step size" issue. Can you share the script/parameter file used to perform the estimation and the log file?

markus-jehl commented 1 year ago

I agree, it looks a lot like a step size problem. I didn't find where the scatter estimation actually computes the step size - do you know where this happens?

This is the minimal Python code that performs the scatter estimation for this example (the data can be downloaded here: https://positrigo-my.sharepoint.com/:u:/g/personal/markus_jehl_positrigo_onmicrosoft_com/EXC1Ir0jcg9JppP2Qjye45MBfIFu3giuMTY15eD72CO0KA?e=O4j3RR):

import stir

# define scanner geometry
detectors_per_ring = 256
number_of_rings = 48
arccorrected_bins = 180
inner_ring_radius_mm = 133
avg_interaction_depth_mm = 4.0
crystal_spacing_mm = 3.27
ring_spacing_mm = crystal_spacing_mm
bin_size_mm = crystal_spacing_mm / 2
intrinsic_tilt_deg = -0.38956
num_axial_blocks_per_bucket = 6
num_transaxial_blocks_per_bucket = 4
num_axial_crystals_per_block = 8
num_transaxial_crystals_per_block = 8
num_crystals_per_singles_unit = 1  # passed for both axial and transaxial crystals per singles unit
num_detector_layers = 1
energy_resolution = 0.14
reference_energy = 511
scanner_orientation = "X"
axial_block_spacing_mm = 26.68
transaxial_block_spacing_mm = 26.68
scanner_blk = stir.Scanner(stir.Scanner.User_defined_scanner, "NeuroLF_Prototype", detectors_per_ring, number_of_rings,
                           arccorrected_bins, arccorrected_bins, inner_ring_radius_mm, avg_interaction_depth_mm, ring_spacing_mm, bin_size_mm,
                           intrinsic_tilt_deg, num_axial_blocks_per_bucket, num_transaxial_blocks_per_bucket,
                           num_axial_crystals_per_block, num_transaxial_crystals_per_block, num_crystals_per_singles_unit, num_crystals_per_singles_unit,
                           num_detector_layers, energy_resolution, reference_energy, "BlocksOnCylindrical",
                           crystal_spacing_mm, crystal_spacing_mm, axial_block_spacing_mm, transaxial_block_spacing_mm)

# define projection settings
span = 1
max_ring_diff = 47
num_views = 128
num_tangential_pos = arccorrected_bins
do_arc_correction = False
tmpl_proj_data_info = stir.ProjDataInfo.construct_proj_data_info(scanner_blk, span, max_ring_diff, num_views, num_tangential_pos, do_arc_correction)

scatter_estimation = stir.ScatterEstimation()

# go through the humble list of settings to be defined
scatter_estimation.set_run_debug_mode(True)
scatter_estimation.set_input_proj_data_sptr(stir.ProjDataInMemory.read_from_file("minimal_working_example/input_proj_data.hs"))
scatter_estimation.set_attenuation_image_sptr(stir.FloatVoxelsOnCartesianGrid.read_from_file("minimal_working_example/attenuation_map.hv"))
scatter_estimation.set_normalisation_sptr(stir.BinNormalisationFromProjData(stir.ProjDataInMemory.read_from_file("minimal_working_example/large_normalisation.hs")))
scatter_estimation.set_attenuation_correction_proj_data_sptr(stir.ProjDataInMemory.read_from_file("minimal_working_example/attenuation_correction_factors.hs"))
scatter_estimation.set_background_proj_data_sptr(stir.ProjDataInMemory.read_from_file("minimal_working_example/randoms.hs"))

scatter_estimation.set_mask_projdata_filename("bla.hs")
scatter_estimation.set_mask_image_filename("bla.hv")
scatter_estimation.set_max_scale_value(1)
scatter_estimation.set_min_scale_value(1)
scatter_estimation.set_num_iterations(7)

scatter_estimation.set_output_scatter_estimate_prefix("intermediate_scatter_result")
scatter_estimation.set_export_scatter_estimates_of_each_iteration(True)

recon_type = stir.OSMAPOSLReconstruction3DFloat()
recon_type.set_num_subiterations(4)
recon_type.set_num_subsets(4)
recon_type.set_enforce_initial_positivity(True)
obj_func = stir.PoissonLogLikelihoodWithLinearModelForMeanAndProjData3DFloat()
recon_type.set_objective_function(obj_func)
recon_type.set_disable_output(True)
gaussian_filter = stir.SeparableGaussianImageFilter3DFloat()
gaussian_filter.set_fwhms(stir.Float3BasicCoordinate((15,15,15)))
recon_type.set_post_processor_sptr(gaussian_filter)
scatter_estimation.set_reconstruction_method_sptr(recon_type)

target_image = stir.FloatVoxelsOnCartesianGrid(tmpl_proj_data_info, 1)
zoom = stir.FloatCartesianCoordinate3D(1, 0.2, 0.2)
target_origin = stir.FloatCartesianCoordinate3D(0, 0, 0)
target_dimensions = stir.Int3BasicCoordinate((target_image.get_max_indices()[1] + 1,
                                            2 * int(target_image.get_max_indices()[2] * zoom.y()) + 1,
                                            2 * int(target_image.get_max_indices()[3] * zoom.x()) + 1))
stir.zoom_image_in_place(target_image, zoom, target_origin, target_dimensions)
target_image.fill(1)
scatter_estimation.set_initial_activity_image_sptr(target_image)

simulator = stir.SingleScatterSimulation()
scatter_estimation.set_scatter_simulation_method_sptr(simulator)

scatter_estimation.set_up()
scatter_estimation.process_data()

scatter_estimate = scatter_estimation.get_output()
scatter_estimate.write_to_file("minimal_working_example/scatter_estimate.hs")

robbietuk commented 1 year ago

It isn't a gradient descent algorithm, but it may be considered analogous. The scatter estimation iterations occur in ScatterEstimation.process_data(), in particular in upsample_and_fit_scatter_estimate: https://github.com/UCL/STIR/blob/f979fe3fa409f9b66079443156e40f8c7ddc1af3/src/scatter_buildblock/ScatterEstimation.cxx#L959-L964

This method should be applying an upper and lower bound on scale_factors if they are set.

https://github.com/UCL/STIR/blob/f979fe3fa409f9b66079443156e40f8c7ddc1af3/src/scatter_buildblock/upsample_and_fit_scatter_estimate.cxx#L93-L151

scatter_estimation.set_max_scale_value(1)
scatter_estimation.set_min_scale_value(1)

It appears you have set these to 1, meaning there is no thresholding / low-pass filtering being applied. Leaving these values at their defaults might help reduce the fluctuations.
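
Conceptually, the bound applied to the fitted scale factors is just a clamp. A minimal numpy sketch of the idea (not STIR's actual implementation; the function name is illustrative):

import numpy as np

def clamp_scale_factors(scale_factors, min_scale, max_scale):
    # Restrict each per-sinogram scale factor to [min_scale, max_scale],
    # analogous to the bounds upsample_and_fit_scatter_estimate applies
    # when min/max scale values are set.
    return np.clip(np.asarray(scale_factors), min_scale, max_scale)

With min_scale == max_scale the scale factors are pinned to a single value.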

Additionally, if that doesn't work, please save the terminal output to a file for review.

markus-jehl commented 1 year ago

My understanding is that these just scale the output of the single scatter simulation to match the actually observed scatter. In our case we used simulations to determine the ratio of single scatter to total scatter (close to 1, which is why I used 1 here). If I comment out these two lines, I get more or less the same oscillations:

[image: oscillating scatter estimates]

Logfile is attached. logfile.txt

markus-jehl commented 1 year ago

I tested the single scatter simulation and upsample_and_fit_scatter_estimate, and both give consistent results for both scales of normalisation: the downsampled scatter estimate from the single scatter simulation is exactly 10000 times larger for an SSS with a 10000 times larger emission image and normalisation. And after calling upsample_and_fit_scatter_estimate with either the normalisation around 1 or the normalisation around 10000, both scatter estimates are identical apart from floating point differences.

This points to the bug being in the iterative update steps in the scatter estimation.
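
A sketch of the kind of consistency check described above, assuming the two upsampled scatter estimates were written to file (the filenames are hypothetical) and using STIR's stirextra helper:

import numpy as np
import stir
import stirextra

# Upsampled scatter estimates produced with the small and the 1e4-scaled normalisation.
small = stirextra.to_numpy(stir.ProjData.read_from_file("scatter_norm_small.hs"))
large = stirextra.to_numpy(stir.ProjData.read_from_file("scatter_norm_1e4.hs"))

# After upsample_and_fit_scatter_estimate the two should agree up to floating point noise.
print(np.allclose(small, large, rtol=1e-4))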

markus-jehl commented 1 year ago

The latest suspicion was that it could be related to hardcoded threshold values in this section of the code: https://github.com/UCL/STIR/blob/3b85e01cb2c66c0ec328fe708bfba1c42fe5afc5/src/scatter_buildblock/ScatterEstimation.cxx#L985 I tested this by modifying these values, but it didn't fundamentally change the behaviour. Could it have something to do with switching between 2D and 3D normalisation projdata?

markus-jehl commented 1 year ago

Further investigation showed that the oscillatory behaviour is triggered by having randoms projdata which contain zeroes. If the randoms are set to something homogeneous or something heterogeneous without zeros, then the scatter estimation converges.

However, the scale of the final scatter estimate differs based on the magnitude of the normalisation. Using a homogeneous normalisation makes no difference for convergence or magnitude of converged value.
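
A quick way to check the randoms for zero bins (a sketch using STIR's stirextra helper and the file from the script above):

import numpy as np
import stir
import stirextra

randoms = stirextra.to_numpy(stir.ProjData.read_from_file("minimal_working_example/randoms.hs"))
print("fraction of zero bins:", np.mean(randoms == 0))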

KrisThielemans commented 1 year ago

I still can't make sense of this, sorry. This latest observation makes it even weirder. If anything, I'd have anticipated that with randoms = 0 everything would scale, while without it, it might not.

Did you try with all randoms 0 (and equivalently, not setting any background at all)?

markus-jehl commented 1 year ago

Not yet, let me try that now.

markus-jehl commented 1 year ago

Same result: randoms all zero and an input that contains only trues and scatters still gives oscillations for the large normalisation.

robbietuk commented 1 year ago

There are so many moving parts in the scatter estimation that it might be hard to identify the issue. I wonder: does this oscillating behavior occur between subiterations of an OSEM reconstruction when the randoms are zero and the normalization is scaled up?
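
One way to test this, reusing the objects and files from the script above (a sketch, assuming these setters behave as in the script; not a tested recipe):

# Standalone OSEM on the same data, without the scatter estimation loop.
obj_func = stir.PoissonLogLikelihoodWithLinearModelForMeanAndProjData3DFloat()
obj_func.set_proj_data_sptr(stir.ProjDataInMemory.read_from_file("minimal_working_example/input_proj_data.hs"))
obj_func.set_normalisation_sptr(stir.BinNormalisationFromProjData(stir.ProjDataInMemory.read_from_file("minimal_working_example/large_normalisation.hs")))

recon = stir.OSMAPOSLReconstruction3DFloat()
recon.set_objective_function(obj_func)
recon.set_num_subsets(4)
recon.set_num_subiterations(16)  # several full iterations, to watch for oscillation

recon.set_up(target_image)  # target_image as constructed in the script above
recon.reconstruct(target_image)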

markus-jehl commented 1 year ago

Good question! I just tested it and it converges smoothly as expected.

I suspect the problem is in the scatter estimation and that one of the many member variables is not correctly updated during the iterations. But so far I haven't managed to pinpoint the problem.

KrisThielemans commented 1 year ago

I'd like to see some more detail on the "zero randoms" case (as it is the weirdest, and simplest).

Would you be able to show profiles through the reconstructed images, as well as the low-res and upscaled scatter estimates? We can do this in a meeting.
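
A sketch of such a profile plot (assuming stirextra and matplotlib; the filename is hypothetical):

import matplotlib.pyplot as plt
import stir
import stirextra

img = stirextra.to_numpy(stir.FloatVoxelsOnCartesianGrid.read_from_file("recon.hv"))
z, y, x = (n // 2 for n in img.shape)
plt.plot(img[z, y, :])  # transaxial profile through the centre of the volume
plt.xlabel("voxel index")
plt.ylabel("voxel value")
plt.show()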

markus-jehl commented 1 year ago

The problem turned out to be two places in ScatterEstimation where the initial activity image was reset to 1. Therefore OSMAPOSL never had a chance to converge for large normalisations.
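
In pseudocode, the difference is roughly the following (a sketch, not STIR's actual code; the helper names are hypothetical):

activity = initial_activity_image.clone()
for it in range(num_scatter_iterations):
    # Buggy behaviour: the activity image was reset here, discarding
    # the progress made in all previous scatter iterations:
    # activity.fill(1)
    scatter = simulate_and_fit_scatter(activity)  # SSS + upsample_and_fit
    activity = run_osem(activity, scatter)        # warm start from the previous estimate

With only 4 subiterations per scatter iteration, restarting from a uniform image each time means OSMAPOSL never gets close to convergence.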

robbietuk commented 1 year ago

I assume this is because you are using 4 subsets and 4 subiterations. Such a scheme rarely gives OSEM a chance to reach a "decent" image. As Kris mentioned in https://github.com/UCL/STIR/pull/1160#issuecomment-1433143192, the mMR uses 21 subsets with 21 subiterations. I speculate that increasing the number of subsets and/or the number of subiterations would alleviate this issue.
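
For reference, the corresponding change in the script above would be along these lines (whether 21 subsets is compatible with this scanner's 128 views is an assumption to verify):

recon_type.set_num_subsets(21)
recon_type.set_num_subiterations(21)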

KrisThielemans commented 1 year ago

I assume this is because you are using 4 subsets and 4 subiterations.

probably recommended by me, as I thought it continued...

More updates would decrease the effect of course, but it seems a bit of a waste of time.

Worth plotting the images obtained in each scatter iteration.

NikEfth commented 1 year ago

I keep looking into this because I don't think that removing the initialization of the activity image is the correct way forward. So far I have tried, unsuccessfully, to replicate the issue.

I did the following: a normal scatter estimation with acquired data from a head PET scanner, and then I multiplied the normalization factors by 10^6 with stir_math. Nothing more, really.

I checked the profiles, but here I only give the number of events in the sinograms, as the profiles would not contribute anything to the discussion.

Number of events in each sinogram:

Unscaled scatter sinogram with the default norm factors: 1.70659e+08, 1.51432e+08, 1.31647e+08, 1.30896e+08, 1.30884e+08
Scatter sinogram with the default norm factors: 3.43142e+08, 2.69167e+08, 2.3148e+08, 2.45585e+08, 2.39953e+08
Unscaled scatter sinogram with the scaled norm factors: 7.89108e+12, 6.01817e+12, 5.07285e+12, 5.42789e+12, 5.29831e+12
Scatter sinogram with the scaled norm factors: 3.41746e+08, 2.64443e+08, 2.3147e+08, 2.43826e+08, 2.38875e+08

The unscaled sinogram reflects the change in the units of the activity images; I understand that. But after upsampling etc. the scatter sinograms are very close to those obtained with the original normalization factors. @markus-jehl Did you do anything else to see this behavior? I use the typical cylindrical geometry and 2D estimation.

KrisThielemans commented 1 year ago

@NikEfth, @markus-jehl was running with 4 subiterations (per scatter iteration), I believe. Using only a few subiterations makes the problem more obvious. How many are you running?

Also, I think you scaled the norm factors up, while @markus-jehl scaled them down. That'll make a difference, as you will get an image with large values, while he had small ones (and therefore the convergence rate behaviour due to the background term will be different).

As @robbietuk mentioned, a benefit of course of not re-initialising the activity image is that you don't need that many updates.

@NikEfth I don't know why you don't like the relevant PR, really, but I suggest you try your above experiment with scaling up and down, both with master and with the PR. If everything remains the same, there is no harm in merging 😄

markus-jehl commented 1 year ago

@NikEfth All this commit does is start the reconstruction with an appropriate initial image, thereby saving a bit of time. It shouldn't interfere with anything else. As @KrisThielemans said, I was only using four subiterations in each scatter estimation iteration (under the assumption that the scatter estimate and the activity estimate can converge side by side over the scatter iterations).

I have now pushed a setting to the scatter estimation class, so you can check out my branch and confirm that you get the same result with the setting enabled and disabled. It would be good to get confirmation that it doesn't break any of the code you rely on.