Closed: evgueni-ovtchinnikov closed this 1 month ago
Thanks, this already looks usable (with a caveat for the "accumulation"). Is this in a state that @gschramm can already try?
We still have #1251, which is due to the fact that `PoissonLogLikelihoodWithLinearModelForMeanAndListModeDataWithProjMatrixByBin` does not override the empty `ObjectiveFunction.get_subset_sensitivity()`. I am afraid I cannot be of much help here.
@KrisThielemans @evgueni-ovtchinnikov Just implemented a short test script for `PoissonLogL.accumulate_Hessian_times_input()` here:
```python
# assumes acq_data, ref_recon, obj_fun and nxny are already set up
import numpy as np
import matplotlib.pyplot as plt

current_estimate = ref_recon.copy()
input_img = acq_data.create_uniform_image(value=1, xy=nxny)
np.random.seed(0)
input_img.fill(np.random.rand(*input_img.shape) * (obj_fun.get_subset_sensitivity(0).as_array() > 0) * current_estimate.max())

hess_out_img = acq_data.create_uniform_image(value=1.0, xy=nxny)
obj_fun.accumulate_Hessian_times_input(current_estimate, input_img, subset=0, out=hess_out_img)
hess_out_img2 = obj_fun.accumulate_Hessian_times_input(current_estimate, input_img, subset=0)

fig, ax = plt.subplots(1, 4, figsize=(16, 4), tight_layout=True)
ax[0].imshow(current_estimate.as_array()[71, :, :], cmap='Greys')
ax[1].imshow(input_img.as_array()[71, :, :], cmap='Greys')
ax[2].imshow(hess_out_img.as_array()[71, :, :], cmap='Greys')
ax[3].imshow(hess_out_img2.as_array()[71, :, :], cmap='Greys')
ax[0].set_title('current estimate', fontsize='medium')
ax[1].set_title('input', fontsize='medium')
ax[2].set_title('hess_out_img', fontsize='medium')
ax[3].set_title('hess_out_img2', fontsize='medium')
fig.show()
```
Unfortunately, neither call works (with or without the `out` kwarg): `hess_out_img` is still an image full of ones, and `hess_out_img2` seems to be identical to `input_img`.
These results used:
- STIR commit feb6d85eadb392f5b8278d3b97ae2ee67ca439d9 (HEAD, origin/master, origin/HEAD, master), Author: Kris Thielemans k.thielemans@ucl.ac.uk, Date: Thu May 9 23:39:17 2024 +0100
- SIRF commit 66c35c766694fe7258406b8e85177bfc0e7c9b24 (HEAD, origin/acc-hess), Author: Evgueni Ovtchinnikov evgueni.ovtchinnikov@stfc.ac.uk, Date: Sat May 11 07:35:35 2024 +0000
This is a test with projection data, correct?
Indeed. The `obj_fun` is

```python
obj_fun = sirf.STIR.make_Poisson_loglikelihood(acq_data)
```
@gschramm could you try again?
Looks much better now. Tomorrow morning I will test whether the images are what we expect from the formulas.
Could work for listmode as well
@KrisThielemans I just compared against manually computing the "Hessian multiply" calling the fwd / back projections in SIRF. The result is very close in the foreground (where the current estimate is >> 0), but not super close in the background (where the current estimate is close to 0). Any ideas why that is?
```python
hess_out_img = obj_fun.accumulate_Hessian_times_input(current_estimate, input_img, subset=0)

# %%
# calculate the Hessian multiply "manually"
acq_model.set_up(acq_data, initial_image)
acq_model.num_subsets = num_subsets
acq_model.subset_num = 0

# get the linear (Ax) part of the Ax + b affine acq. model
lin_acq_model = acq_model.get_linear_acquisition_model()
lin_acq_model.num_subsets = num_subsets
lin_acq_model.subset_num = 0

# for the Hessian "multiply" we need the linear part of the acquisition model
# applied to the input image
input_img_fwd = lin_acq_model.forward(input_img)
current_estimate_fwd = acq_model.forward(current_estimate)
h = -acq_model.backward(acq_data*input_img_fwd / (current_estimate_fwd*current_estimate_fwd))
```
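For reference, the formula used in the manual computation can be checked on a toy problem. The sketch below (plain NumPy, not SIRF; all names are made up for illustration) applies the same expression, H(x)·v = -Aᵀ(y · Av / (Ax + b)²) for the Poisson log-likelihood, and verifies it against a finite difference of the gradient:

```python
import numpy as np

# Toy sketch of the Hessian action computed "manually" above:
# for L(x) = sum_i [ y_i*log((Ax+b)_i) - (Ax+b)_i ],
# the gradient is g(x) = A.T @ (y/(Ax+b) - 1) and
# H(x) @ v = -A.T @ (y * (A @ v) / (A @ x + b)**2).
rng = np.random.default_rng(0)
A = rng.random((6, 4))                        # toy system matrix
b = np.full(6, 0.1)                           # strictly positive background
y = rng.poisson(10.0, size=6).astype(float)   # toy measured data
x = rng.random(4) + 1.0                       # current estimate (positive)
v = rng.random(4)                             # input image

ybar = A @ x + b                              # expected data (Ax + b)
Hv = -A.T @ (y * (A @ v) / ybar**2)

# cross-check against a finite difference of the gradient
def grad(x):
    return A.T @ (y / (A @ x + b) - 1.0)

eps = 1e-6
Hv_fd = (grad(x + eps * v) - grad(x)) / eps
assert np.allclose(Hv, Hv_fd, rtol=1e-3, atol=1e-4)
```

With a strictly positive background `b` the denominator never vanishes, which is exactly the situation asked about below.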
The Hessian outputs are shown using a "narrow" color window (top) and a wide color window (bottom) to show the differences in foreground and background.
hmmm. We try to cancel singularities in the division https://github.com/UCL/STIR/blob/feb6d85eadb392f5b8278d3b97ae2ee67ca439d9/src/recon_buildblock/PoissonLogLikelihoodWithLinearModelForMeanAndProjData.cxx#L1128 with STIR's divide_and_truncate function. However, possibly that goes a bit wrong, as it was designed for (measured / estimated), while here it is used for (fwd * measured / estimated^2), so maybe thresholds are a bit off.
Of course, ideally there wouldn't be any singularities. Is your background strictly positive?
that's encouraging.
Can you try your manual computation with some handling of 0/0 or so? A bit surprising it didn't generate NaNs then, I guess. Easiest is to add an eps to the denominator.
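For what it's worth, here is a minimal sketch of one way to handle the 0/0 case in the ratio being back-projected (plain NumPy; `guarded_ratio` is a hypothetical helper, not STIR's `divide_and_truncate`):

```python
import numpy as np

# Guard the division by only dividing where the denominator is safely
# positive, and defining the ratio as 0 elsewhere (so 0/0 -> 0).
def guarded_ratio(numerator, denominator, eps=1e-8):
    out = np.zeros_like(numerator, dtype=float)
    mask = denominator > eps
    out[mask] = numerator[mask] / denominator[mask]
    return out

num = np.array([0.0, 1.0, 2.0])
den = np.array([0.0, 0.0, 4.0])
print(guarded_ratio(num, den))   # [0.  0.  0.5]
```

Masking rather than adding an eps avoids biasing the well-conditioned bins, at the cost of silently zeroing the truly singular ones.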
I'll flip the sign of the LM Hessian. 😊
Adding an eps = 1e-8 in the denominator of the ratio to be back-projected in the manual Hessian does not change much; the result is still much closer to the applied LM Hessian.
ah well, this looks like a STIR issue then for the Hessian with projection data. Best to open an issue there.
@evgueni-ovtchinnikov can you then add `multiply_with_Hessian`? (Can just as well leave the accumulate version in, I guess.)
@gschramm what computation times are you getting?
> multiply_with_Hessian

@KrisThielemans which version/branch of STIR should I use? I have rel_6.0.0 on my latest VMs.
> @gschramm what computation times are you getting?
Using 21 subsets, an mMR LM file with 4,332,816 counts, and `acq_model = sirf.STIR.AcquisitionModelUsingRayTracingMatrix()` with 1 tangential LOR:

```
In [5]: %timeit hess_out_img = obj_fun.accumulate_Hessian_times_input(current_estimate, input_img, subset=0)
2.7 s ± 45.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [6]: %timeit hess_out_img_lm = lm_obj_fun.accumulate_Hessian_times_input(current_estimate, input_img, subset=0)
9.43 s ± 249 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```
> multiply_with_Hessian: @KrisThielemans which version/branch of STIR should I use? I have rel_6.0.0 on my latest VMs.
You can use 6.0.0. Just add an extra member (that calls `accumulate`) and remove the `fill` in `accumulate`.
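To illustrate the distinction with a toy mock (not the actual SIRF/STIR classes; the names are chosen to mirror the discussion): the multiply version just starts from a zero-filled output and delegates to the accumulate version, which adds onto whatever is already in `out`:

```python
import numpy as np

class ToyObjective:
    """Toy stand-in for an objective function with a Hessian action."""
    def __init__(self, H):
        self.H = H  # toy Hessian matrix standing in for the operator

    def accumulate_Hessian_times_input(self, x, v, out):
        out += self.H @ v               # accumulates H*v onto the existing out
        return out

    def multiply_with_Hessian(self, x, v):
        out = np.zeros(self.H.shape[0]) # fresh zeros, then accumulate
        return self.accumulate_Hessian_times_input(x, v, out)

obj = ToyObjective(np.diag([1.0, 2.0, 3.0]))
v = np.ones(3)
out = np.full(3, 10.0)
obj.accumulate_Hessian_times_input(None, v, out)
print(out)                              # [11. 12. 13.]  (10 + H*v)
print(obj.multiply_with_Hessian(None, v))  # [1. 2. 3.]
```

This also shows why the `fill` inside the accumulate path has to go: with it, "accumulate" would silently behave like "multiply".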
Thanks @gschramm, would you mind adding the timings in the STIR PR? Ideally also add gradient timings. I'd also be interested in seeing what happens as `num_subsets` decreases (i.e. are the listmode calculations slowing us down, or the reading?). Of course, you could also enable the `lm_obj_fun` caching, but this is going beyond what you care about. Thanks a lot for all your help.
(1) Will add all timings to the PR in a second. Does it make sense to test with `sirf.STIR.AcquisitionModelUsingParallelproj()`? I remember you mentioned that this is not supported yet for LM?
(2) Is it possible to save the computed sensitivity images to disk, to speed up the run time if users want to execute the script multiple times?
(1) No parallelproj yet. There's a difficulty that with the proj-matrix, I re-use the matrix row for all calculations. That won't be possible with parallelproj, so we'll need new functions. Not sure if I can still do that.
(2) In STIR, we can, but it looks like this isn't enabled yet in SIRF. @evgueni-ovtchinnikov do we expose `PoissonLogLikelihoodWithLinearModelForMean::set_subset_sensitivity_filenames`? It should call
```cpp
//! set filename pattern to read (or write) the subset sensitivities
/*! Set to a zero-length string to avoid reading/writing a file.
    Could be e.g. "subsens_%d.hv".
    boost::format is used with the pattern (which means you can use it like sprintf).
    Calls error() if the pattern is invalid.
*/
void set_subsensitivity_filenames(const std::string&);
```
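As a toy illustration of how such a sprintf-style pattern expands to one filename per subset (plain Python standing in for boost::format; the variable names are made up):

```python
# The pattern is used like sprintf: %d is replaced by the subset index,
# producing one sensitivity file per subset.
pattern = "subsens_%d.hv"
num_subsets = 3
filenames = [pattern % s for s in range(num_subsets)]
print(filenames)   # ['subsens_0.hv', 'subsens_1.hv', 'subsens_2.hv']
```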
I see https://github.com/SyneRBI/SIRF/blob/1d53570cff96acb33734ac1da15b94bc43373d92/src/xSTIR/cSTIR/cstir_p.cpp#L933-L934 but https://github.com/SyneRBI/SIRF/blob/1d53570cff96acb33734ac1da15b94bc43373d92/src/xSTIR/cSTIR/cstir_p.cpp#L933. In any case, we need the subset version ATM.
If we have this, the usage is a bit awkward: first run as normal, then use `set_recompute_sensitivity(false)`, in which case it will go and find them.
@KrisThielemans Let's leave the LM acq. model with Parallelproj for later.
Timing results of sino/LM gradients and Hessian multiplies for different numbers of subsets. mMR LM file with 4,332,816 counts, `acq_model = sirf.STIR.AcquisitionModelUsingRayTracingMatrix()` with 1 tangential LOR, and `sirf.STIR.AcquisitionData.set_storage_scheme("memory")`.
7 subsets

```
In [14]: %timeit g = obj_fun.gradient(initial_image, subset=0)
4.19 s ± 36.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [15]: %timeit hess_out_img = obj_fun.accumulate_Hessian_times_input(current_estimate, input_img, subset=0)
4.35 s ± 85.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [18]: %timeit g = lm_obj_fun.gradient(initial_image, subset=0)
4.18 s ± 13.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [19]: %timeit hess_out_img_lm = lm_obj_fun.accumulate_Hessian_times_input(current_estimate, input_img, subset=0)
4.13 s ± 46.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```

14 subsets

```
In [22]: %timeit g = obj_fun.gradient(initial_image, subset=0)
2.75 s ± 45.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [23]: %timeit hess_out_img = obj_fun.accumulate_Hessian_times_input(current_estimate, input_img, subset=0)
3.07 s ± 39 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [26]: %timeit g = lm_obj_fun.gradient(initial_image, subset=0)
4.67 s ± 112 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [27]: %timeit hess_out_img_lm = lm_obj_fun.accumulate_Hessian_times_input(current_estimate, input_img, subset=0)
4.89 s ± 68.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```

21 subsets

```
In [32]: %timeit g = obj_fun.gradient(initial_image, subset=0)
2.35 s ± 37.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [33]: %timeit hess_out_img = obj_fun.accumulate_Hessian_times_input(current_estimate, input_img, subset=0)
2.69 s ± 58 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [34]: %timeit g = lm_obj_fun.gradient(initial_image, subset=0)
6.35 s ± 199 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [35]: %timeit g = lm_obj_fun.gradient(initial_image, subset=0)
6.14 s ± 53.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```
CPU info

```
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
Address sizes:       43 bits physical, 48 bits virtual
CPU(s):              256
On-line CPU(s) list: 0-255
Thread(s) per core:  2
Core(s) per socket:  64
Socket(s):           2
NUMA node(s):        2
Vendor ID:           AuthenticAMD
CPU family:          23
Model:               49
Model name:          AMD EPYC 7662 64-Core Processor
Stepping:            0
Frequency boost:     enabled
CPU MHz:             1499.555
CPU max MHz:         2000.0000
CPU min MHz:         1500.0000
BogoMIPS:            4000.00
Virtualization:      AMD-V
L1d cache:           4 MiB
L1i cache:           4 MiB
L2 cache:            64 MiB
L3 cache:            512 MiB
NUMA node0 CPU(s):   0-63,128-191
NUMA node1 CPU(s):   64-127,192-255
```
@KrisThielemans @evgueni-ovtchinnikov is `accumulate_Hessian_times_input` the final name? (I remember discussions on the naming.) If possible, I'd like to finish my notebooks today.
@evgueni-ovtchinnikov is working on implementing https://github.com/SyneRBI/SIRF/issues/1244#issuecomment-2103346336. Hopefully this will be done soon. @evgueni-ovtchinnikov if you have trouble, please ask for help.
@evgueni-ovtchinnikov here's the STIR test I talked about. https://github.com/UCL/STIR/blob/90cff236be35628f447ded4d98a046d5c4e3b316/src/include/stir/recon_buildblock/test/ObjectiveFunctionTests.h#L169 Best to implement the name first though.
@evgueni-ovtchinnikov you seem to be pretty close. Can you summarise current situation?
@KrisThielemans @gschramm I believe this PR can be merged - any objections?
Builds failed at the Coveralls step. Any idea why, all of a sudden?
Also, please comment out the Coveralls lines https://github.com/SyneRBI/SIRF/blob/d54c4550d1533132c6a281b6cf2159c8fd22f45c/.github/workflows/build-test.yml#L133C4-L147C32. You probably have to preserve indentation. Or leave that for a separate PR.
@evgueni-ovtchinnikov `test_Hessian` isn't a great test, as it just compares the norms, while it should compare the difference. I'm modifying it and putting it in `test_ObjectiveFunction.py`:
```python
def test_Hessian(self, subset=-1, eps=1e-3):
    """Checks that grad(x + dx) - grad(x) is close to H(x)*dx"""
    x = self.image
    dx = x.clone()
    dx *= eps / dx.norm()
    dx += eps / 2
    y = x + dx
    gx = self.obj_fun.gradient(x, subset)
    gy = self.obj_fun.gradient(y, subset)
    dg = gy - gx
    Hdx = self.obj_fun.multiply_with_Hessian(x, dx, subset)
    q = (dg - Hdx).norm() / dg.norm()
    print('norm of grad(x + dx) - grad(x): %f' % dg.norm())
    print('norm of H(x)*dx: %f' % Hdx.norm())
    print('relative difference: %f' % q)
    # q is only approximately zero for a finite eps, so compare
    # against a tolerance rather than testing exact equality
    numpy.testing.assert_almost_equal(q, 0, decimal=3)
```
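A toy illustration of why comparing only the norms is a weak test: two very different vectors can share the same norm, while the relative difference of their gap exposes the mismatch (plain NumPy, made-up numbers):

```python
import numpy as np

dg = np.array([1.0, 0.0])
Hdx = np.array([0.0, 1.0])   # same norm, completely different direction

norm_gap = abs(np.linalg.norm(dg) - np.linalg.norm(Hdx))  # 0.0 -> norm test "passes"
rel_diff = np.linalg.norm(dg - Hdx) / np.linalg.norm(dg)  # sqrt(2) -> difference test fails
print(norm_gap, rel_diff)
```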
However, I had to add a constant term to the acq_model as otherwise the test fails (essentially because the Hessian is ill-defined otherwise). I'll commit this soon.
@evgueni-ovtchinnikov I'm now adding a similar test to the priors. (Had to delete a `subset` in the call.) However, I'm confused by the results. Does `get_gradient(x)` modify the result? Looks like it for priors.
@KrisThielemans

> Does get_gradient(x) modify the result? Looks like it for priors.

what result?
Got distracted... Sorry. I did push the test for the LogLikelihood. Works for me.
@KrisThielemans

> Does get_gradient(x) modify the result? Looks like it for priors.

what result?
ignore that. The loop in the test is a bit hard to understand... Update coming soon.
ok. Should be all done now. Can you add a line to CHANGES.md and merge?
Related issues
Fixes #1244, #1249