Open paskino opened 1 year ago
Hi @paskino I had a play with adding an append function to blockdatacontainer, blockoperator and blockfunction. The code can be found here: https://github.com/MargaretDuff/CIL-margaret/tree/append-to-block and the testing file here: https://github.com/MargaretDuff/cil_test_notebooks/blob/master/append_block_operator_object/test_append_block_operator.py.
I know they are not very pretty implementations, but let me know if you think their behaviour is what we are hoping for.
I haven't tried overriding the add function yet; it might not make sense for the BlockOperator, which has shape MxN, and thus we need to be clear which axis we are appending to.
Had a quick chat with @paskino last week. Talked about just appending to the list of containers (BlockDataContainer), the list of operators (BlockOperator) or of functions (BlockFunction). I think the question is: do we want append to output a new object or edit the existing one? Note that numpy append "does not occur in-place: a new array is allocated and filled."
My preference would be to output a new object, similar to the behaviour of numpy
Good question. I think for the use case we have considered so far, i.e. using a for loop to build up a BlockOperator (BlockFunction), it seems more natural to me to modify the existing one, starting from an initially empty BlockOperator (-Function).
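To make the two options concrete, here is a toy sketch contrasting in-place append with numpy-style out-of-place append. These are plain Python stand-ins, not the actual CIL classes; `BlockLike` and `appended` are hypothetical names:

```python
# Toy illustration of the two candidate append semantics (not the CIL API).
# "BlockLike" stands in for BlockDataContainer/BlockOperator/BlockFunction.

class BlockLike:
    def __init__(self, *items):
        self.items = list(items)

    def append(self, item):
        """In-place: modifies this object, returns None (like list.append)."""
        self.items.append(item)

    def appended(self, item):
        """Out-of-place: returns a new object (like numpy.append)."""
        return BlockLike(*self.items, item)

b = BlockLike(1, 2)
b.append(3)            # b.items is now [1, 2, 3]
c = b.appended(4)      # b is unchanged, c.items is [1, 2, 3, 4]
```

The for-loop use case favours the in-place form; numpy familiarity favours the out-of-place one.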
@jakobsj Good point - the case I was considering was explicit PDHG with TV, where we append the TV operator to the list of projection operators. Perhaps if we modify the existing one we should call it something else, to avoid confusion with numpy append?
In the case of SPDHG, we can do something like
Ai = ListOperatorsCT(bi).get() # bi is a list of DataContainers
fi = [L2NormSquared(b=b) for b in bi]
Now, if we want implicit-SPDHG, we can do
f = BlockFunction(*fi)
K = BlockOperator(*Ai)
and if we want explicit-SPDHG, we append first and then wrap them into blocks (functions and operators)
fi.append(alpha*MixedL21Norm())
Ai.append(GradientOperator(ig))
The same applies when we want to use (proximal) gradient algorithms with stochastic estimators. For instance, if we have $$\min_{x} \sum_{i=1}^{n} f_{i}(x) + g(x)$$
In the case of SGD = GD + SGFunction, we can do
fi = [LeastSquares(A=A, b=b) + (1/n)*g for A, b in zip(Ai, bi)]
f = SGFunction(fi)
and in the case of ProxSGD = PGA + SGFunction, we can do
fi = [LeastSquares(A=A, b=b) for A, b in zip(Ai, bi)]
f = SGFunction(fi)
I think most important is to provide a method/function for the user that will generate multiple instances of tomography operators. For CIL this can be done in one line, but for SIRF it is more complicated. The appending and the wrapping is what we do in the MCIR paper and the CIL2
Thanks @epapoutsellis
In the case of SPDHG, we can do something like
Ai = ListOperatorsCT(bi).get() # bi is a list of DataContainers
fi = [L2NormSquared(b=b) for b in bi]
Yes - I had a play with something similar to this in issue #1501. I think from a BlockGeometry we can already extract a projection operator for partitioned data.
Now, if we want implicit-SPDHG, we can do
f = BlockFunction(*fi)
K = BlockOperator(*Ai)
and if we want explicit-SPDHG, we append first and then wrap them into blocks (functions and operators)
fi.append(alpha*MixedL21Norm())
Ai.append(GradientOperator(ig))
Ah ok, in this issue we discuss appending two blocks (functions or operators), which under the hood appends the underlying lists.
I think most important is to provide a method/function for the user that will generate multiple instances of tomography operators. For CIL this can be done in one line, but for SIRF it is more complicated. The appending and the wrapping is what we do in the MCIR paper and the CIL2
I am not sure I completely understand this - could you explain more?
I have changed this so it now appends to the original object rather than producing a new object. For example: in the test shown, both the image and the norm output zeros and the shapes agree.
Similarly, BlockOperator and BlockFunction can be appended to.
The pipeline for the stochastic framework is the following:
1) Start with some acquisition data.
2) Split the data based on a splitting method. It can be a list of DataContainers or a BlockDataContainer.
3) Create a list of functions and a list of operators based on the split data.
4) Based on the choice of the algorithm, we wrap the lists above using some CIL API and give some mathematical meaning to these lists.
For the above steps, we need to consider the differences between CIL and SIRF as well as composite and non-composite optimisation. I have written some code blocks in the overleaf that I share for all the cases and CIL, SIRF.
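The four pipeline steps could be sketched with plain Python stand-ins. The names `split_data` and `finite_sum`, and the closures standing in for operators and functions, are all hypothetical, not CIL API:

```python
# Toy sketch of the four pipeline steps (plain Python, not the CIL classes).

data = list(range(12))                      # 1) some "acquisition data"

def split_data(data, n):                    # 2) split into n subsets
    return [data[i::n] for i in range(n)]

subsets = split_data(data, 3)

# 3) a list of "operators" and a list of "functions", one per subset
#    (closures standing in for ProjectionOperator / L2NormSquared)
operators = [lambda x, s=s: [x * v for v in s] for s in subsets]
functions = [lambda y, s=s: sum((yi - si) ** 2 for yi, si in zip(y, s))
             for s in subsets]

# 4) only at the last step give the lists a mathematical meaning,
#    e.g. a finite sum  F(x) = sum_i f_i(A_i x)
def finite_sum(x):
    return sum(f(A(x)) for f, A in zip(functions, operators))
```

Step 4 is where the choice of algorithm (SPDHG vs. a stochastic gradient method) decides which wrapper the lists get.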
For the case of SPDHG: given the split data, we can create a list of ProjectionOperators or, if you wish, a BlockOperator
Ais = [ProjectionOperator(ig, data.geometry, device="cpu") for data in splitted_data]
# A = BlockOperator(*Ais)
# Split data can be a list of DataContainers or a BlockDataContainer (they are equivalent)
Now, the user will need to choose how to form the optimisation problem for the SPDHG algorithm.
In the case of explicit SPDHG:
Grad = GradientOperator(ig)
Ais.append(Grad)
Now, we need to fix our functions,
fis = [0.5*L2NormSquared(b=data) for data in splitted_data]
Since we are in explicit case, we need one more function for the regulariser.
fis.append(alpha*MixedL21Norm())
The above list of functions does not have any mathematical meaning, but if we wrap them with a BlockFunction it becomes a separable sum, which needs a BlockOperator to form the standard $f(Kx)$ term. For the $G(x)$ we use an IndicatorBox and we are ready to run SPDHG.
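A toy version of this append-then-wrap assembly, with plain Python callables standing in for the CIL operators and functions (all names hypothetical):

```python
# Toy explicit-SPDHG assembly: append a "regulariser" operator/function to
# the data-fidelity lists, then wrap as a separable sum (not the CIL API).

Ais = [lambda x: 2 * x, lambda x: 3 * x]        # data "operators"
fis = [lambda y: (y - 2) ** 2,                  # data-fidelity "functions"
       lambda y: (y - 3) ** 2]

Ais.append(lambda x: x)                         # stand-in "gradient" operator
fis.append(lambda y: abs(y))                    # stand-in "regulariser"

# wrapping: a separable sum over the block K x = (A_0 x, ..., A_n x)
def F_of_Kx(x):
    return sum(f(A(x)) for f, A in zip(fis, Ais))
```

Only after the wrapping step does the list pair acquire the meaning $F(Kx)$.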
In the case of implicit SPDHG:
We run the steps above, but without the appending steps for Ais and fis, and replace $G(x)$ with FGP_TV.
For non-composite optimization, i.e., $f(x) + g(x)$ where the operator is not exposed, we give the list of functions a different meaning. Now, we use (proximal) gradient algorithms, e.g., GD, ISTA, FISTA, and for the above problem we have
fis = [LeastSquares(A = A, b=data, c=0.5) for A, data in zip(Ais, splitted_data)]
Then, for Proximal-SGD:= ISTA + SGFunction, we do
f = SGFunction(fis)
with $g(x)$ = FGP_TV.
But, we can also solve a "similar" problem using a smooth version of the TotalVariation and run SGD (not proximal).
$$\min_{x} \sum_{i=1}^{n}\|A_{i}x - b_{i}\|^{2} + \alpha \|\nabla x\|_{\epsilon}$$
Grad = GradientOperator(ig)
reg = OperatorCompositionFunction((alpha/num_subsets)*SmoothMixedL21Norm(1e-5), Grad)
fi = [OperatorCompositionFunction(L2NormSquared(b=data), A) + reg for data, A in zip(data_split1, Ai)]
sg_func = SGFunction(fi, selection)
sg_func.dask=True
step_size = 1.65e-06
sgd = GD(initial=initial, objective_function=sg_func, step_size=step_size,
         max_iteration=epochs * num_subsets,
         update_objective_interval=num_subsets)
sgd.run(verbose=1, callback=[cb])
If you replace the lists of operators and data above with BlockOperators and BlockDataContainers, the code will continue to work. However, for the syntax here, I do not know the type of
F = L2NormSquared(b=data_p)
Is it a list of L2NormSquared instances, or a BlockFunction, or a SumFunction, or an ApproximateGradientSumFunction?
If we assume that it is a BlockFunction, then we imply that we want to solve a composite optimization problem and are good to go to run SPDHG, explicit or implicit depending on the appending. However, there is a fundamental problem with the CIL functions API: our functions do not have a domain. Our base loss functions act on DataContainers, not on BlockDataContainers. In the above syntax, we imply that F takes values from a product of ImageGeometries, i.e., a Cartesian product, which is a BlockGeometry.
Now, if I do the same for LeastSquares
F = LeastSquares(A=BlockOperator, b=BlockDataContainer)
and assume that it is a BlockFunction, then we cannot use it for proximal gradient type algorithms. We need to do
F = SGFunction(*F.functions)
which means that we go from a BlockFunction (separable sum) to a SumFunction (finite sum). There are other issues about properties of each function, such as smoothness and Lipschitz constants, which I will not mention here.
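The separable-sum versus finite-sum distinction can be sketched with toy callables (plain Python, not the CIL API): a separable sum consumes a block of inputs, one per function, while a finite sum evaluates every function at the same point:

```python
# Toy contrast: BlockFunction-like separable sum vs SumFunction-like
# finite sum (not the CIL classes).

fis = [lambda x: x ** 2, lambda x: 2 * x]

def separable_sum(blocks):        # F(x_1, ..., x_n) = sum_i f_i(x_i)
    return sum(f(x) for f, x in zip(fis, blocks))

def finite_sum(x):                # F(x) = sum_i f_i(x)
    return sum(f(x) for f in fis)
```

Only the finite-sum form makes sense as input to a (proximal) gradient algorithm, which is why the BlockFunction-to-SumFunction conversion above is needed.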
In the case of SIRF, we need to split the actual counts, background and attenuation factors, maybe more; @KrisThielemans knows more. However, SIRF does not have a .geometry attribute, so we cannot use one line of code to generate a list of operators based on the list of data. So I created the following class:
class ConstructListOperatorsBase(ABC):
    def __init__(self, list_data):
        self.list_data = list_data
        self.len_list = len(self.list_data)
        self.operators = []

    @abstractmethod
    def get(self):
        pass

class ListOperatorsPET(ConstructListOperatorsBase):
    def __init__(self, list_data, list_attenuation_factors, image):
        super(ListOperatorsPET, self).__init__(list_data)
        self.list_attenuation_factors = list_attenuation_factors
        self.image = image

    def get(self):
        for i in range(self.len_list):
            acq_model = pet.AcquisitionModelUsingRayTracingMatrix()
            acq_model.set_num_tangential_LORs(10)
            asm_attn = pet.AcquisitionSensitivityModel(self.list_attenuation_factors[i])
            acq_model.set_acquisition_sensitivity(asm_attn)
            acq_model.set_up(self.list_data[i], self.image)
            self.operators.append(acq_model)
        return self.operators
So the user will run Ai = ListOperatorsPET(bi, attfs_i, image).get() and get a list of operators. Now, depending on which algorithm we choose to solve a PET reconstruction problem (with/without a regulariser), we can do
fi = [OperatorCompositionFunction(KullbackLeibler(b=b, eta=bck), A)
      for b, bck, A in zip(bi, bck_i, Ai)]
for non-composite opt and
fi = [KullbackLeibler(b=b, eta=bck) for b, bck in zip(bi, bck_i)] # bi is a list of split data, bck_i is a list of split backgrounds
f = BlockFunction(*fi)
K = BlockOperator(*Ai)
for composite opt.
At least for the Stochastic Framework API, we should give a mathematical meaning to the list of functions only at the last step depending on the choice of algorithm and optimization problem. Note that there are algorithms which combine the two cases mentioned above. For instance, the PD3O algorithm has a smooth term, a composite term and proximable term.
BlockOperators may be a little tricky as they can be put in a matrix shape rather than just a row or column, as happens with BlockDataContainer and BlockFunction.
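To illustrate why a matrix-shaped block needs an explicit axis for append, here is a hypothetical helper on nested lists (not the CIL implementation):

```python
# Toy shape check for appending to a matrix-shaped block: a (rows x cols)
# block can grow by a full row (cols entries) or a full column (rows
# entries), so an append must say which axis it targets.

def append_block(grid, new, axis):
    rows, cols = len(grid), len(grid[0])
    if axis == 0:                      # append a row
        if len(new) != cols:
            raise ValueError(f"row must have {cols} entries")
        return grid + [list(new)]
    if axis == 1:                      # append a column
        if len(new) != rows:
            raise ValueError(f"column must have {rows} entries")
        return [row + [v] for row, v in zip(grid, new)]
    raise ValueError("axis must be 0 or 1")
```

For 1xN (or Nx1) blocks, like BlockDataContainer and BlockFunction, the axis is unambiguous, which is why the question only arises for BlockOperator.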
The use case I had in mind was to exactly simplify the setup of SPDHG.
ProjectionOperator automatically returns a BlockOperator if passed a BlockGeometry. Notice this only works for the ASTRA and TIGRE plugins.
# projection operator automatically returns a BlockOperator
A = ProjectionOperator(ig, ag_blockgeometry)
fis = [0.5*L2NormSquared(b=data) for data in splitted_data]
F = BlockFunction(*fis)
My aim was to remove the list comprehension and have the following loop. It's not shorter, but IMO is clearer.
F = BlockFunction()
for data in splitted_data:
F.append( 0.5 * L2NormSquared(b=data) )
Or, similarly to the ProjectionOperator, L2NormSquared should return a BlockFunction if it recognises that it should, i.e. it is passed b as a BlockDataContainer:
F = 0.5 * L2NormSquared(b=splitted_data)
If the BlockOperator has a matrix shape, e.g., the TGV regulariser, instead of passing a ProjectionOperator you can pass a BlockOperator(list_of_projection_operators). Now the final BlockOperator will look something like
operator = BlockOperator(GradOp,                       -IdentityOp,
                         ZeroOp,                       SymmetrizedGradOp,
                         Block_of_Projection_operator, ZeroOp,
                         shape=(3, 2))
in the case of explicit PDHG.
This A = ProjectionOperator(ig, ag_blockgeometry) is very strange. I understand that it returns a BlockOperator by appending a list of operators, but it hides from the user the A_i's in order to form
$$\min_{x} \sum_{i=1}^{n} F_{i}(A_{i}x) + G(x)$$
Also, for standard PDHG (explicit with TV), we have $$\min_{x} \sum_{i=1}^{2} F_{i}(A_{i}x) + G(x)$$
and we do
A = BlockOperator(ProjectionOperator, Grad)
F = BlockFunction(L2NormSquared, MixedL21Norm)
It may seem similar to the ChannelwiseOperator when num_channels > 1, but in this case you map an image space to a split sinogram space...
IMO, list comprehension is quite clear and basic in Python, whereas introducing a new syntax will confuse users a lot... well, everything depends on the documentation and examples.
In terms of the list of functions, one alternative is to introduce a domain attribute to our functions, None by default, and have BlockFunction inherit from SumFunction. But it would not be very SIRF friendly.
So if you do F = L2NormSquared(domain=None or data.geometry, b=data) we have the standard $$F(x) := \|x-b\|^{2}$$
If F = L2NormSquared(domain=splitted_data_block_geometry, b=splitted_data) we have a separable sum, aka a BlockFunction
$$F(x_{1}, x_{2}, \ldots, x_{n}) := \sum_{i=1}^{n} f_{i}(x_{i}) = \sum_{i=1}^{n} \|x_{i}-b_{i}\|^{2}$$
and we can check/force it to be a BlockFunction because every L2NormSquared function has a proximal method implemented.
If F = LeastSquares(A=BlockOperator(Ais), b=splitted_data) for non-composite optimization, we have a SumFunction, and we can check/force it to be a SumFunction because the LeastSquares function does not have a proximal method implemented:
$$F(x) := \sum_{i=1}^{n} f_{i}(x) = \sum_{i=1}^{n} \|A_{i}x-b_{i}\|^{2}$$
To make it a stochastic function, e.g., an SGFunction, we do SGFunction(F.functions).
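The proposed domain dispatch could be sketched like this. `L2Toy` is a toy class, not the CIL `L2NormSquared`, and the block domain is mocked as a plain list:

```python
# Toy sketch of the proposed `domain` attribute: if the domain is a block
# of geometries, the function behaves as a separable sum; otherwise it is
# the plain scalar function. Hypothetical design, not the CIL API.

class L2Toy:
    def __init__(self, b, domain=None):
        self.b = b
        self.domain = domain          # None, or a list marking a block domain

    def __call__(self, x):
        if isinstance(self.domain, list):   # block domain -> separable sum
            return sum((xi - bi) ** 2 for xi, bi in zip(x, self.b))
        return (x - self.b) ** 2            # standard ||x - b||^2
```

The same `domain` flag could also drive the BlockFunction-vs-SumFunction check described above.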
This will probably work for all functions, not only L2NormSquared.
It may be useful to be able to add a new function to a BlockFunction. Requires append. It may be useful to override the __add__ method? Similarly for BlockOperator.
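A toy sketch of the suggested `__add__` override, returning a new block with the function appended (hypothetical behaviour under discussion, not the current CIL API):

```python
# Toy sketch: block + function appends out-of-place via __add__
# (BlockFunctionToy is a stand-in, not the CIL BlockFunction).

class BlockFunctionToy:
    def __init__(self, *functions):
        self.functions = list(functions)

    def __add__(self, other):
        # out-of-place: return a new block with `other` appended
        return BlockFunctionToy(*self.functions, other)

F = BlockFunctionToy(abs, len)
G = F + max            # F is unchanged; G holds three functions
```

For BlockOperator the same idea would need the axis question resolved first, since `+` gives no way to say whether the operand joins as a row or a column.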