CEA-COSMIC / pysap-mri

MRI external plugin for Python Sparse data Analysis Package

[Enhancement] Fix Generic Cost to actually account for Proximities from variable splitting methods #80

Closed chaithyagr closed 4 years ago

chaithyagr commented 4 years ago

Describe the bug Currently the linear composition operator carries out both the linear operation and its adjoint. This should not be done, as the algorithms generally apply the proximity operator directly to the wavelet-domain coefficients.

https://github.com/CEA-COSMIC/ModOpt/blob/609ecded944b6bfd746106c917d6043219701fa4/modopt/opt/proximity.py#L319-L321

The fix is minor: we just need to remove these extra operations.

Are you planning to submit a Pull Request?

UPDATED

Based on discussion, we plan to fix the Generic Cost Operator here so that it can accept x_new and y_new and calculate the cost correctly in the analysis formulation.

zaccharieramzi commented 4 years ago

@chaithyagr I don't understand what you are saying. The linear composition prox exists to define the prox of functions of the form norm_w_known_prox(psi . x). For this you need the formula implemented there, as stated in the cited paper.

What you replaced in the PR is just for norm_w_known_prox(x).
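For reference, the formula in question is the standard prox of a composition with a semi-orthogonal linear operator. A minimal sketch (illustrative names only; `nu` is the constant in L L^T = nu * Id, and none of this is the ModOpt API):

```python
import numpy as np

def prox_composition(x, L, Lt, prox_nu_f, nu=1.0):
    # prox_{f o L}(x) = x + (1 / nu) * L^T (prox_{nu f}(L x) - L x),
    # valid when L L^T = nu * Id (semi-orthogonal L).
    lx = L(x)
    return x + (1.0 / nu) * Lt(prox_nu_f(lx) - lx)

# Toy check with an orthonormal L (nu = 1): a random orthogonal matrix.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))
soft = lambda z, t=0.1: np.sign(z) * np.maximum(np.abs(z) - t, 0.0)
y = prox_composition(rng.standard_normal(8), lambda v: Q @ v, lambda v: Q.T @ v, soft)
```

This is why both the linear operation and its adjoint appear in the op method: they are part of the formula itself, not an accident of the implementation.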

chaithyagr commented 4 years ago

@zaccharieramzi I understand the point you are making. I completely agree that, based on the given paper, the change I proposed to the operator doesn't make any sense.

However, I am coming from a usability standpoint. Let me use the case of SparseThreshold. SparseThreshold takes in a linear operator and a weight to threshold data, which here are the coefficients in the wavelet domain (assuming the linear operator is a wavelet transform). Now, condatvu calls the prox_op with the coefficients themselves as the argument, so if you look at the implementation of SparseThreshold:

Op Method:

https://github.com/CEA-COSMIC/ModOpt/blob/609ecded944b6bfd746106c917d6043219701fa4/modopt/opt/proximity.py#L157-L159

Cost Method:

https://github.com/CEA-COSMIC/ModOpt/blob/609ecded944b6bfd746106c917d6043219701fa4/modopt/opt/proximity.py#L173

We can see that the linear operator appears only in the cost method, not in the op method. Similarly, my use case for LinearComposition is to use it as OWL(linear_op(image)) or GL(linear_op(image)). However, under the constraint that condatvu calls these methods directly with the coefficients (the linear operator already applied), I am forced into this formulation.
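To make the asymmetry concrete, here is a paraphrased sketch of the behaviour described above (not the exact ModOpt source): op thresholds its input directly, while cost applies the linear operator first:

```python
import numpy as np

class SparseThresholdSketch:
    """Paraphrased sketch of SparseThreshold's op/cost asymmetry."""

    def __init__(self, linear, weights):
        self.linear = linear      # only ever used in the cost
        self.weights = weights

    def op(self, data, extra_factor=1.0):
        # `data` is assumed to already be transform-domain coefficients
        # (what condatvu passes in), so no linear operation is applied.
        level = self.weights * extra_factor
        return np.sign(data) * np.maximum(np.abs(data) - level, 0.0)

    def cost(self, image):
        # The cost receives image-domain data and must map it to the
        # coefficient domain before taking the weighted l1 norm.
        return np.sum(np.abs(self.weights * self.linear.op(image)))
```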

I could use condatvu directly with OWL(coeff) (after all, condatvu sends coefficients); however, in the cost calculation we have an issue, as condatvu sends image-domain data as the input for computing the cost. This is why I could not proceed with https://github.com/CEA-COSMIC/pysap-mri/pull/74, where the tests that call condatvu with the OWL norm were failing in both cases.

Now, the right fix would be one of: 1) Fix SparseThreshold and condatvu and what they send as input to the proximities. ---> In my opinion this creates unwanted extra conversions: a linear operation with the adjoint operator in condatvu, and later the operator and adjoint operator again in the prox op.

or

2) Just make LinearComposition follow the same methodology as SparseThreshold.

Please correct me if I got anything wrong above.

zaccharieramzi commented 4 years ago

I don't understand your point. LinearCompositionProx is currently implemented correctly, by which I mean that it implements the formula from the paper. SparseThreshold is basically the proximity operator of l1_norm(x). For usability in the primal-dual methods (variable splitting), we also allow passing a linear_op, which enables a correct estimation of the associated cost.

However, LinearCompositionProx is generally not meant to be used with variable splitting methods. I would therefore say that you are using it wrong.

I didn't look at OWL or GL, but they should be pretty similar to SparseThreshold, in the sense that they would just use the linear operator in the cost computation (I'm not familiar with either of these, so I can't say for sure, but the behaviour is probably the same).

zaccharieramzi commented 4 years ago

So yes, I see that OWL doesn't accept a linear op; it should, for correct cost computation in the variable splitting cases, as in SparseThreshold. Same for GL. To avoid the issue raised in https://github.com/CEA-COSMIC/ModOpt/issues/74 we should make sure the linear op is optional in those two.

chaithyagr commented 4 years ago

So what you are saying is that basically every prox op we implement needs to carry an extra linear_op, which by default would be the identity. This seems like overkill just to handle the variable splitting methods.

I was under the assumption that the purpose of LinearCompositionProx was to let us use the usual prox_op with variable splitting methods. I certainly agree that, given the paper it implements, it has been implemented correctly.

Now we have the following options: 1) Change all prox_ops to accept a linear_op. This includes OWL, GL, and probably even the elastic net and the k-support norm (not sure about the last two; I haven't gone in depth on their proximities). ----> As I mentioned, this seems bad in my opinion.

2) Implement a separate class, similar to LinearCompositionProx, which takes in a linear_op and a prox op and returns an operator whose cost function is redefined. ------> This was the solution I was proposing.

3) Change the GenericCost function in pysap-mri:

https://github.com/CEA-COSMIC/pysap-mri/blob/c39764bf07c8dbefdb60c4e0be92fa3f96b477e2/mri/optimizers/utils/cost.py#L136

We can accept an extra parameter that says whether we are handling variable splitting (mainly condatvu) and replace the cost function with

gradient_cost(x) + prox_cost(y)

The last solution seems the most general in my opinion. @zaccharieramzi your thoughts?
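A minimal sketch of what option 3 could look like (hypothetical names; optimizer_type, x_new, and y_new follow the discussion above, not the final pysap-mri API):

```python
class GenericCostSketch:
    """Sketch of a cost operator aware of the analysis formulation
    used by variable splitting methods such as condatvu."""

    def __init__(self, gradient_op, prox_op, optimizer_type='forward_backward'):
        self.gradient_op = gradient_op
        self.prox_op = prox_op
        self.optimizer_type = optimizer_type

    def cost(self, x_new, y_new=None):
        if self.optimizer_type == 'condatvu':
            # Analysis formulation: data fidelity on the primal (image)
            # variable, regularisation on the dual (coefficient) variable.
            return self.gradient_op.cost(x_new) + self.prox_op.cost(y_new)
        # Synthesis formulation: a single variable carries both terms.
        return self.gradient_op.cost(x_new) + self.prox_op.cost(x_new)
```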

chaithyagr commented 4 years ago

Renaming the issue, removing the bug tag, and adding the enhancement tag.

zaccharieramzi commented 4 years ago

This seems to be overkill just to handle the cases for variable splitting methods.

However, this is exactly what is done with SparseThreshold; I don't see why it's overkill in that case. I agree that we could also change the way variable splitting methods compute their costs. It would be more sensible in terms of delegation, but the linear_op solution doesn't look that bad to me.

I don't understand what solution 2 is, but yes, the other two are sensible in my opinion. We would first need to understand why this design decision was made, that is: why have the linear_op in the proximity operator class rather than in the cost function computation?

chaithyagr commented 4 years ago

Oh, sorry for the bad writing. By 2 I just mean that we would have an operator existing only to fix the cost call, which surely isn't right, as it wouldn't actually implement any function.

I will go ahead and fix this in the costOp in pysap-mri, and transfer this issue to pysap-mri.

zaccharieramzi commented 4 years ago

Wait, we should first have @sfarrens on this because I want to understand the origin of this design first.

chaithyagr commented 4 years ago

Wait, we should first have @sfarrens on this because I want to understand the origin of this design first.

Oops... my bad, I didn't see this update. I don't want to flip-flop now; let's see what @sfarrens has to say. I will not work on this PR for now.

sfarrens commented 4 years ago

Hi @chaithyagr and @zaccharieramzi,

I am not sure I got what the exact question was, but I can at least try to explain the reasoning behind how ModOpt handles proximity operators.

For any given proximity operator, the op method is intended to perform the operation required by "prox" in a given optimisation algorithm, e.g. for SparseThreshold this returns by default the soft thresholding of whatever is passed to prox() by the algorithm. I think this is pretty clear, and it is of course the primary purpose of these operators.

The cost method, on the other hand, aims to return the contribution of this type of regularisation (e.g. l1, l2, etc.) to the overall cost function. This is a little less obvious, but the idea was that, for a given algorithm, by simply changing the proximity operator you should automatically get the corresponding cost without having to change anything else. The issue is that if the regularisation depends on the linear operator(s) used, then these need to be passed to the proximity operator when the object is defined. It is possible that this framework does not easily extend to all use cases, in which case we can discuss alternatives for handling the cost calculation.
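A toy illustration of that design (hypothetical classes, not ModOpt code): the algorithm only ever calls prox.op() and prox.cost(), so swapping the proximity object swaps both the prox step and the reported cost at once:

```python
import numpy as np

class L1Reg:
    """Minimal prox for lam * ||x||_1 (soft thresholding)."""
    def __init__(self, lam):
        self.lam = lam
    def op(self, x):
        return np.sign(x) * np.maximum(np.abs(x) - self.lam, 0.0)
    def cost(self, x):
        return self.lam * np.sum(np.abs(x))

class L2Reg:
    """Minimal prox for (lam / 2) * ||x||_2^2 (shrinkage)."""
    def __init__(self, lam):
        self.lam = lam
    def op(self, x):
        return x / (1.0 + self.lam)
    def cost(self, x):
        return 0.5 * self.lam * np.sum(x ** 2)

x = np.random.randn(16)
data_fidelity = lambda v: 0.5 * np.sum(v ** 2)   # stand-in gradient cost
for prox in (L1Reg(0.1), L2Reg(0.1)):
    # The cost hook never changes with the regularisation choice.
    print(type(prox).__name__, data_fidelity(x) + prox.cost(x))
```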

chaithyagr commented 4 years ago

Well, I think @zaccharieramzi's point is that (please correct me if I am wrong) we could have the proximities handle only what they are meant to do, i.e. the proximity step and its cost (with no knowledge of the linear operator). It would be up to the algorithm to call the cost and proximity correctly. In my opinion, we are duplicating code by providing the linear_op to both the algorithm and the prox_op. In our case here, I need to fix the generic_cost to handle the costs differently for the analysis formulation.

zaccharieramzi commented 4 years ago

I agree with what @chaithyagr has just said.

sfarrens commented 4 years ago

OK, sounds good to me.

chaithyagr commented 4 years ago

@zaccharieramzi can we say that this is now tackled and close this? Are there any other open points?

zaccharieramzi commented 4 years ago

Well, except for the fact that we still have this args thing, yes, I think it's OK.

chaithyagr commented 4 years ago

Closing.