JuliaFirstOrder / ProximalOperators.jl

Proximal operators for nonsmooth optimization in Julia
Other
131 stars 24 forks source link

An immutable type ProximalPoint #16

Open lostella opened 7 years ago

lostella commented 7 years ago

This is just a design idea, but I kinda like it.

From the beginning of ProximalOperators, we had prox! and prox returning both the proximal point y and the value of f at y. This has several advantages: essentially, in many cases evaluating the prox of a function entails almost-computing the function value at the resulting point. Such value may be used, for example, in a stopping criterion, or may be of any other use for an algorithm. A notable example for this is the nuclear norm: once you have soft-thresholded the singular values, you better store their sum somewhere: otherwise evaluating f at the resulting point will be another SVD - quite expensive.

Actually, right now prox! only returns the function value, since the proximal point is written to an array provided by the user.

On the other hand, I can see the advantage of having the proximal point as the only returned argument. It's a bit more like it is in paper: you ask for the proximal point, you get the proximal point.

One way I can imagine to obviate this problem is to have an Array-like type, encapsulating the function value along with the proximal point. It should be immutable by law (point & function value shall be together until the last bit of their lives, in the name of provable correctness).

immutable ProximalPoint <: AbstractArray
  f::ProximableFunction
  point::AbstractArray
  value::Real
end

An object of this type shall be created upon prox evaluation. Then the call method would be:

function (f::ProximableFunction)(x::ProximalPoint)
  if object_id(f) == object_id(x.f)
    return x.value
  else
    return f(x.point) # this is already defined for all functions
  end
end

This way one can decide to do:

y = prox(f, x, gamma)
v = f(y) # this doesn't compute anything

Would this complicate things? That is, would this require us to (trivially) redefine several other things for the new ProximalPoint type? Things that, when called on x::ProximalPoint, should be re-called with argument x.point instead. It would be nice if it were possible to declare: whatever acts on AbstractArray acts on ProximalPoint by looking at its point field (unless otherwise specified: thank you, multiple dispatch).

mfalt commented 7 years ago

I think this is a very interesting idea, but I think we have to think it through really carefully. I think there are a couple of trade-offs regarding how much data should be saved to be efficient on multiple calls versus being efficient on a single call. Do you think this should be the output even for prox!? This could potentially lead to a lot of extra allocations, for example in

for i = 1:100000
   prox!(y,f,x)
   x = 0.5*x .+ 0.5*y
end

we would allocate one point::AbstractArray in every call. Unless we let this element refer to the vector y, in which case the user might inadvertently change the elements inside the ProximalPoint. (Immutable only means that it will reference the same array, the elements in the array might still change).

As a side-note: We should probably not let the definition be

immutable ProximalPoint <: AbstractArray
  f::ProximableFunction
  point::AbstractArray
  value::Real
end

but rather

immutable ProximalPoint{F<:ProximableFunction, P<:AbstractArray, V<:Real}
  f::F
  point::P
  value::V
end

I created a related issue #23 for the existing problems of this type.