Why no getindex() ? - Githubissues

JuliaLinearAlgebra / LinearMaps.jl

A Julia package for defining and working with linear maps, also known as linear transformations or linear operators acting on vectors. The only requirement for a LinearMap is that it can act on a vector (by multiplication) efficiently.

Other

303 stars 42 forks source link

Why no getindex() ? #38

Closed JeffFessler closed 2 years ago

JeffFessler commented 5 years ago

Why is getindex not supported (quite explicitly in the README) for LinearMaps?

It seems like it would be pretty easy to provide the functionality as a convenience to users even if it would be somewhat inefficient. For example here is a rough start on A[i,j]

function getindex(A::LinearMap, i::Int, j::Int) # for A[i,j]
  e = zeros(size(A,2)); e[j] = 1
  tmp = A * e
  return tmp[i]
end

I am less sure how getindex works for A[:,j] which is what I really want. Of course I could do this type of overloading myself locally, but why not benefit all users? I've done this in Matlab by the way so I expected it to be even easier in Julia!

dkarrasch commented 5 years ago

Isn't A[:,j] simply the image of your e under A, A*e? In some sense, it's a special case of *(::LinearMap,::Vector). I wouldn't quite sell it as getting values at indices of an array (getindex), but as the result of multiplication.

JeffFessler commented 5 years ago

Yes A[:,j] is simply A * e but I don't know how to overload that case, due to the Colon, and I don't understand why that case is not already supported by LinearMaps! Here is a rough attempt that I am sure is incorrect:

function getindex(A::LinearMap, ":", j::Int) # for A[:,j]
  e = zeros(size(A,2)); e[j] = 1
  return A * e
 end

My real goal here is that I want LinearMaps to be a subtype of AbstractMatrix (which requires getindex to be supported) so that I can have functions like function lscg(A::AbstractMatrix, y:AbstractVector, x:AbstractVector) that can be debugged using an ordinary matrix and then later run using a LinearMap. I realize I could use Union{LinearMaps,AbstractMatrix} but that is ugly and I am confident that it is possible to make LinearMaps general enough to be a subtype of AbstractMatrix because I have done similar things in Matlab.

Using multiplication as a default means to provide getindex functionality seems consistent with the philosophy here that LinearMaps are just multipliers.

I also have an idea about setindex but that's more radical so let's first see if we can resolve the question of why getindex is explicitly not supported here, or whether it is just waiting for someone like me to ask for it and to offer to help implement it...

dkarrasch commented 5 years ago

First, there is a Colon type, so your function could read

function getindex(A::LinearMap, ::Colon, j::Int)

if I'm not mistaken.

and I don't understand why that case is not already supported by LinearMaps!

I don't think this is a fair statement. After all, all you're using is multiplication with a very special vector, which is supported by LinearMaps, it's its main purpose.

More generally, I have a very hard time understanding what you're trying to do. If you want everything to work like with a matrix, then why not work with matrices? You can convert any LinearMap into a matrix, and supply that to your function, which you can debug as usual. Which functionality of LinearMaps are you interested in? "All" we provide is linear map-vector multiplication for linear maps which are not given by matrices. If you want it to be like a matrix, then what is left from LinearMaps that attracts your interest? I'm asking this even more so because of your interest in setindex!, which puzzles me even more. How would you modify a linear transformation given by a program (think of cumsum which we use in the tests, or Fourier transformation, convolutions, etc.) by changing a number in its matrix representation?

I think one of the main reasons why there is no getindex(A::LinearMap, i, j) is because it is very inefficient: you need to multiply A by a vector e, which, in the intended use cases here, is expected to be already so expensive, that matrix-matrix-multiplication is considered as prohibitively expensive. From the resulting (supposedly huge) vector object you throw away all but a single number. That doesn't sound efficient. I guess one could include the colon-functionality as a shorthand for A*e, which is done analogously in the Matrix-conversion for all e, to avoid first computing the whole matrix representation of A and then throw away everything but a single column. Again, from an efficiency point of view this perhaps is not very good, because that would allocate a new vector, where in most performance-critical application cases you would do the multiplication by an in-place mul! to avoid (repeated) allocation. From my understanding, getindex is meant to be allocating even in the Matrix-case.

BTW, for an implementation of matrix-free iterative "matrix" algorithms see the packages listed at the end of our README.

Jutho commented 5 years ago

@JeffFessler , what is lscg. Is that an abbreviation for least square conjugate gradient. You should consider using ducktyping, i.e. not specifying the type of A at all, instead of strange unions. You don't gain anything. A LinearMap is exactly not an AbstractMatrix, because indeed the AbstractArray type holds some premise that there is a reasonably efficient getindex. A LinearMap holds a weaker premise, namely that it can be multiplied with a vector, that's it. As @dkarrasch mentioned, for debugging purposes you can always convert your LinearMap to a Matrix, and when all debugging is done and you have some iterative code for linear systems or eigenvalues or any other application where you only need matrix vector multiplications, put back the LinearMap and see that you obtain the same result. But this then also allows you to represent matrices which you could not represent as a Matrix (because it would cost too much memory and be hugely inefficient).

JeffFessler commented 5 years ago

@Jutho and @dkarrasch thanks for the fast replies.

I especially appreciate the explanation about Colon because that let me finish the draft code I have been preparing for your consideration. I think I got things off on the wrong foot here with my initial "Why not" question. I should have started by saying that I am delighted to find your LinearMaps package here, because I spent years refining my own similar version in Matlab. I am so glad I don't have to start from scratch in Julia!

Based on my Matlab experience, having built-in indexing ability like A[:,j] is very convenient for users in a LinearMap type of package. Sure a user could do that themselves by multiplying by a unit vector, but isn't the point of a high-level language is to make it convenient for users, and to make operations look like the math? The urge to write the matrix-like syntaxA*x instead of calling f(x) is the impetus behind LinearMaps in the first place, right? So it is natural to offer other matrix-like syntax like A[:,j]as well, perhaps with caveats to the user that many such indexing operations are inefficient. My reading of the AbstractMatrix specification does not require that getindex() be efficient, only that it be available.

I am going to make a fork with a getindex.jl file for your consideration. If you find it to be a useful addition then that would be great and I'll next share with you my (more radical and pretty inefficient, but possible) setindex.jl code, so that we can complete the steps towards satisfying LinearMap <: AbstractArray

Thanks!

In case you are curious about the Matlab version, the high-level link is here: http://web.eecs.umich.edu/~fessler/code/index.html and the specific code is here http://web.eecs.umich.edu/~fessler/irt/irt/systems/@fatrix2/ and the documentation describing philosophy very similar to that of LinearMaps is here: http://web.eecs.umich.edu/~fessler/irt/irt/doc/doc.pdf

BTW, yes cgls is CG LS and that is just an example. @Jutho I am currently resorting to ducktyping but I would prefer to be explicit about argument types for functions that have many arguments. You say I "gain nothing" but I feel that clarity of purpose is gained by specifying the required variable type, and thereby preventing users from passing a wrong type object (like a 3D Array) to a function requires a matrix-type object.

dkarrasch commented 5 years ago

@JeffFessler I'm afraid I still haven't quite understood what's the actual purpose of getindex. For LinearMaps of small size, you can convert it to a Matrix and work with it as you're used to. For large LinearMaps, you may not wish to do that to not flood your memory. But for these cases, any getindex will be slow either when compared to usual reading out of memory. Any generic code that builds on it and intends to operate on LinearMaps will be likely horribly slow. Except for "convenience", I couldn't extract a concrete use case so far.

Maybe, there is a difference in intentions between your MATLAB package and @Jutho's LinearMaps.jl? As for this package here, I think the intention is to view a LinearMap as a single entity, that has properties such as size, symmetry, definiteness etc., an action on vectors and rules how to interact with other entities on the same level of abstraction. In fact, multiplying a LinearMap with a Matrix A turns A into a WrappedMap and interprets the result as a CompositeMap, thereby lifting A from a collection of entities, an array of numbers, to an individual entity, a LinearMap. My understanding of this package (I joined much later than when @Jutho developed this package) is that complete Matrix functionality like getindex and setindex! is not provided deliberately, to focus on the absolute minimum of properties that are of interest for a linear map and to not tempt anyone to write foreseeably inefficient code. Clearly, this is not in conflict with developing, for instance, efficient iterative "matrix" methods like solvers, eigensolvers, SVDs, as can be seen from the listed packages at the bottom of our README, which (almost) completely cooperate with LinearMaps.jl as is (a few methods in IterativeSolvers.jl are not (yet) LinearMaps.jl ready, but most are).

@JeffFessler could you try to be more specific about what exactly you're targetting at? Why exactly do you need LinearMaps to be a subtype of AbstractArrays? Conversely, for your CG method, what is wrong with the approach @Jutho and I suggested above:

leave the A argument unspecified in the function signature, but assume it's a matrix for the time being and test/debug it with matrices;
if you have linear maps which are inherently given as functions, convert it to a Matrix or, in case that's relevant, a SparseMatrixCSC;
make sure that the actual algorithm only uses "matrix"-vector multiplication (* or mul!) and vector manipulations, queries on size or symmetry, and whatever else is already provided.

The resulting method should then apply to LinearMaps without extra effort.

JeffFessler commented 5 years ago

Thanks for the reply.

I understand the reluctance about making LinearMaps a subtype of AbstractArray - I don't think I can convince you about that. You either like duck typing or you don't. If you try to pass a 3D array to the svd command you get an error from the caller that it is unsupported. If svd were written with duck typing, like you are recommending for cgls, then you'd get some internal error that may be harder for the user to understand. I prefer the outer error for feedback to users...

I don't understand the reluctance to provide the convenience of getindex support, especially A[:,] because implementing that takes fewer characters to type than the text in most of the replies in this thread! And I've already provided the code in a pull request, with tests.

Support for getindex functionality is not uncommon in linear operator tools!

The Matlab "SPOT" package has both subsref and subsasgn [https://github.com/mpf/spot]()

And the Julia package LinearOperators - an alternative to LinearMaps https://github.com/JuliaSmoothOptimizers/LinearOperators.jl also has a version of getindex showing another example of why this suggestion is reasonable. I don't like the way it is done in LinearOperators so that is why I am pursuing this with LinearMaps. Thanks for your consideration.

Jutho commented 5 years ago

When I first started using Julia, now several years ago (Julia 0.3 or maybe even 0.2), I also assumed that any linear map should be a subtype of AbstractMatrix; you might even find comments on old github issues of Julia about this. However, I was then thought that this does not need to be. AbstractMatrix is indeed an alias for AbstractArray{T,2}, and AbstractArray is a type hierarchy for container-like objects, i.e. an efficient getindex is essentially the defining property of an AbstractArray. More generally, most type hierarchies are about specific data structures for representing something, not about the mathematical (or other structure) of what they represent. It is often not a good or feasible idea to try to use the type system to encode mathematical structure, for various reasons, lack of multiple inheritance for example, but foremost the fact that there are often various fundamentally different ways to represent a certain mathematical object. I don't think you want to represent a linear map on a finite-dimensional vector space in the same way as e.g. a linear operator such as a derivative on a function space, ...

Returning to the specific proposal, I still don't understand the specific use case. If you want to apply an algorithm that requires getindex, convert the linear map to a Matrix or SparseMatrixCSC. It will be much more efficient, since at least you will be sure that it will create every column A[:,i] just once, i.e. it will apply the linear map no more than size(A,2) times. If the Matrix is too costly to be created, then the algorithm that you want to apply (e.g. svd, qr, inv) will certainly not work.

If the algorithm that you want to apply does not need an efficient getindex but say, only multiplications with an AbstractVector, then don't restrict the type of the linear map argument. As said above, the type system is not about mathematical structure and there is no common super type for all objects that behave as a linear map. Any user/developer can have his own use case for representing a linear map in a specific way, and if you develop an algorithm that leaves the linear map argument unspecified, it can immediately be combined with that other user's linear map like object that you did not foresee.

Similarly, in Julia, there is no specific supertype for e.g. iterators, because many objects behave as iterators and represent them in various ways. The type hierarchy is often about the specific representation. And all methods that accept iterators just accept any type. It's up to the user not to insert a non-iterable object into a method that expects an iterator. If you do, you will likely get a MethodError that your argument does not support Base.iterate. And likewise, I think it is good to get a MethodError that a linear map does not support getindex. If you want to use an algorithm that really needs getindex, converting to a Matrix should be the way to go.

In fact, the reason that I personally moved away from using LinearMaps.jl myself, is because it did not go far enough. Not only do I want my linear maps to be completely generic, I also do not want to restrict to AbstractVector for my vector types, when using iterative algorithms such as Krylov methods. That's why I went on to create KrylovKit.jl, where linear maps are arbitrary functions, and vectors are also arbitrary types that support a given interface (e.g. LinearAlgebra.dot, LinearAlgebra.norm, scalar multiplication and addition, ...).

JeffFessler commented 5 years ago

I've provided url's for 3 distinct packages that all provide the convenience of getindex and you remain unconvinced so I will not try to persuade you further.

You've made an key observation that "there is no common super type for all objects that behave as a linear map." Perhaps that is what I am really missing here. I wish Julia had an AbstractLinearOp as a supertype (with a much more limited set of requirements than AbstractArray has, where things like Matrix and LinearMap could then be subtypes.

Jutho commented 5 years ago

I wish Julia had an AbstractLinearOp as a supertype (with a much more limited set of requirements than AbstractArray has, where things like Matrix and LinearMap could then be subtypes.

That's an interesting proposal, but how would it work. How would you only make AbstractArray{T,2} a subtype but not AbstractArray{T,N !=2}? And what if I have a different problem where the matrix does not actually represent a linear map, but a state (e.g. a quantum mechanical density matrix), and the "linear map" acting on this density matrix is a completely positive superoperator. That is the problem with trying to use the type system for representing mathematical structure.

I think what you are looking for is the concept of an interface or traits system, which is independent from type hierarchies, exists in many languages, and has been discussed in Julia, but so far has not been formalized or acquired syntax. There are packages for prototyping and experimenting with this, e.g. https://github.com/andyferris/Traitor.jl

dkarrasch commented 5 years ago

@JeffFessler If you don't like the duck-typing (which is used in Arpack.jl, TSVD.jl, and IterativeSolvers.jl), you can specialize your argument to Union{LinearMap,AbstractMatrix} without performance loss. The function gets compiled for the concrete type of the arguments when first called. Note that, for instance, both LinearMap and AbstractMatrix are abstract types, and are "concretized" at compilation anyway.

JeffFessler commented 5 years ago

Just in case anyone else lands here looking for getindex, I've implemented it in this toolbox: https://github.com/JeffFessler/MIRT.jl/blob/master/src/system/linear-map/getindex.jl

JeffFessler commented 5 years ago

In case anyone else comes looking for getindex I have now made a small stand-alone overlay to LinearMaps that provides both getindex and setindex! and is a subtype of AbstractArray. https://github.com/JeffFessler/LinearMapsAA.jl I thank you both for your suggestions that helped me figure out how to do it.

bonevbs commented 3 years ago

@dkarrasch If I may add to the discussion: Having the option of a custom getindex could be very useful. Oftentimes you might have custom methods for computing individual entries and you might want to use that alongside your efficient methods for multiplication. Fast multipole method comes to mind. If you are curious I have such a use case with a custom type written and used here: https://github.com/bonevbs/HssMatrices.jl/blob/main/src/linearmap.jl (I just found out about your package so I am aware that the type names collide)

dkarrasch commented 3 years ago

I think it is perfectly legitimate to define getindex methods for your own LinearMap subtype, for which you know the context and the cost of component evaluation. And I think you take the right approach, i.e., have a thin wrapper. I'd do it roughly like this:

struct IndexableLinearMap{T,A<:LinearMap,F} <: LinearMap{T}
    lmap::A
    getidxs::F
end

# redirect most functions to the lmap field
adjoint(A::IndexableLinearMap) = IndexableLinearMap(adjoint(A), (i,j) -> conj(A.getidxs(j,i)))

mul!(y, A::IndexableLinearMap, x) = mul!(y, A.lmap, x)

# all your getindex-overloads for IndexableLinearMap

In fact, this might be a feature for the package, because it gives indexability but at the same time forces the user to take responsability of it. The big danger is that if getindex works silently generically, this may come with huge performance penalties as a surprise.

bonevbs commented 3 years ago

I agree, this is probably the best way to do it. I only recently realised the dangers of Julia's type system being the silent use of fall-back routines, at the cost of performance.

JeffFessler commented 3 years ago

What about having a getindex method be an optional argument when creating a LinearMap in the first place? In the same way that an adjoint method is an optional method argument because not all use cases need it. This approach would have the benefit of forcing the user to take responsibility, but also simplifying the process for the user because no "thin" wrapper would be needed.

Jutho commented 3 years ago

If I recall correctly, constant propagation also works for keyword arguments in Julia 1.x for some lower bound on x, so this could be possible while still having a dedicated IndexableLinearMap type and without creating type instabilities.

dkarrasch commented 3 years ago

I think I have a draft here. Will put it out tomorrow.

dkarrasch commented 3 years ago

Reviews and comments welcome in #145.