Closed JeffFessler closed 2 years ago
Isn't A[:,j]
simply the image of your e
under A
, A*e
? In some sense, it's a special case of *(::LinearMap,::Vector)
. I wouldn't quite sell it as getting values at indices of an array (getindex
), but as the result of multiplication.
Yes A[:,j]
is simply A * e
but I don't know how to overload that case, due to the Colon, and I don't understand why that case is not already supported by LinearMaps! Here is a rough attempt that I am sure is incorrect:
function getindex(A::LinearMap, ":", j::Int) # for A[:,j]
e = zeros(size(A,2)); e[j] = 1
return A * e
end
My real goal here is that I want LinearMaps to be a subtype of AbstractMatrix (which requires getindex
to be supported) so that I can have functions like
function lscg(A::AbstractMatrix, y:AbstractVector, x:AbstractVector)
that can be debugged using an ordinary matrix and then later run using a LinearMap.
I realize I could use Union{LinearMaps,AbstractMatrix}
but that is ugly and I am confident that it is possible to make LinearMaps general enough to be a subtype of AbstractMatrix
because I have done similar things in Matlab.
Using multiplication as a default means to provide getindex
functionality seems consistent with the philosophy here that LinearMaps are just multipliers.
I also have an idea about setindex
but that's more radical so let's first see if we can resolve the question of why getindex
is explicitly not supported here, or whether it is just waiting for someone like me to ask for it and to offer to help implement it...
First, there is a Colon
type, so your function could read
function getindex(A::LinearMap, ::Colon, j::Int)
if I'm not mistaken.
and I don't understand why that case is not already supported by LinearMaps!
I don't think this is a fair statement. After all, all you're using is multiplication with a very special vector, which is supported by LinearMaps
, it's its main purpose.
More generally, I have a very hard time understanding what you're trying to do. If you want everything to work like with a matrix, then why not work with matrices? You can convert any LinearMap
into a matrix, and supply that to your function, which you can debug as usual. Which functionality of LinearMaps
are you interested in? "All" we provide is linear map-vector multiplication for linear maps which are not given by matrices. If you want it to be like a matrix, then what is left from LinearMaps
that attracts your interest? I'm asking this even more so because of your interest in setindex!
, which puzzles me even more. How would you modify a linear transformation given by a program (think of cumsum
which we use in the tests, or Fourier transformation, convolutions, etc.) by changing a number in its matrix representation?
I think one of the main reasons why there is no getindex(A::LinearMap, i, j)
is because it is very inefficient: you need to multiply A
by a vector e
, which, in the intended use cases here, is expected to be already so expensive, that matrix-matrix-multiplication is considered as prohibitively expensive. From the resulting (supposedly huge) vector object you throw away all but a single number. That doesn't sound efficient. I guess one could include the colon-functionality as a shorthand for A*e
, which is done analogously in the Matrix
-conversion for all e
, to avoid first computing the whole matrix representation of A
and then throw away everything but a single column. Again, from an efficiency point of view this perhaps is not very good, because that would allocate a new vector, where in most performance-critical application cases you would do the multiplication by an in-place mul!
to avoid (repeated) allocation. From my understanding, getindex
is meant to be allocating even in the Matrix
-case.
BTW, for an implementation of matrix-free iterative "matrix" algorithms see the packages listed at the end of our README.
@JeffFessler , what is lscg
. Is that an abbreviation for least square conjugate gradient. You should consider using ducktyping, i.e. not specifying the type of A
at all, instead of strange unions. You don't gain anything. A LinearMap
is exactly not an AbstractMatrix
, because indeed the AbstractArray
type holds some premise that there is a reasonably efficient getindex
. A LinearMap
holds a weaker premise, namely that it can be multiplied with a vector, that's it. As @dkarrasch mentioned, for debugging purposes you can always convert your LinearMap
to a Matrix
, and when all debugging is done and you have some iterative code for linear systems or eigenvalues or any other application where you only need matrix vector multiplications, put back the LinearMap
and see that you obtain the same result. But this then also allows you to represent matrices which you could not represent as a Matrix
(because it would cost too much memory and be hugely inefficient).
@Jutho and @dkarrasch thanks for the fast replies.
I especially appreciate the explanation about Colon
because that let me finish the draft code I have been preparing for your consideration.
I think I got things off on the wrong foot here with my initial "Why not" question.
I should have started by saying that I am delighted to find your LinearMaps package here,
because I spent years refining my own similar version in Matlab.
I am so glad I don't have to start from scratch in Julia!
Based on my Matlab experience, having built-in indexing ability like A[:,j]
is very convenient for users in a LinearMap type of package. Sure a user could do that themselves by multiplying by a unit vector, but isn't the point of a high-level language is to make it convenient for users, and to make operations look like the math? The urge to write the matrix-like syntaxA*x
instead of calling f(x)
is the impetus behind LinearMaps in the first place, right? So it is natural to offer other matrix-like syntax like A[:,j]
as well, perhaps with caveats to the user that many such indexing operations are inefficient. My reading of the AbstractMatrix
specification does not require that getindex()
be efficient, only that it be available.
I am going to make a fork with a getindex.jl
file for your consideration. If you find it to be a useful addition then that would be great and I'll next share with you my (more radical and pretty inefficient, but possible) setindex.jl code, so that we can complete the steps towards satisfying
LinearMap <: AbstractArray
Thanks!
In case you are curious about the Matlab version, the high-level link is here: http://web.eecs.umich.edu/~fessler/code/index.html and the specific code is here http://web.eecs.umich.edu/~fessler/irt/irt/systems/@fatrix2/ and the documentation describing philosophy very similar to that of LinearMaps is here: http://web.eecs.umich.edu/~fessler/irt/irt/doc/doc.pdf
BTW, yes cgls is CG LS and that is just an example. @Jutho I am currently resorting to ducktyping but I would prefer to be explicit about argument types for functions that have many arguments. You say I "gain nothing" but I feel that clarity of purpose is gained by specifying the required variable type, and thereby preventing users from passing a wrong type object (like a 3D Array) to a function requires a matrix-type object.
@JeffFessler I'm afraid I still haven't quite understood what's the actual purpose of getindex
. For LinearMap
s of small size, you can convert it to a Matrix
and work with it as you're used to. For large LinearMap
s, you may not wish to do that to not flood your memory. But for these cases, any getindex
will be slow either when compared to usual reading out of memory. Any generic code that builds on it and intends to operate on LinearMap
s will be likely horribly slow. Except for "convenience", I couldn't extract a concrete use case so far.
Maybe, there is a difference in intentions between your MATLAB package and @Jutho's LinearMaps.jl
? As for this package here, I think the intention is to view a LinearMap
as a single entity, that has properties such as size, symmetry, definiteness etc., an action on vectors and rules how to interact with other entities on the same level of abstraction. In fact, multiplying a LinearMap
with a Matrix
A
turns A
into a WrappedMap
and interprets the result as a CompositeMap
, thereby lifting A
from a collection of entities, an array of numbers, to an individual entity, a LinearMap
. My understanding of this package (I joined much later than when @Jutho
developed this package) is that complete Matrix
functionality like getindex
and setindex!
is not provided deliberately, to focus on the absolute minimum of properties that are of interest for a linear map and to not tempt anyone to write foreseeably inefficient code. Clearly, this is not in conflict with developing, for instance, efficient iterative "matrix" methods like solvers, eigensolvers, SVDs, as can be seen from the listed packages at the bottom of our README, which (almost) completely cooperate with LinearMaps.jl
as is (a few methods in IterativeSolvers.jl
are not (yet) LinearMaps.jl ready, but most are).
@JeffFessler could you try to be more specific about what exactly you're targetting at? Why exactly do you need LinearMaps
to be a subtype of AbstractArray
s? Conversely, for your CG method, what is wrong with the approach @Jutho and I suggested above:
A
argument unspecified in the function signature, but assume it's a matrix for the time being and test/debug it with matrices;Matrix
or, in case that's relevant, a SparseMatrixCSC
;*
or mul!
) and vector manipulations, queries on size or symmetry, and whatever else is already provided.The resulting method should then apply to LinearMap
s without extra effort.
Thanks for the reply.
I understand the reluctance about making LinearMaps
a subtype of AbstractArray
- I don't think I can convince you about that. You either like duck typing or you don't. If you try to pass a 3D array to the svd
command you get an error from the caller that it is unsupported. If svd
were written with duck typing, like you are recommending for cgls
, then you'd get some internal error that may be harder for the user to understand. I prefer the outer error for feedback to users...
I don't understand the reluctance to provide the convenience of getindex
support, especially A[:,]
because implementing that takes fewer characters to type than the text in most of the replies in this thread! And I've already provided the code in a pull request, with tests.
Support for getindex
functionality is not uncommon in linear operator tools!
The Matlab "SPOT" package has both subsref
and subsasgn
[https://github.com/mpf/spot]()
And the Julia package LinearOperators
- an alternative to LinearMaps
https://github.com/JuliaSmoothOptimizers/LinearOperators.jl
also has a version of getindex
showing another example of why this suggestion is reasonable.
I don't like the way it is done in LinearOperators
so that is why I am pursuing this with LinearMaps
. Thanks for your consideration.
When I first started using Julia, now several years ago (Julia 0.3 or maybe even 0.2), I also assumed that any linear map should be a subtype of AbstractMatrix
; you might even find comments on old github issues of Julia about this. However, I was then thought that this does not need to be. AbstractMatrix
is indeed an alias for AbstractArray{T,2}
, and AbstractArray
is a type hierarchy for container-like objects, i.e. an efficient getindex
is essentially the defining property of an AbstractArray
. More generally, most type hierarchies are about specific data structures for representing something, not about the mathematical (or other structure) of what they represent. It is often not a good or feasible idea to try to use the type system to encode mathematical structure, for various reasons, lack of multiple inheritance for example, but foremost the fact that there are often various fundamentally different ways to represent a certain mathematical object. I don't think you want to represent a linear map on a finite-dimensional vector space in the same way as e.g. a linear operator such as a derivative on a function space, ...
Returning to the specific proposal, I still don't understand the specific use case. If you want to apply an algorithm that requires getindex
, convert the linear map to a Matrix
or SparseMatrixCSC
. It will be much more efficient, since at least you will be sure that it will create every column A[:,i]
just once, i.e. it will apply the linear map no more than size(A,2)
times. If the Matrix
is too costly to be created, then the algorithm that you want to apply (e.g. svd
, qr
, inv
) will certainly not work.
If the algorithm that you want to apply does not need an efficient getindex
but say, only multiplications with an AbstractVector
, then don't restrict the type of the linear map argument. As said above, the type system is not about mathematical structure and there is no common super type for all objects that behave as a linear map. Any user/developer can have his own use case for representing a linear map in a specific way, and if you develop an algorithm that leaves the linear map argument unspecified, it can immediately be combined with that other user's linear map like object that you did not foresee.
Similarly, in Julia, there is no specific supertype for e.g. iterators, because many objects behave as iterators and represent them in various ways. The type hierarchy is often about the specific representation. And all methods that accept iterators just accept any type. It's up to the user not to insert a non-iterable object into a method that expects an iterator. If you do, you will likely get a MethodError
that your argument does not support Base.iterate
. And likewise, I think it is good to get a MethodError
that a linear map does not support getindex
. If you want to use an algorithm that really needs getindex
, converting to a Matrix
should be the way to go.
In fact, the reason that I personally moved away from using LinearMaps.jl myself, is because it did not go far enough. Not only do I want my linear maps to be completely generic, I also do not want to restrict to AbstractVector
for my vector types, when using iterative algorithms such as Krylov methods. That's why I went on to create KrylovKit.jl, where linear maps are arbitrary functions, and vectors are also arbitrary types that support a given interface (e.g. LinearAlgebra.dot
, LinearAlgebra.norm
, scalar multiplication and addition, ...).
I've provided url's for 3 distinct packages that all provide the convenience of getindex
and you remain unconvinced so I will not try to persuade you further.
You've made an key observation that "there is no common super type for all objects that behave as a linear map." Perhaps that is what I am really missing here. I wish Julia had an AbstractLinearOp
as a supertype (with a much more limited set of requirements than AbstractArray
has, where things like Matrix
and LinearMap
could then be subtypes.
I wish Julia had an AbstractLinearOp as a supertype (with a much more limited set of requirements than AbstractArray has, where things like Matrix and LinearMap could then be subtypes.
That's an interesting proposal, but how would it work. How would you only make AbstractArray{T,2}
a subtype but not AbstractArray{T,N !=2}
? And what if I have a different problem where the matrix does not actually represent a linear map, but a state (e.g. a quantum mechanical density matrix), and the "linear map" acting on this density matrix is a completely positive superoperator. That is the problem with trying to use the type system for representing mathematical structure.
I think what you are looking for is the concept of an interface or traits system, which is independent from type hierarchies, exists in many languages, and has been discussed in Julia, but so far has not been formalized or acquired syntax. There are packages for prototyping and experimenting with this, e.g. https://github.com/andyferris/Traitor.jl
@JeffFessler If you don't like the duck-typing (which is used in Arpack.jl
, TSVD.jl
, and IterativeSolvers.jl
), you can specialize your argument to Union{LinearMap,AbstractMatrix}
without performance loss. The function gets compiled for the concrete type of the arguments when first called. Note that, for instance, both LinearMap
and AbstractMatrix
are abstract types, and are "concretized" at compilation anyway.
Just in case anyone else lands here looking for getindex
, I've implemented it in this toolbox:
https://github.com/JeffFessler/MIRT.jl/blob/master/src/system/linear-map/getindex.jl
In case anyone else comes looking for getindex
I have now made a small stand-alone overlay to LinearMaps
that provides both getindex
and setindex!
and is a subtype of AbstractArray
.
https://github.com/JeffFessler/LinearMapsAA.jl
I thank you both for your suggestions that helped me figure out how to do it.
@dkarrasch If I may add to the discussion: Having the option of a custom getindex could be very useful. Oftentimes you might have custom methods for computing individual entries and you might want to use that alongside your efficient methods for multiplication. Fast multipole method comes to mind. If you are curious I have such a use case with a custom type written and used here: https://github.com/bonevbs/HssMatrices.jl/blob/main/src/linearmap.jl (I just found out about your package so I am aware that the type names collide)
I think it is perfectly legitimate to define getindex
methods for your own LinearMap
subtype, for which you know the context and the cost of component evaluation. And I think you take the right approach, i.e., have a thin wrapper. I'd do it roughly like this:
struct IndexableLinearMap{T,A<:LinearMap,F} <: LinearMap{T}
lmap::A
getidxs::F
end
# redirect most functions to the lmap field
adjoint(A::IndexableLinearMap) = IndexableLinearMap(adjoint(A), (i,j) -> conj(A.getidxs(j,i)))
mul!(y, A::IndexableLinearMap, x) = mul!(y, A.lmap, x)
# all your getindex-overloads for IndexableLinearMap
In fact, this might be a feature for the package, because it gives indexability but at the same time forces the user to take responsability of it. The big danger is that if getindex
works silently generically, this may come with huge performance penalties as a surprise.
I agree, this is probably the best way to do it. I only recently realised the dangers of Julia's type system being the silent use of fall-back routines, at the cost of performance.
What about having a getindex
method be an optional argument when creating a LinearMap
in the first place?
In the same way that an adjoint
method is an optional method argument because not all use cases need it.
This approach would have the benefit of forcing the user to take responsibility, but also simplifying the process for the user because no "thin" wrapper would be needed.
If I recall correctly, constant propagation also works for keyword arguments in Julia 1.x for some lower bound on x, so this could be possible while still having a dedicated IndexableLinearMap
type and without creating type instabilities.
I think I have a draft here. Will put it out tomorrow.
Reviews and comments welcome in #145.
Why is
getindex
not supported (quite explicitly in the README) for LinearMaps?It seems like it would be pretty easy to provide the functionality as a convenience to users even if it would be somewhat inefficient. For example here is a rough start on
A[i,j]
I am less sure how getindex works for
A[:,j]
which is what I really want. Of course I could do this type of overloading myself locally, but why not benefit all users? I've done this in Matlab by the way so I expected it to be even easier in Julia!