JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.65k stars 5.48k forks source link

Missing fallback methods for vector spaces #43903

Closed juliohm closed 2 years ago

juliohm commented 2 years ago

Given a vector space, the division by a non-zero scalar can be defined in terms of the scalar product:

/(v::SomeVectorType, λ::Number) = inv(λ) * v # where * is the implemented scalar product

Can this fallback method be added to Base? Currently we need to implement the fallback for all new vector spaces.

juliohm commented 2 years ago

Another operation that needs a fallback:

-(u::SomeVectorType, v::SomeVectorType) = u + (-v)
-(v::SomeVectorType) = -1 * v

@dkarrasch can I submit a PR? What is the file where these methods should go?

dkarrasch commented 2 years ago

Actually, we do have fallbacks. We have

-(A::AbstractArray) = broadcast_preserving_zero_d(-, A)

i.e, apply - elementwise, and

for f in (:+, :-)
    @eval function ($f)(A::AbstractArray, B::AbstractArray)
        promote_shape(A, B) # check size compatibility
        broadcast_preserving_zero_d($f, A, B)
    end
end

i.e., take the difference elementwise. As for the division, we have

for f in (:/, :\, :*)
    if f !== :/
        @eval ($f)(A::Number, B::AbstractArray) = broadcast_preserving_zero_d($f, A, B)
    end
    if f !== :\
        @eval ($f)(A::AbstractArray, B::Number) = broadcast_preserving_zero_d($f, A, B)
    end
end

So, if your vector space operations should not fall back to broadcasting, then I guess this is exactly a case where you need to overload and specialize these operations for your own types? Your issue reminds me of https://github.com/Jutho/TensorKit.jl.

juliohm commented 2 years ago

@dkarrasch consider the following vector space:

import Base: *, +

# functional data type
struct Fun
    data
end

# scalar multiplication
*(λ::Number, f::Fun) = Fun(λ * f.data)

# vector addition
+(f::Fun, g::Fun) = Fun(f.data + g.data)

By default I get the following errors:

julia> Fun([1,2,3]) - Fun([4,5,6])
ERROR: MethodError: no method matching -(::Fun, ::Fun)
Stacktrace:
 [1] top-level scope
   @ REPL[6]:1

julia> Fun([1,2,3]) / 2.0
ERROR: MethodError: no method matching /(::Fun, ::Float64)
Closest candidates are:
  /(::StridedArray{P}, ::Real) where P<:Dates.Period at ~/Desktop/julia-1.7.2/share/julia/stdlib/v1.7/Dates/src/deprecated.jl:44
  /(::Union{SparseArrays.SparseVector{Tv, Ti}, SubArray{Tv, 1, <:SparseArrays.AbstractSparseMatrixCSC{Tv, Ti}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, false}, SubArray{Tv, 1, <:SparseArrays.AbstractSparseVector{Tv, Ti}, Tuple{Base.Slice{Base.OneTo{Int64}}}, false}} where {Tv, Ti}, ::Number) at ~/Desktop/julia-1.7.2/share/julia/stdlib/v1.7/SparseArrays/src/sparsevector.jl:1476
  /(::LinearAlgebra.Bidiagonal, ::Number) at ~/Desktop/julia-1.7.2/share/julia/stdlib/v1.7/LinearAlgebra/src/bidiag.jl:375
  ...
Stacktrace:
 [1] top-level scope
   @ REPL[7]:1
dkarrasch commented 2 years ago

Sure, but you're not subtyping to anything. Of course you need to teach Julia every single operation for Fun (whose data field is also unparametrized BTW). Or do you want to define -(x::Any) based on (-1)*x, where you silently assume that *(::Number, ::Any) is defined? I think we need a meaningful "upper type bound" that is strictly lower than Any. Can you make Fun <: AbstractArray? Then you need to define size and getindex and the rest should pretty much work out of the box.

juliohm commented 2 years ago

@dkarrasch this is an example of an infinite dimensional vector space. The idea of working with coordinates (i.e. size, and getindex) shouldn't be enforced on new vector space types.

Or do you want to define -(x::Any) based on (-1)*x, where you silently assume that *(::Number, ::Any) is defined?

Wouldn't that make sense? What other definition could we assign to -(::Any)? The definition of *(::Number, ::Any) is not necessary since we already have it explicitly defined in new vector spaces for a specific type. In my example above, I explicitly defined *(::Number, ::Fun)

whose data field is also unparametrized BTW

Yep, that was a quick simple example to illustrate the point.

juliohm commented 2 years ago

What I mean here is that in order to implement a new vector space as we do in math, we only need to define two operations. The rest follows from proofs. We can prove that -x = -1 * x where 1 is the unit in the field of scalars, and so on.

dkarrasch commented 2 years ago

I understand. I guess we need others to chime in (@Jutho @MasonProtter @andreasnoack @ViralBShah @dlfivefifty come to mind). It feels strange to me to assume that Any is by default a vector space element. Your approach really reminds of the abstract generic approach taken in TensorKit.jl. I don't have strong feelings either way, it's just that Any is very very generic.

juliohm commented 2 years ago

I sympathize with your concern, but can't think of a more natural fallback. Most of us are doing linear algebra 24/7 and this fallback would simplify things tremendously for teaching students and for writing more generic code.

juliohm commented 2 years ago

If the generic fallback -(x::Any) is not acceptable (I think it should be ok and harmless to add), then we could have a new abstract type in LinearAlgebra to represent vectors as in vector spaces, not vectors as in 1-dimensional arrays of numbers.

Any update on this issue?

MasonProtter commented 2 years ago

I think my opinion on this is that the lack of upper bounds makes me a big nervous, but on the other hand, one should ask

What else could something like -x or v / λ possibly mean than the meaning suggested here?

I struggle to imagine an alternative meaning that we wouldn't consider a 'pun'. For /, the docstring is very explicit that this is the exact meaning we assign to /:

help?> /
search: / //

  /(x, y)

  Right division operator: multiplication of x by the inverse of y on the right. Gives
  floating-point results for integer arguments.

I think given this, if someone wanted to make a method a::T / b::Number be something semantically different than a * inv(b), then they're not really respecting the intended meaning of the / function anyways and should define their own function with its own methods.

I feel similarly for -, which simply just claims to mean subtraction, and subtraction is generally defined as the inverse of addition, so having a fallback for subtraction that uses addition with the additive inverse makes sense to me.

The more I think about this, the more I think my only real hesitance is fear of additional method ambiguities.

MasonProtter commented 2 years ago

Generally speaking though, I think this sort of thing is why we want proper support for traits. Ultimately, we have bundled up all these math methods on the premise that if something is structurally a number or array, then it obeys the rules of a vector space, but really what we want is to be able to say

"I have a type, and this type obeys the rules of vector space arithmetic" without forcing the user to subtype AbstractArray or Number, which carry a ton of additional, inappropriate baggage for certain use cases.

DilumAluthge commented 2 years ago

It feels strange to me to assume that Any is by default a vector space element.

I agree with this. Why should Julia make such an assumption? If you're defining a new vector space element, you should say so explicitly, either by subtyping or traits.

MasonProtter commented 2 years ago

I would say it's not about Any at all. It's about the - and / functions who explicitly say in their docstrings that they are to be defined in this sense.

juliohm commented 2 years ago

without forcing the user to subtype AbstractArray or Number, which carry a ton of additional, inappropriate baggage for certain use cases.

Exactly. That is the crux of the issue.

The proposed fallback methods for / and - are basically the definition of these operations for any algebraic structure.

juliohm commented 2 years ago

However, I feel that the docstring of /:

multiplication of x by the inverse of y on the right

could be generalized to:

multiplication of the inverse of y by x on the right

I would say that the more general case also flips the order of the arguments so that we only rely on *(::Number, ::Any) and never on (::Any, ::Number). That is:

/(x, y) = inv(y) * x

In the above expression we assume that x is a vector (not necessarily an array) in a vector space and y is a scalar.

juliohm commented 2 years ago

I added a new tutorial with the example I had in mind. The Pluto notebook has a hidden cell with the fallback methods implemented manually: https://www.youtube.com/watch?v=aQSlPDzqGPY

It would be nice to remove this hidden cell at some point in the future.

dkarrasch commented 2 years ago

One possible way forward is to add those generic methods in a PR and let nanosoldier run, to see if any issues in the package ecosystem occur. If they don't, then indeed why not make those definitions.

Jutho commented 2 years ago

@dkarrasch, I don't think TensorKit.jl is relevant here, but maybe KrylovKit.jl is. Here, I also want to support general user types that have the mathematical interpretation of being vectors in a vector space, rather than restricting to AbstractVector or AbstractArray as many other packages seem to be doing (even major packages as IterativeSolvers.jl, Optim.jl, DifferentialEquations.jl, ...). I find it somewhat unfortunate and discomforting that in a flexible and generic language as Julia, people tend to take over the habits from older languages and require things to be AbstractArray when this is not strictly necessary.

As stated by others, a common supertype cannot solve this. I think I learned early on when I started using Julia that the type system is not meant to express mathematical structures (as this would anyway be impossible without being able to have multiple supertypes). A trait system could help here, but for now there is only the possibility of having an informal interface (as e.g. with the iterator interface), i.e. requiring that a number of methods are implemented.

I thus fully agree that it would be useful to have an agreed upon specification of a vector interface in the Julia base library. In KrylovKit.jl, the methods that I request are documented in point 2 of: https://jutho.github.io/KrylovKit.jl/stable/#Package-features-and-alternatives

So for example, in the KrylovKit code, I will make sure to never write -v but rather (-1)*v.

Note that, on top of the minimal set of methods that appear in the mathematical definition of vectors (vector addition and multiplying with scalars), it is in practice useful to have in-place methods to do these operations. If I were to propose what such a minimal vector interface would need to have, it would be the following:

On top of those, it is probably useful to be able to extract the scalar type that is used by those vectors. From the mathematical perspective, that's probably real numbers or complex numbers (or more exotic), but in practice you wan't to know which specific implementation (e.g. Float32 or Float64). For Arrays, this corresponds to eltype, but eltype in Julia is supposed to return the type if you index into that variable, or if it does not support indexing, if you iterate over it. But this can be very different from the actual scalar type. Nested arrays would be a primary example here. In KrylovKit, I don't rely on a scalartype method, though this would make things more useful. Instead, I use implicit tricks like computing the inner product of two vectors to extract the scalar type, which is clearly suboptimal.

Then, finally, a method to create a similar vector, possibly with a different (promoted) scalar type. Something analogous to similar(v::Array, eltype::T) but which works with the aforementioned scalar type, rather than eltype. In KrylovKit.jl, this is currently solved by letting the out-of-place multiplication handle allocation and scalar type promotion. So if I have a vector v and I want to create a similar vector with a scalar type T, I would write one(T) * v or zero(T) * v. That is again suboptimal, as in fact the resulting scalar type can be different, namely promote_type(T, scalartype(v)). It's only because I already know that this yields T because of how T was determined that this works in my case.

Optionally, there would be dot (I would have liked inner) and norm functions for vectors in an inner product space. Not all vectors need to have those, but for those where they can be defined, I think there is not much debate about how the interface should look like.

dlfivefifty commented 2 years ago

/(x, y) = inv(y) * x

is wrong:


julia> A/B
3×3 Matrix{Float64}:
 0.834777  0.328324  -0.335083
 0.538943  0.151829  -0.952223
 0.785972  2.19402   -0.826465

julia> A*inv(B)
3×3 Matrix{Float64}:
 0.834777  0.328324  -0.335083
 0.538943  0.151829  -0.952223
 0.785972  2.19402   -0.826465

julia> inv(B)*A
3×3 Matrix{Float64}:
  0.229277   0.440463  -1.20529
 -1.29909   -0.948649  -0.213654
  1.20497    1.4632     0.879513
juliohm commented 2 years ago

Yes, perhaps we need a distinction between these two cases, or we can simply ignore the / for vector spaces. The main concern is the fallback for - at the moment.

juliohm commented 2 years ago

Is it reasonable to consider the fallback for - in Julia v1.8?

-(u, v) = u + (-v)
-(v) = -1 * v

I fully agree @Jutho that we need a better treatment of spaces in the future, probably as a package with a well-defined API for the field of scalars, etc. But these fallbacks would solve a lot of headache already.

N5N3 commented 2 years ago

Is there a strong reason to put them in Base? *// are generic functions, user can map anything to them with their own types.

Currently we need to implement the fallback for all new vector spaces.

Limiting these transform to, e.g. AbstractVectorSpaces, seems more reasonable?

juliohm commented 2 years ago

We discussed the reasons for these fallbacks above and most people agreed that this is a reasonable definition for - for pretty much any space. It would allow users to define their own spaces with a minimum set of specializations like it is done in math.

Em seg., 28 de fev. de 2022 04:17, N5N3 @.***> escreveu:

Is there a strong reason to put them in Base? *// are genetic functions, user can map anything to them with their own types. Limiting these transform to, e.g. AbstractVectorSpaces, seems more reasonable?

— Reply to this email directly, view it on GitHub https://github.com/JuliaLang/julia/issues/43903#issuecomment-1053958738, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAZQW3LHLQIPDFNLH2D6JPDU5MOR7ANCNFSM5MTHMHRQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you authored the thread.Message ID: @.***>

dlfivefifty commented 2 years ago

You probably mean: (-one(v)) * v but if typeof(one(v)) == typeof(v) this introduces an infinite call.

juliohm commented 2 years ago

No. I mean -(v) = -1 * v where the argument is assumed an element of a vector space. There is no such thing as one(v) in this context.

Em seg., 28 de fev. de 2022 12:12, Sheehan Olver @.***> escreveu:

You probably mean: (-one(v)) * v but if typeof(one(v)) == typeof(v) this introduces an infinite call.

— Reply to this email directly, view it on GitHub https://github.com/JuliaLang/julia/issues/43903#issuecomment-1054355315, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAZQW3PW2TFQACWWSZYQ7STU5OGFVANCNFSM5MTHMHRQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you authored the thread.Message ID: @.***>

juliohm commented 2 years ago

@dkarrasch where we could add these fallbacks for -? I am happy to submit a PR to check the new method in the existing test sets.

juliohm commented 2 years ago

@dkarrasch I am happy to submit a PR with the fallback for -. I really wish it could make it to Julia v1.8.

KristofferC commented 2 years ago

I really wish it could make it to Julia v1.8.

The feature freeze for 1.8 was a long time ago so that is not possible.

ParadaCarleton commented 2 years ago

Another operation that needs a fallback:

-(u::SomeVectorType, v::SomeVectorType) = u + (-v)
-(v::SomeVectorType) = -1 * v

@dkarrasch can I submit a PR? What is the file where these methods should go?

I'd like to add that this fallback would have saved us from a bug in Distributions.jl that we just discovered, where we forgot to define a method for - being called on multivariate distributions. We have to define - by hand for many different subtypes (reals, arrays, ...) in a way that makes it easy to miss a particular combination.