JuliaArrays / LazyArrays.jl

Lazy arrays and linear algebra in Julia
MIT License

Lazy addition? #8

Closed tkf closed 5 years ago

tkf commented 5 years ago

Are you planning to implement lazy addition Add(A, B) similar to Mul(A, B)? By that I mean a lazy object such that mul!(y, Add(A, B), x, α, β) lowers into

lmul!(β, y)
mul!(y, A, x, α, 1)
mul!(y, B, x, α, 1)

If you are OK with it, I think I can write up a PR.

(In fact, I've written something like this a while ago https://github.com/tkf/MatrixCombinators.jl but I thought I'd better contribute to this package since it's much more feature complete.)
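The requested lowering can be written out in plain Julia over two explicit matrices. This is a hypothetical sketch (`addmul!` is an illustrative name, not part of any package), and it relies on the 5-argument `mul!` that modern LinearAlgebra provides:

```julia
using LinearAlgebra

# Hypothetical sketch of the lowering requested for mul!(y, Add(A, B), x, α, β):
function addmul!(y, A, B, x, α, β)
    lmul!(β, y)           # y ← β*y
    mul!(y, A, x, α, 1)   # y ← α*A*x + y
    mul!(y, B, x, α, 1)   # y ← α*B*x + y
    return y
end
```

The point is that `A + B` is never formed: the update touches only `y`, reusing whatever fast `mul!` methods `A` and `B` each have.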

dlfivefifty commented 5 years ago

I don't think it's needed as we already have BroadcastArray(+, A, B).

tkf commented 5 years ago

Nice. Does mul!(y, BroadcastArray(+, A, B), x, α, β) lower to the non-allocating combination of mul! as I wrote above?

dlfivefifty commented 5 years ago

The way I would support that is via the syntax:

y .= α .* Mul(A .+ B, x) .+ β .* y

This could be made to lower to the non-allocating version by having MemoryLayout(A .+ B) return a custom type.

mul! is not the preferred syntax as it doesn't scale well for many different matrix types. Mul avoids this pitfall by dispatching on MemoryLayout, not matrix type.

tkf commented 5 years ago

OK. I guess I shouldn't be referring to the implementation too much at this point. But I suppose there is no direct support for

lmul!(β, y)
mul!(y, A, x, α, 1)
mul!(y, B, x, α, 1)

at the moment in LazyArrays.jl, but do you agree it would be nice to have?

tkf commented 5 years ago

BTW, IIUC, I don't think we can use y .= α .* Mul(A .+ B, x) .+ β .* y for lazy evaluation since A .+ B would be materialized before Mul gets it. Am I correct? (I haven't played with broadcasting much yet):

julia> Meta.lower(Main, :(y .= α .* Mul(A .+ B, x) .+ β .* y))
:($(Expr(:thunk, CodeInfo(
 1 ─ %1 = (Base.broadcasted)(+, A, B)
 │   %2 = (Base.materialize)(%1)
 │   %3 = Mul(%2, x)
 │   %4 = (Base.broadcasted)(*, α, %3)
 │   %5 = (Base.broadcasted)(*, β, y)
 │   %6 = (Base.broadcasted)(+, %4, %5)
 │   %7 = (Base.materialize!)(y, %6)
 └──      return %7
))))
dlfivefifty commented 5 years ago

Yes, you are right. We can use BroadcastArray, since it forces the expression not to materialize. We could even make a type alias Add so this would become:

y .= α .* Mul(Add(A, B), x) .+ β .* y

Note that this is lowered to the following, which perhaps you prefer:

materialize!(MulAdd(α, Add(A,B), x, β, y))
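For intuition, such a MulAdd container can be sketched in plain Julia. The names below are illustrative stand-ins, not the actual LazyArrays.jl types; materializing the container performs the in-place update y ← α*A*x + β*y:

```julia
using LinearAlgebra

# Hypothetical stand-in for a MulAdd-style container (not the LazyArrays.jl type).
struct MyMulAdd{Tα,TA,Tx,Tβ,Ty}
    α::Tα
    A::TA
    x::Tx
    β::Tβ
    y::Ty
end

# Materializing performs y ← α*A*x + β*y in place.
my_materialize!(M::MyMulAdd) = mul!(M.y, M.A, M.x, M.α, M.β)
```

The real machinery additionally dispatches on the layouts of α, A, x, β, y, so a lazy sum in the A slot can get its own materialize! method.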
tkf commented 5 years ago

Got it, thanks for the explanation. I'll be trying to write it (unless you can do it in 10 minutes or something).

dlfivefifty commented 5 years ago

It would take a bit more than 10 minutes but the steps are roughly:

  1. Add a memory layout for broadcast arrays:

         struct BroadcastLayout{F, LAY} <: MemoryLayout
             f::F
             layouts::LAY
         end

         MemoryLayout(A::BroadcastArray) = BroadcastLayout(A.broadcasted.f, MemoryLayout.(A.broadcasted.args))

  2. Overload

         function materialize!(M::MulAdd{<:BroadcastLayout{typeof(+)}})
             lmul!(M.β, M.y)
             for A in M.A.broadcasted.args
                 M.y .= M.α .* Mul(A, M.x) .+ M.y
             end
             M.y
         end

     If you prefer a more explicit style, the main line could look like materialize!(MulAdd(M.α, M.A, M.x, one(T), M.y)).

This should actually support more than two additions.
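The trait-dispatch idea behind MemoryLayout can be illustrated with a self-contained sketch. All names here are hypothetical stand-ins for the LazyArrays.jl machinery, assuming only LinearAlgebra:

```julia
using LinearAlgebra

# Hypothetical stand-ins for the MemoryLayout trait machinery.
abstract type MyLayout end
struct MyDenseLayout <: MyLayout end
struct MySumLayout <: MyLayout end

# Stand-in for a lazy sum such as BroadcastArray(+, A, B, ...).
struct LazySum{T<:Tuple}
    args::T
end

my_layout(::AbstractMatrix) = MyDenseLayout()
my_layout(::LazySum) = MySumLayout()

# Dispatch on the layout trait, not on the array type itself.
my_mul!(y, A, x) = my_mul!(my_layout(A), y, A, x)
my_mul!(::MyDenseLayout, y, A, x) = mul!(y, A, x)
function my_mul!(::MySumLayout, y, S, x)
    fill!(y, zero(eltype(y)))
    for A in S.args
        mul!(y, A, x, 1, 1)   # accumulate y ← A*x + y
    end
    return y
end
```

Because the sum's method loops over all of its arguments, it handles any number of summands, and each summand's multiply is itself resolved through the trait.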

dlfivefifty commented 5 years ago

By the way, I consider this package still "experimental" and am happy to redesign things if there are any good suggestions. The current design is a bit esoteric, but there were good reasons for it, and it's proven very successful in BlockBandedMatrices.jl.

tkf commented 5 years ago

Thanks for the guidelines! I was looking at the code, and my attempt so far is

const Add = BroadcastArray{<:Any, <:Any, <:Broadcasted{<:Any, <:Any, typeof(+)}}

function default_blasmul!(α, A::AbstractMatrix, B::Add, β, C::AbstractVecOrMat)
    if iszero(β)
        fill!(C, 0)
    else
        lmul!(β, C)
    end
    for b in B.broadcasted.args
        default_blasmul!(α, A, b, true, C)
    end
    return C
end

I should learn how MemoryLayout works!

If you prefer a more explicit style

I like M.y .= M.α .* Mul(A, M.x) .+ M.y. I was referring to mul! because mul! would work with code not aware of LazyArrays.jl.

redesign things if there are any good suggestions

I'm new to this library, so I'd better be learning the "LazyArrays.jl way" for a while. But I'd suggest things anyway :) (1) I think it would still be nice to provide an "adapter" interface for LinearAlgebra's five-argument mul! (or whatever it ends up being called after https://github.com/JuliaLang/julia/issues/23919) so it composes with the rest of the ecosystem. (2) Infix Unicode operators for Mul and Add would be nice.

dlfivefifty commented 5 years ago

I was referring to mul! because mul! would work with code not aware of LazyArrays.jl.

In the currently tagged version it's the other way around: y .= Mul(A,x) would default to mul!(y, A, x), so it was more general.

The catch is that it's better to fall back on the 5-argument version, but there is no 5-argument mul! in LinearAlgebra yet. So when there is one, this would be the default backup.

think it still would be nice to provide an "adapter" interface for LinearAlgebra's five-argument mul!

LinearAlgebra doesn't have a 5-argument mul! yet. The 3-argument mul! is supported by the @lazymul macro: @lazymul MyArrayType will cause mul! to use the lazy arrays version.

But supporting the 5-argument one is an easy fix: just add it to the @lazymul macro. We should do this.

Infix Unicode operators for Mul and Add would be nice.

Yes! The only reason this hasn't been added is that it's good to get the "right" syntax first. I'm not sure what the "right" one is.

tkf commented 5 years ago

What I mean by compositionality is something like the following. Suppose that I wrote some code before knowing LazyArrays.jl:

function f(du, u, A, t)
    du .= .- u .+ tanh.(mul!(du, A, u))
end

A = sprand(100, 100, 0.1)
ode = ODEProblem(f, u0, tspan, A)

If LazyArrays.jl exposes mul!-compatible interface, I can just use the lazy array instead of Matrix and get a performance boost right away without touching my code:

B = -0.1 * Eye(100)
M = Add(A, B)  # This is a highly structured matrix so I don't want to materialize it.
ode = ODEProblem(f, u0, tspan, M)

Of course, since it's my code, I can rewrite my function f to use LazyArrays.jl. But now imagine that function f is some more complicated function deep in the library I'm using. Then there is no way I can use LazyArrays.jl API unless I fork this library.
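A minimal sketch of the kind of adapter I mean (`SumOperator` is a hypothetical type, just to illustrate the composability point): a wrapper exposing only the standard mul! interface, so code like f above works unchanged:

```julia
using LinearAlgebra

# Hypothetical lazy-sum operator exposing only the standard mul! interface,
# so LazyArrays-unaware code composes with it.
struct SumOperator{TA,TB}
    A::TA
    B::TB
end
Base.size(S::SumOperator, d...) = size(S.A, d...)
Base.eltype(S::SumOperator) = promote_type(eltype(S.A), eltype(S.B))

function LinearAlgebra.mul!(y::AbstractVector, S::SumOperator, x::AbstractVector)
    mul!(y, S.A, x)         # y ← A*x
    mul!(y, S.B, x, 1, 1)   # y ← B*x + y
    return y
end
```

A library that only ever calls mul!(du, A, u) never needs to know that A + B was kept lazy.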

@lazymul MyArrayType will cause mul! to use the lazy arrays version.

Great. So why not just do

const Add = BroadcastArray{<:Any, <:Any, <:Broadcasted{<:Any, <:Any, typeof(+)}}
@lazymul Add{<:Any, 2}

?

dlfivefifty commented 5 years ago

Add should be fine. The issue comes if you want to support adjoints or views (or views of adjoints of transposes of symmetric of ...). Mul is supported efficiently for any combination. It’s impossible to support mul! for all combinations because they dispatch on types, not traits.

I’ve sent a message to @mbauman to discuss the long term plan for mul! and traits, I’ll let you know. But for the packages where I need efficient subarrays and adjoints, I’m just going to stick to LazyArrays.jl approach for now.

Note there’s currently a debate in BlockArrays.jl whether to use LazyArrays.jl https://github.com/JuliaArrays/BlockArrays.jl/issues/31

tkf commented 5 years ago

Thanks for being open to suggestions. But I'm still puzzled by

It’s impossible to support mul! for all combinations because they dispatch on types, not traits.

Wouldn't just defining

mul!(C::AbstractVecOrMat, A::Mul, B::AbstractVecOrMat) = (C .= Mul(A, B); C)

provide a mul! implementation as efficient as C .= Mul(A, B)? I'm only suggesting to expose mul! interface, not asking to rewrite the internal based on mul!.

Note there’s currently a debate in BlockArrays.jl whether to use LazyArrays.jl JuliaArrays/BlockArrays.jl#31

Thanks, I actually was about to ask this next :)

tkf commented 5 years ago

Actually, I found that MulMatrix is what I'm asking for. But it looks like mul! does not call the lazy version? Why not @lazymul MulMatrix?

tkf commented 5 years ago

Let me reply to your comment https://github.com/JuliaArrays/LazyArrays.jl/pull/9#discussion_r236065916 here (as it's not implementation-specific)

But I'm not convinced Add should exist at all: what about Minus?

I thought about it, but Minus can be done just by Add(A, Mul(-1, B)), provided that Mul implements it. I think Add is a fundamental building block from which (together with Mul) you can bootstrap a lot of expressions.

Is Add(A,B) that much better than broadcasted(+, A, B), which also forces it to not materialize?

I'm OK with writing broadcasted(+, A, B), but what I absolutely cannot do without type piracy is define materialize!(M::MulAdd{<:BroadcastLayout{typeof(+)}}) and @lazymul AddMatrix. So I think PR #9 is worth having. Also, if you have such a specialization, I think it makes sense to have an alias to signify that this particular construct has an optimized method.

dlfivefifty commented 5 years ago

I think PR #9 is definitely worth having, the question is whether it's restricted to +. I'm happy with whatever you decide, let's just not export Add for now.

tkf commented 5 years ago

Great. Thanks a lot for merging #9 and all the discussion! I still want an easier API/alias but I guess that would be #10.