Reconsider uses of promote_op

martinholters commented 7 years ago

At the moment, the result's element type of broadcast is determined in a straight-forward way by typejoining the type of all the result's elements. (Exception: The empty case, where inference is invoked.) However, for certain operations, the more elaborate mechanism of promote_op is used, which especially aims at doing magical stuff if the input arrays have non-leaf element types. E.g.

julia> x = ones(Real, 1)
1-element Array{Real,1}:
 1

julia> x .+ x
1-element Array{Int64,1}:
 2

julia> x + x
1-element Array{Real,1}:
 2

While the behavior of + here does look useful, I wonder whether it is worth the inconsistency and the effort. I tend to say no, but before starting any real work on the issue, I'd like some feedback.

martinholters commented 7 years ago

As a quick test, I did

--- a/base/arraymath.jl
+++ b/base/arraymath.jl
@@ -68,7 +68,7 @@ promote_array_type{S<:Integer}(F, ::Type{S}, ::Type{Bool}, T::Type) = T

 for f in (:+, :-, :div, :mod, :&, :|, :xor)
     @eval ($f)(A::AbstractArray, B::AbstractArray) =
-        _elementwise($f, promote_eltype_op($f, A, B), A, B)
+        _elementwise($f, Broadcast._broadcast_type($f, A, B), A, B)
 end
 function _elementwise(op, ::Type{Any}, A::AbstractArray, B::AbstractArray)
     promote_shape(A, B) # check size compatibility

which at least gives

julia> x + x
1-element Array{Int64,1}:
 2

for x as above and it breaks no tests. So in case we decide to keep the current behavior, we should add tests...

martinholters commented 7 years ago

Poking around a bit, the code lines for mixed scalar/array cases pointed out by @andreasnoack in https://github.com/JuliaLang/julia/issues/19615#issuecomment-268329094 are more interesting. They are responsible e.g. for

julia> 1.0 + Float32[1.0]
1-element Array{Float32,1}:
 2.0

julia> 1.0 + [1]
1-element Array{Float64,1}:
 2.0

Preserving the array element type is a no-go for the second example. Blindly using broadcast would give a Vector{Float64} in the first case. The question is whether this doubling of memory consumption would be too surprising for the casual user (also see #14252). Consistency with .+ still seems preferable to me, but here, it is maybe not as clear anymore.

pabloferz commented 7 years ago

Regarding the preservation of the array element type, I believe we should consider the following for a scalar s and an array A

Doing s ⨳ A creates a new array applying the ⨳ operation elementwise. The result of each such operation has a type which, in case of being the same for all array elements, should be the eltype of the new array, or a type wide enough to hold all the values otherwise. For the empty case, use inference. This is the behavior of broadcast.
If you need to preserve the type of your container you now have many options: a. Modify your array in place. E.g. Float32[1.0] .+= 1.0 b. In some cases you could use an scalar such that the return type is the same as the array type. Float32[1.0] + 1.f0 c. Use a comprehension. T=eltype(A) and T[a ⨳ s for a in A] or [T(a ⨳ s) for a in A] d. Take advantage of dot fusion. T.(s .⨳ A)

This and the fact that the code would be more easily maintainable, as @martinholters has mentioned before, makes me think that we should follow the route proposed in this issue (point 1 above).

andreasnoack commented 7 years ago

I don't think that Array+Number can meaningfully be anything but broadcast (or an error, see https://github.com/JuliaLang/julia/pull/5810). * is different but I'd guess that most operators are similar to + here. Consequently, the behavior should be exactly that of broadcast.

martinholters commented 7 years ago

I have collected some of the low-hanging fruit in this branch. I probably won't spend time on this during the holiday season, so if anyone wants to pick up from here, feel free. (Note the remarks in the commit regarding Diagonal!)

Edit: I have delete the branch as most of the work there has made it into master, and the work on Diagonal wasn't really usable as it was.

stevengj commented 7 years ago

I think that the performance issues with using .* etc in Diagonal should be fixed by #19639?

martinholters commented 7 years ago

@stevengj, is that in response to the changes in the mentioned branch? Then no, the problem is that transposition cannot be fused into broadcast, leading to additional temporaries. (broadcast_At_B, anyone? :smile:)

Sacha0 commented 7 years ago

Another voice of support for this thread's proposal and the overall direction in the branch linked above. (Re. the first commit on that branch / @stevengj's comment, ref. #19672.) Best!

nalimilan commented 7 years ago

This is quite convincing. I think we should just list the typical cases where preserving abstract element types could have been useful (if any), just to be sure we won't regret losing this feature too much.

stevengj commented 7 years ago

@martinholters, ah, I see. Once we have some kind of Transpose type (#6837 or #19670), fusion of transposition won't be a problem.

andyferris commented 7 years ago

I don't think that Array+Number can meaningfully be anything but broadcast (or an error, see #5810). * is different but I'd guess that most operators are similar to + here. Consequently, the behavior should be exactly that of broadcast.

I would prefer +, -, *, etc take their linear algebra meanings only (which are scalar*vector, vector+vector, and the understanding that arrays of any dimension are still living in a vector space (they are like a tensor), plus we have a variety of matrix/vector multiplications).

So Array + Number should definitely be an error, IMO (with a nice error message telling people to use .+).

stevengj commented 7 years ago

We already tried deprecating array + scalar and it was too inconvenient, nor is there a strong mathematical need to restrict the meaning of + to linear algebra. On the other hand, we are now deprecating most vectorized functions like sin(array), so there is greater consistency in deprecating array + scalar at this point (and potential performance advantages, but this is limited by the fact that we won't deprecate array + array or array * scalar).

martinholters commented 7 years ago

The Diagonal multiplication is actually an interesting case to study. E.g.

(*)(A::AbstractMatrix, D::Diagonal) =
    scale!(similar(A, promote_op(*, eltype(A), eltype(D.diag))), A, D.diag)

Here, similar(A, T) to preserve the container type of A (if possible) makes a lot of sense, but using promote_op to determined the element type T leads to the discussed inconsistencies with broadcast behavior. But OTOH, using broadcast to obtain an Array of the correct element type and converting to the desired final return type by combing container type of A and this element type sounds prohibitively wasteful. Now _broadcast actually supports broadcasting into an AbstractArray (using similar to update the element type if needed), but the wrappers broadcast_t and broadcast_c that do a lot of setup work are restricted to Arrays. What might be very useful here and probably elsewhere would be a function that combines the element type computation of broadcast with the possibility to give a container D similar to the destination container for broadcast!, but only using it as the first argument to similar to obtain the output container. Does that sound reasonable? (And what should the function be called?)

That would allow

(*)(A::AbstractMatrix, D::Diagonal) = function_to_be_named(*, A, A, D.diag.')

andreasnoack commented 7 years ago

Couldn't some of this logic go into broadcast_c and broadcast_t? It seems wrong to me that these explicitly use Array.

martinholters commented 7 years ago

Certainly. Replacing similar(Array{T}, shape) with similar(newly_introduced_parameter, T, shape) would probably go a long way of getting the desired functionality. I just wanted to to see whether this was deemed useful/desirable before cooking up an actual design/implementation.

tkelman commented 7 years ago

starting to sound a bit like #16740 again

martinholters commented 7 years ago

A bit like it, but different in two aspects: The newly_introduced_parameter could be a (container) type or an instance; not sure which is better or whether both should actually be allowed. Further, the element type of the given container (type) would always be ignored, while it would be used in #16740 if set.

vtjnash commented 6 years ago

The OP doesn't use promote_op anymore, but the discussion seems to have strayed a bit from there. Is there anything left here, or has everything mentioned here eventually covered by the implementation and changes surrounding broadcast?

tpapp commented 5 years ago

I don't know whether to open another issue for this, so I am commenting here: the docstring of promote_op explicitly discourages its use:

Due to its fragility, use of promote_op should be avoided. It is preferable to base the container eltype on the type of the actual elements. Only in the absence of any elements (for an empty result container), it may be unavoidable to call promote_op.

while the design pattern section encourages it:

https://docs.julialang.org/en/v1.4-dev/manual/methods/#Output-type-computation-1

I think this is confusing.

vtjnash commented 2 years ago

We can perhaps never quite eliminate that discrepancy, but (as mentioned above), seem to be in a decent place now with it

JuliaLang / julia

Reconsider uses of promote_op #19669