Closed martinholters closed 2 years ago
As a quick test, I did
--- a/base/arraymath.jl
+++ b/base/arraymath.jl
@@ -68,7 +68,7 @@ promote_array_type{S<:Integer}(F, ::Type{S}, ::Type{Bool}, T::Type) = T
for f in (:+, :-, :div, :mod, :&, :|, :xor)
@eval ($f)(A::AbstractArray, B::AbstractArray) =
- _elementwise($f, promote_eltype_op($f, A, B), A, B)
+ _elementwise($f, Broadcast._broadcast_type($f, A, B), A, B)
end
function _elementwise(op, ::Type{Any}, A::AbstractArray, B::AbstractArray)
promote_shape(A, B) # check size compatibility
which at least gives
julia> x + x
1-element Array{Int64,1}:
2
for x
as above and it breaks no tests. So in case we decide to keep the current behavior, we should add tests...
Poking around a bit, the code lines for mixed scalar/array cases pointed out by @andreasnoack in https://github.com/JuliaLang/julia/issues/19615#issuecomment-268329094 are more interesting. They are responsible e.g. for
julia> 1.0 + Float32[1.0]
1-element Array{Float32,1}:
2.0
julia> 1.0 + [1]
1-element Array{Float64,1}:
2.0
Preserving the array element type is a no-go for the second example. Blindly using broadcast
would give a Vector{Float64}
in the first case. The question is whether this doubling of memory consumption would be too surprising for the casual user (also see #14252). Consistency with .+
still seems preferable to me, but here, it is maybe not as clear anymore.
Regarding the preservation of the array element type, I believe we should consider the following for a scalar s
and an array A
Doing s ⨳ A
creates a new array applying the ⨳
operation elementwise. The result of each such operation has a type which, in case of being the same for all array elements, should be the eltype
of the new array, or a type wide enough to hold all the values otherwise. For the empty case, use inference. This is the behavior of broadcast
.
If you need to preserve the type of your container you now have many options:
a. Modify your array in place. E.g. Float32[1.0] .+= 1.0
b. In some cases you could use an scalar such that the return type is the same as the array type. Float32[1.0] + 1.f0
c. Use a comprehension. T=eltype(A)
and T[a ⨳ s for a in A]
or [T(a ⨳ s) for a in A]
d. Take advantage of dot fusion. T.(s .⨳ A)
This and the fact that the code would be more easily maintainable, as @martinholters has mentioned before, makes me think that we should follow the route proposed in this issue (point 1 above).
I don't think that Array+Number
can meaningfully be anything but broadcast (or an error, see https://github.com/JuliaLang/julia/pull/5810). *
is different but I'd guess that most operators are similar to +
here. Consequently, the behavior should be exactly that of broadcast
.
I have collected some of the low-hanging fruit in this branch. I probably won't spend time on this during the holiday season, so if anyone wants to pick up from here, feel free. (Note the remarks in the commit regarding Diagonal!)
Edit: I have delete the branch as most of the work there has made it into master, and the work on Diagonal
wasn't really usable as it was.
I think that the performance issues with using .*
etc in Diagonal
should be fixed by #19639?
@stevengj, is that in response to the changes in the mentioned branch? Then no, the problem is that transposition cannot be fused into broadcast
, leading to additional temporaries. (broadcast_At_B
, anyone? :smile:)
Another voice of support for this thread's proposal and the overall direction in the branch linked above. (Re. the first commit on that branch / @stevengj
's comment, ref. #19672.) Best!
This is quite convincing. I think we should just list the typical cases where preserving abstract element types could have been useful (if any), just to be sure we won't regret losing this feature too much.
@martinholters, ah, I see. Once we have some kind of Transpose
type (#6837 or #19670), fusion of transposition won't be a problem.
I don't think that Array+Number can meaningfully be anything but broadcast (or an error, see #5810). * is different but I'd guess that most operators are similar to + here. Consequently, the behavior should be exactly that of broadcast.
I would prefer +
, -
, *
, etc take their linear algebra meanings only (which are scalar*vector
, vector+vector
, and the understanding that arrays of any dimension are still living in a vector space (they are like a tensor), plus we have a variety of matrix/vector multiplications).
So Array + Number
should definitely be an error, IMO (with a nice error message telling people to use .+
).
We already tried deprecating array + scalar
and it was too inconvenient, nor is there a strong mathematical need to restrict the meaning of +
to linear algebra. On the other hand, we are now deprecating most vectorized functions like sin(array)
, so there is greater consistency in deprecating array + scalar
at this point (and potential performance advantages, but this is limited by the fact that we won't deprecate array + array
or array * scalar
).
The Diagonal
multiplication is actually an interesting case to study. E.g.
(*)(A::AbstractMatrix, D::Diagonal) =
scale!(similar(A, promote_op(*, eltype(A), eltype(D.diag))), A, D.diag)
Here, similar(A, T)
to preserve the container type of A
(if possible) makes a lot of sense, but using promote_op
to determined the element type T
leads to the discussed inconsistencies with broadcast
behavior. But OTOH, using broadcast
to obtain an Array
of the correct element type and converting to the desired final return type by combing container type of A
and this element type sounds prohibitively wasteful.
Now _broadcast
actually supports broadcasting into an AbstractArray
(using similar
to update the element type if needed), but the wrappers broadcast_t
and broadcast_c
that do a lot of setup work are restricted to Array
s.
What might be very useful here and probably elsewhere would be a function that combines the element type computation of broadcast
with the possibility to give a container D
similar to the destination container for broadcast!
, but only using it as the first argument to similar
to obtain the output container. Does that sound reasonable? (And what should the function be called?)
That would allow
(*)(A::AbstractMatrix, D::Diagonal) = function_to_be_named(*, A, A, D.diag.')
Couldn't some of this logic go into broadcast_c
and broadcast_t
? It seems wrong to me that these explicitly use Array
.
Certainly. Replacing similar(Array{T}, shape)
with similar(newly_introduced_parameter, T, shape)
would probably go a long way of getting the desired functionality. I just wanted to to see whether this was deemed useful/desirable before cooking up an actual design/implementation.
starting to sound a bit like #16740 again
A bit like it, but different in two aspects: The newly_introduced_parameter
could be a (container) type or an instance; not sure which is better or whether both should actually be allowed. Further, the element type of the given container (type) would always be ignored, while it would be used in #16740 if set.
The OP doesn't use promote_op anymore, but the discussion seems to have strayed a bit from there. Is there anything left here, or has everything mentioned here eventually covered by the implementation and changes surrounding broadcast?
I don't know whether to open another issue for this, so I am commenting here: the docstring of promote_op
explicitly discourages its use:
Due to its fragility, use of
promote_op
should be avoided. It is preferable to base the container eltype on the type of the actual elements. Only in the absence of any elements (for an empty result container), it may be unavoidable to callpromote_op
.
while the design pattern section encourages it:
https://docs.julialang.org/en/v1.4-dev/manual/methods/#Output-type-computation-1
I think this is confusing.
We can perhaps never quite eliminate that discrepancy, but (as mentioned above), seem to be in a decent place now with it
At the moment, the result's element type of
broadcast
is determined in a straight-forward way bytypejoin
ing the type of all the result's elements. (Exception: The empty case, where inference is invoked.) However, for certain operations, the more elaborate mechanism ofpromote_op
is used, which especially aims at doing magical stuff if the input arrays have non-leaf element types. E.g.While the behavior of
+
here does look useful, I wonder whether it is worth the inconsistency and the effort. I tend to say no, but before starting any real work on the issue, I'd like some feedback.