Open charleskawczynski opened 1 year ago
This comes from https://github.com/JuliaArrays/BlockArrays.jl/blob/ec8c7f554bf57f06b85d640c6fa7dcfd5fa9b5c3/src/blockbroadcast.jl#L37 Perhaps we may remove the special-casing, as the performance impact in the 1-term union will be minimal
Yeah, I tried fixing this. I'll open the PR for convenience (#312). First, I thought that the core issue was that union is used in sortedunion, and union does not preserve tuple types:
julia> union((1,), (2,))
2-element Vector{Int64}:
1
2
However, even after fixing that in the PR (which borrows some functions from TupleTools.jl and defines union on Tuples to return a Tuple), the result is still type unstable because the tuple length depends on the values in the tuple:
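A minimal sketch of why the length is value-dependent. The helper tuple_union below is hypothetical and is not the PR's implementation; it only assumes that a tuple-returning union must deduplicate by value, so two calls with the same argument types can return tuples of different lengths:

```julia
# Hypothetical helper (not the code in #312): a union on tuples that returns a tuple.
# `unique` deduplicates by value and returns a Vector, which is converted back to a Tuple.
tuple_union(a::Tuple, b::Tuple) = Tuple(unique((a..., b...)))

tuple_union((1,), (2,))  # (1, 2)
tuple_union((1,), (1,))  # (1,)   same argument types, different result length

# Inference therefore cannot pick a single concrete Tuple type for the result:
rt = only(Base.return_types(tuple_union, (Tuple{Int}, Tuple{Int})))
isconcretetype(rt)       # false
```

Both calls take two Tuple{Int} arguments, yet one returns a 2-tuple and the other a 1-tuple, so the return type cannot be concrete.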
So, while this could fix the type preservation, it won't actually fix the type instability. If you have ideas about how to make Broadcast.broadcast_shape type stable / inferrable, that'd be great.
I think it's better for this to return a vector, as the non-trivial branch does. Since unique works on values, the length can't be known at compile time. I wonder if we may just remove the if-else here? I haven't checked if this changes the results.
Yeah, that does fix it. I guess it's worth it. The general case of calling combine_blockaxes is either going to be type unstable in the case of Tuples (leading to inference triggers), or heap-allocating from union(a,b). Right now, it's a mixture, so it allocates and it's type-unstable.
I'll update the PR
Ok, the PR is updated.
Thanks for the tip @jishnub!
Actually, I prefer #313 if that's okay 🙂
So, it seems that length(b) == 1 ? a is needed for upstream tests to pass.
I updated https://github.com/JuliaArrays/BlockArrays.jl/pull/313, and basically specialized it so that the tests pass, but I have a feeling that I'm introducing inconsistent behavior between blocklasts()::Tuple and blocklasts()::Vector cases.
I think one potential solution would be to put the length of BlockedUnitRange in the type space:
struct BlockedUnitRange{CS,L} <: AbstractUnitRange{Int}
    first::Int
    lasts::CS
    global _BlockedUnitRange(f, cs::CS) where CS = new{CS, len(f, cs)}(f, cs)
end
len(f, l) = isempty(l) ? 0 : Integer(last(l)-f+1)
...
length(a::BlockedUnitRange{CS, L}) where {CS, L} = L
which will make the call to length (in combine_blockaxes) compile-time known, and remove the type instability. However, that may cause other issues. And I'm now seeing that things like DefaultBlockAxis may make this change more intrusive.
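A self-contained sketch of the idea, using a hypothetical stand-in type (LenInType is not the package's BlockedUnitRange and deliberately omits the AbstractUnitRange supertype to stay minimal). It shows the length becoming part of the type, and also the cost: the type parameter is computed from runtime values, so construction itself is not inferrable and every distinct length produces a distinct type.

```julia
# Hypothetical stand-in for the proposal above; not BlockArrays.jl code.
struct LenInType{CS,L}
    first::Int
    lasts::CS
    # The length L is computed from the values and stored as a type parameter.
    LenInType(f::Int, cs::CS) where {CS} = new{CS,len(f, cs)}(f, cs)
end

len(f, l) = isempty(l) ? 0 : Int(last(l) - f + 1)

# The length is read off the type, so it is known at compile time.
Base.length(::LenInType{CS,L}) where {CS,L} = L

r = LenInType(1, (2, 5))
length(r)   # 5
typeof(r)   # LenInType{Tuple{Int64, Int64}, 5} on a 64-bit machine
```

With this layout, a check such as length(b) == 1 can be resolved by the compiler once the type of b is known, at the price of moving the instability to wherever the range is constructed.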
Thoughts? jishnub
I wonder if a simple workaround might be to convert the return value of axistype to a Vector?
> I wonder if a simple workaround might be to convert the return value of axistype to a Vector?
Absolutely not: one of the downstream uses is infinite-dimensional block arrays, and even without that we want to support allocation-free axes when the block sizes are Fill.
Note this really is an issue inherited from Base's broadcasting: it never should have supported degenerate sizes like randn(5,1) .* randn(5,6), and should instead have restricted broadcasting to randn(5) .* randn(5,6). Similarly, it should only support randn(5,6) .* transpose(randn(6)).
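For reference, a short sketch of the three cases in plain Base Julia (no BlockArrays involved). All three broadcast to the same shape; in the degenerate case, whether a dimension is stretched depends on whether its length is 1, and for an ordinary Array that length is a runtime value rather than part of the type:

```julia
A = randn(5, 1) .* randn(5, 6)           # degenerate: the length-1 second dimension is stretched
B = randn(5) .* randn(5, 6)              # vector against matrix
C = randn(5, 6) .* transpose(randn(6))   # 1×6 row against matrix

size(A) == size(B) == size(C) == (5, 6)  # true

# The same rule applied to shapes directly:
Broadcast.broadcast_shape((5, 1), (5, 6))  # (5, 6)
```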
We could potentially disallow degenerate broadcasting for blocked arrays. We'd have to see what tests fail downstream.
Agreed, the main reason for fixing this issue is to avoid allocations. So ideally we have a solution that is fully inferred and stack allocated.
MWE: