JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

Extra allocations with hvcat of mixed arrays and scalars #39713

Open BioTurboNick opened 3 years ago

BioTurboNick commented 3 years ago

On 1.6.0-rc1:

const a, b, c, d = zeros(Int, 2, 2), [3 4], [2 ; 4], 5
using BenchmarkTools
# mixed arrays and scalars
@btime [a c ; b d]   # 31 allocations and 1.25 KiB -- uses generic fallback method
@btime [a c ; [b d]] # 21 allocations and 880 bytes
@btime [[a c] ; [b d]] # 16 allocations and 816 bytes -- explicit hcat nested within vcat
# scalars wrapped in arrays
@btime [a c ; b [d]] # 10 allocations and 512 bytes -- uses as::AbstractArray{T}... method
@btime [a c ; [b [d]]] # 9 allocations and 560 bytes
@btime [[a c] ; [b [d]]] # 4 allocations and 496 bytes -- explicit hcat nested within vcat

In theory, a single hvcat call should always be more efficient than nesting vcat and hcat calls, but that is not the case here. I also don't think there's a good reason for a scalar to behave differently from a 1-element array in hvcat.
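To make the comparison concrete, here is a small sketch of what the three kinds of expression above reduce to. It assumes the usual lowering of `[a c ; b d]` to `hvcat((2, 2), a, c, b, d)`; the variable names `direct`, `nested`, and `wrapped` are only illustrative. All three should build the same matrix, so any difference is purely in how much work is done along the way.

```julia
# Same inputs as in the report above (non-const here, since only the values matter).
a, b, c, d = zeros(Int, 2, 2), [3 4], [2; 4], 5

# One hvcat call: two block rows with two blocks each, scalar passed as-is.
direct = hvcat((2, 2), a, c, b, d)

# Explicit nesting: build each block row with hcat, then stack the rows with vcat.
nested = vcat(hcat(a, c), hcat(b, d))

# Same hvcat call, but with the scalar wrapped in a 1-element array.
wrapped = hvcat((2, 2), a, c, b, [d])

# The literal syntax and all three spellings agree on the result.
@assert direct == [a c; b d]
@assert direct == nested == wrapped
```

Because the results are identical, the nested and wrapped forms can serve as a reference when checking any change to the mixed scalar/array path.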

I'll try to work on this, but I'm open to ideas.

(Also worth noting: the first three are quite a bit worse on 1.5.3, while the last three are the same.)

BioTurboNick commented 2 years ago

Checking back in on this with 1.8.0-rc4 performance:

@btime [a c ; b d]
  2.244 μs (30 allocations: 1.33 KiB)

@btime [a c ; [b d]]
  1.950 μs (21 allocations: 816 bytes)

@btime [[a c] ; [b d]]
  955.556 ns (16 allocations: 736 bytes)

@btime [a c ; b [d]]
  1.280 μs (10 allocations: 448 bytes)

@btime [a c ; [b [d]]]
  1.080 μs (9 allocations: 464 bytes)

@btime [[a c] ; [b [d]]]
  122.060 ns (4 allocations: 384 bytes)
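
For a quick re-check without BenchmarkTools, something like the following sketch may be enough. It uses `@allocated` from Base, which reports bytes rather than allocation counts, so the figures will not match the `@btime` output exactly and will vary between Julia versions. The helper names `mixed_hvcat` and `nested_cats` are hypothetical and exist only for this example.

```julia
a, b, c, d = zeros(Int, 2, 2), [3 4], [2; 4], 5

# Hypothetical helpers wrapping the slowest and fastest forms measured above.
mixed_hvcat(a, b, c, d) = [a c; b d]          # single hvcat, bare scalar
nested_cats(a, b, c, d) = [[a c]; [b [d]]]    # nested hcat/vcat, scalar wrapped

# Call each once first so that compilation is not included in the measurement.
mixed_hvcat(a, b, c, d)
nested_cats(a, b, c, d)

bytes_mixed  = @allocated mixed_hvcat(a, b, c, d)
bytes_nested = @allocated nested_cats(a, b, c, d)

println("single hvcat, bare scalar:     ", bytes_mixed, " bytes")
println("nested hcat/vcat, wrapped [d]: ", bytes_nested, " bytes")
```

Wrapping each expression in a function keeps the measurement focused on the concatenation itself rather than on global-scope overhead.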