JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

Performance issue with typed_hvcat #56292

Open DNF2 opened 1 month ago

DNF2 commented 1 month ago

As discussed in this Discourse thread: https://discourse.julialang.org/t/strange-performance-of-literal-array-constructor/121575/1, the typed array literal, which lowers to typed_hvcat, is several times slower than the untyped one.

julia> foo() = Int[1 2 3; 4 5 6; 7 8 9]
foo (generic function with 1 method)

julia> bar() = [1 2 3; 4 5 6; 7 8 9]
bar (generic function with 1 method)

julia> @btime foo();
  209.524 ns (3 allocations: 224 bytes)

julia> @btime bar();
  36.052 ns (2 allocations: 144 bytes)
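
For context, the two literals lower to different Base functions: the typed form calls `Base.typed_hvcat` and the untyped form calls `Base.hvcat`, each receiving the row lengths as a tuple followed by the elements. A minimal sketch of the equivalent explicit calls (the variable names `typed` and `untyped` are only for illustration):

```julia
# Explicit calls equivalent to the two 3x3 literals.
# The tuple (3, 3, 3) gives the number of elements in each row.
typed   = Base.typed_hvcat(Int, (3, 3, 3), 1, 2, 3, 4, 5, 6, 7, 8, 9)
untyped = Base.hvcat((3, 3, 3), 1, 2, 3, 4, 5, 6, 7, 8, 9)

# Both agree with the literal syntax.
@assert typed == Int[1 2 3; 4 5 6; 7 8 9]
@assert untyped == [1 2 3; 4 5 6; 7 8 9]
```

Running `Meta.@lower Int[1 2 3; 4 5 6; 7 8 9]` in the REPL shows the lowered call directly.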

Performance is regained with @inline:

julia> foo_i() = @inline Int[1 2 3; 4 5 6; 7 8 9]
foo_i (generic function with 1 method)

julia> @btime foo_i();
  32.663 ns (2 allocations: 144 bytes)
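
A call-site `@inline` (available since Julia 1.8) asks the compiler to inline the calls inside the annotated expression, and it takes precedence over annotations on the callee's definition. A small sketch of the mechanism, using a hypothetical helper `add3` that is unrelated to `typed_hvcat`:

```julia
# Hypothetical helper; @noinline keeps it out of line by default.
@noinline add3(x) = x + 3

# Ordinary call: the definition-site @noinline is respected.
caller_plain(x) = add3(x)

# Call-site annotation: requests inlining of the calls in this expression,
# overriding the @noinline on the definition.
caller_inlined(x) = @inline add3(x)

# Both variants compute the same value; only the generated code differs.
@assert caller_plain(1) == caller_inlined(1) == 4
```

In the REPL, comparing `@code_typed caller_plain(1)` with `@code_typed caller_inlined(1)` should show whether the call was inlined.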

oscardssmith commented 1 month ago

It looks like what pushes it over the inlining threshold is this (unnecessary) check:

    if length(a) != length(xs)
        throw(ArgumentError("argument count does not match specified shape (expected $(length(a)), got $(length(xs)))"))
    end

This check is unnecessary: if the length doesn't match, one of the rows won't match either. It still adds 40 to the inlining cost, though. Interestingly, https://github.com/JuliaLang/julia/pull/55913 also solves this, since it makes the array construction better understood (and therefore cheaper for the inliner).
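
To make the discussion concrete, here is a simplified, hypothetical stand-in for the fill step, showing where a guard of this kind sits relative to the actual work. The name `fill_rowmajor!` and the loop structure are assumptions made for illustration; this is not the code in Base.

```julia
# Hypothetical, simplified stand-in for the fill step; not the code in Base.
function fill_rowmajor!(a::Matrix{T}, xs::Tuple) where {T}
    nr, nc = size(a)
    # The kind of guard quoted above: compare the element count
    # against the size of the destination array.
    if length(a) != length(xs)
        throw(ArgumentError("argument count does not match specified shape (expected $(length(a)), got $(length(xs)))"))
    end
    k = 1
    # Literal elements arrive row by row, so the row index is the outer loop.
    for i in 1:nr, j in 1:nc
        a[i, j] = xs[k]
        k += 1
    end
    return a
end

m = fill_rowmajor!(Matrix{Int}(undef, 3, 3), (1, 2, 3, 4, 5, 6, 7, 8, 9))
@assert m == [1 2 3; 4 5 6; 7 8 9]
```

The guard is tiny compared with the loop, but the inliner still has to account for its error path when estimating the cost of the function.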