`resize!` is O(N) when downsizing.

JuliaLang / julia

The Julia Programming Language

https://julialang.org/

MIT License

45.75k stars 5.49k forks source link

`resize!` is O(N) when downsizing. #56359

Closed sgaure closed 2 weeks ago

sgaure commented 2 weeks ago

From https://discourse.julialang.org/t/why-does-a-vector-with-10-times-more-elements-takes-2x-5x-less-time-to-pre-allocate/121828/7?u=sgaure

This isn't as it should be. Downsizing a bits vector shouldn't be dependent on the size?

using Plots

function f(n)
    a = Vector{Int}(undef, n)
    s = time_ns()
    resize!(a, 8)
    time_ns() - s
end

x = 8:10:1000000
y = f.(x)

plot(x, y)
versioninfo()

Julia Version 1.11.1
Commit 8f5b7ca12ad (2024-10-16 10:53 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 24 × AMD Ryzen Threadripper PRO 5945WX 12-Cores
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, znver3)
Threads: 24 default, 0 interactive, 12 GC (on 24 virtual cores)

vtjnash commented 2 weeks ago

It looks like LLVM may have decided it was profitable to vectorize that for loop instead, since it doesn't contain any instructions that would hinder that :joy: (literally, it doesn't contain any instructions and could be removed). See @code_llvm raw=true Base._deleteend!(Int[], 0)

Aside: the resize! function, particularly when used repeatedly, is intended to override the normal heuristics in order to allow the use to guarantee O(n) behavior instead of the default O(1) behavior

KristofferC commented 2 weeks ago

Accessing a mutable field etc in

https://github.com/JuliaLang/julia/blob/2cdfe062952c3a1168da7545a10bfa0ec205b4db/base/array.jl#L219

is probably enough to not allow the loop at

https://github.com/JuliaLang/julia/blob/2cdfe062952c3a1168da7545a10bfa0ec205b4db/base/array.jl#L1242-L1244

to fold away. Could that loop just be guarded with an !isbitstype check?

Edit: Although:

julia> f(a,i) = @inbounds Base._unsetindex!(a, i)
f (generic function with 1 method)

julia> @code_llvm f(rand(10),1)
; Function Signature: f(Array{Float64, 1}, Int64)
;  @ REPL[1]:1 within `f`
define nonnull ptr @julia_f_1021(ptr noundef nonnull align 8 dereferenceable(24) %"a::Array", i64 signext %"i::Int64") #0 {
top:
; ┌ @ array.jl:220 within `_unsetindex!`
   ret ptr %"a::Array"
; └
}

KristofferC commented 2 weeks ago

Aside: the resize! function, particularly when used repeatedly, is intended to override the normal heuristics in order to allow the use to guarantee O(n) behavior instead of the default O(1) behavior

I thought that was sizehint! if I understand what you are referring to.

vtjnash commented 2 weeks ago

Ah, maybe that is sizehint!. I assumed resize called that, but maybe not.

But yes, it is just the loop that remains, nothing from the function

oscardssmith commented 2 weeks ago

I think the only reason this doesn't get optimized is that we don't know that

│   %7 = $(Expr(:gc_preserve_begin, :(%6)))
│        $(Expr(:gc_preserve_end, :(%7)))

with nothing in between can be deleted (or alternatively, the problem is that we gc_preserve even in the bitstype case)

oscardssmith commented 2 weeks ago

fixed by https://github.com/JuliaLang/julia/pull/56364

oscardssmith commented 2 weeks ago

@gbaraldi could the allocopt pass get smarter so the original version of this gets solved?

gbaraldi commented 2 weeks ago

Potentially yes