JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.65k stars 5.48k forks source link

Julia v1.11.1 regression: `open(::Cmd)` is 4X slower than v1.10.4 #56352

Open cihga39871 opened 3 hours ago

cihga39871 commented 3 hours ago

Hi,

I found Base.Process created by open(x::Cmd) is extremely slower than v1.10.4.

Minimal example in Linux:

First create a large file:

file = joinpath(Sys.BINDIR, "../include/julia/uv.h")
large_file = "z.large.file"

for i in 1:1000
    run(pipeline(`cat $file`, stdout = large_file, append=true))
end

Then, run the following command in Julia v1.11.1 and v1.10.4:

large_file = "z.large.file"

c = 0
io = open(`cat $large_file`)

@time while !eof(io)
    line = readline(io)
    c += length(line)
end

close(io)

Here is the differences:

I also wrote a package to solve the regression, which is BiBufferedStreams.jl

Its readme has a thorough benchmark, eliminating the effects of precompilation.

PallHaraldsson commented 2 hours ago

After compilation, and in a function not global scope): 3.737725 seconds (11.06 M allocations: 483.446 MiB, 3.83% gc time, 0.00% compilation time) # why showing "0.0%"? I think non-zero rounded down.

vs.

0.846987 seconds (6.42 M allocations: 424.382 MiB, 2.08% gc time, 0.57% compilation time)

The cause might be more allocations (5.48 on average per line vs 3.17 in 1.10.5 still only 1.9 on average per file if I calculated correctly)

julia> @time line = readline(io)
  0.000018 seconds (6 allocations: 288 bytes)
" * Permission is hereby granted, free of charge, to any person obtaining a copy"

vs fewer allocations in 1.10:

julia> @time line = readline(io)
  0.000010 seconds (4 allocations: 536 bytes)
" * Permission is hereby granted, free of charge, to any person obtaining a copy"

it seems to be always 1 or 2 more allocations, but actually similar speed. Maybe the speed is miscalculated, or the penalty somehow deferred, adding GC pressure later? No, 99% of non-GC related for the loop:

3.800862 seconds (13.08 M allocations: 514.214 MiB, 1.00% gc time)

It seems to also have a fixed compilation overhead, e.g. mainly for:

julia> @time io = open(`cat $large_file`)
  0.144038 seconds (109.00 k allocations: 5.722 MiB, 99.33% compilation time)

Some precompilation needed or helpful, but least a small fraction of the total time unless I missed something.

vtjnash commented 2 hours ago

Do you know what caused the slow down? I notice your read implementations don't acquire the required locks that the other functions use, so they are somewhat dangerous to call those functions (unlike the default implementations), though most typical uses are probably quite fine as the locks may be often unnecessary for single threaded use