Open cihga39871 opened 3 hours ago
After compilation, and in a function not global scope): 3.737725 seconds (11.06 M allocations: 483.446 MiB, 3.83% gc time, 0.00% compilation time) # why showing "0.0%"? I think non-zero rounded down.
vs.
0.846987 seconds (6.42 M allocations: 424.382 MiB, 2.08% gc time, 0.57% compilation time)
The cause might be more allocations (5.48 on average per line vs 3.17 in 1.10.5 still only 1.9 on average per file if I calculated correctly)
julia> @time line = readline(io)
0.000018 seconds (6 allocations: 288 bytes)
" * Permission is hereby granted, free of charge, to any person obtaining a copy"
vs fewer allocations in 1.10:
julia> @time line = readline(io)
0.000010 seconds (4 allocations: 536 bytes)
" * Permission is hereby granted, free of charge, to any person obtaining a copy"
it seems to be always 1 or 2 more allocations, but actually similar speed. Maybe the speed is miscalculated, or the penalty somehow deferred, adding GC pressure later? No, 99% of non-GC related for the loop:
3.800862 seconds (13.08 M allocations: 514.214 MiB, 1.00% gc time)
It seems to also have a fixed compilation overhead, e.g. mainly for:
julia> @time io = open(`cat $large_file`)
0.144038 seconds (109.00 k allocations: 5.722 MiB, 99.33% compilation time)
Some precompilation needed or helpful, but least a small fraction of the total time unless I missed something.
Do you know what caused the slow down? I notice your read implementations don't acquire the required locks that the other functions use, so they are somewhat dangerous to call those functions (unlike the default implementations), though most typical uses are probably quite fine as the locks may be often unnecessary for single threaded use
Hi,
I found
Base.Process
created byopen(x::Cmd)
is extremely slower than v1.10.4.Minimal example in Linux:
First create a large file:
Then, run the following command in Julia v1.11.1 and v1.10.4:
Here is the differences:
2.123290 seconds (12.57 M allocations: 496.226 MiB, 5.34% gc time, 2.38% compilation time)
0.579683 seconds (8.07 M allocations: 440.373 MiB, 3.59% gc time, 3.00% compilation time)
I also wrote a package to solve the regression, which is BiBufferedStreams.jl
Its readme has a thorough benchmark, eliminating the effects of precompilation.