I was digging through the dependencies of Dynare looking for low-hanging fruit to improve our TTFX.
This PR seems to save about 1.2 s for one of our dependencies, PATHSolver, because it uses DataDeps in its __init__.
Here is a small test module and the timing results of the first run in a fresh session.
module testMod

using DataDeps

function test()
    ENV["DATADEPS_ALWAYS_ACCEPT"] = "true"
    register(DataDep("name", "msg", "www.example.com", Any))
    DataDeps.datadep"name"
end

end
On master:
julia> include("testMod.jl")
julia> @time @eval testMod.test()
1.700212 seconds (2.80 M allocations: 180.028 MiB, 7.15% gc time, 99.44% compilation time)
With the first commit (removing the type specification):
julia> include("testMod.jl")
julia> @time @eval testMod.test()
1.015055 seconds (1.56 M allocations: 100.482 MiB, 9.25% gc time, 99.05% compilation time)
With both commits:
julia> include("testMod.jl")
julia> @time @eval testMod.test()
0.504765 seconds (551.64 k allocations: 36.326 MiB, 15.40% gc time, 99.93% compilation time)
And for the sake of completeness, with a simple PrecompileTools block containing the same 3 lines:
But I'm not sure it is OK to have code that hits the network and writes to disk run on the first using DataDeps.
Note: I'm using @time @eval because plain @time seems to ignore time spent on things like inference. @time @eval closely matches the actual wall-clock time.
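For anyone who wants to reproduce this kind of first-call measurement without DataDeps, here is a minimal sketch of the same pattern. The function `f` is a hypothetical stand-in, not part of this PR, and the sketch uses `@elapsed` instead of `@time` only so the two numbers can be stored and compared. Run it in a fresh session, since the first call is the one that pays for compilation.

```julia
# Hypothetical stand-in for the function whose first-call latency matters.
f(x) = sum(abs2, x)

# First call: wrapped in @eval so that compilation triggered by the call
# happens inside the timed region.
first_call = @elapsed @eval f(rand(10))

# Second call: the method is already compiled, so this is mostly run time.
second_call = @elapsed @eval f(rand(10))

println("first call:  ", first_call, " s")
println("second call: ", second_call, " s")
```

The gap between the two numbers is a rough estimate of the compilation overhead that a change like this PR aims to reduce.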
Getting rid of the type parameters makes sense to me.
Compiling optimized code when the data download is going to utterly dwarf everything else is overkill.