queryverse / Queryverse.jl

A meta package for data science in Julia

Helps to reorder packages, saves almost two sec. #43

Open PallHaraldsson opened 4 years ago

PallHaraldsson commented 4 years ago

I'm not sure this is known, but the order in which the packages are listed seems to matter a lot (even though the end result should be the same?):

$ julia -q -O0
julia> @time using CSVFiles, ExcelFiles, StatFiles, ParquetFiles, DataFrames, VegaLite
  7.033658 seconds (14.33 M allocations: 857.918 MiB, 5.21% gc time)

$ julia -q -O0
julia> @time using VegaLite, DataFrames, ParquetFiles, StatFiles, ExcelFiles, CSVFiles
  8.916748 seconds (18.03 M allocations: 1.045 GiB, 5.08% gc time)

$ julia -q -O0
julia> @time using VegaLite, DataFrames, ParquetFiles, StatFiles, ExcelFiles, CSVFiles
  8.845061 seconds (18.03 M allocations: 1.045 GiB, 5.09% gc time)

$ julia -q -O0
julia> @time using VegaLite, DataFrames, ParquetFiles, StatFiles, ExcelFiles, CSVFiles
  8.816272 seconds (18.03 M allocations: 1.045 GiB, 5.07% gc time)

$ julia -q -O0
julia> @time using VegaLite, DataFrames, ParquetFiles, StatFiles, ExcelFiles, CSVFiles
  9.025282 seconds (18.03 M allocations: 1.045 GiB, 4.99% gc time)
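
To put a number on the difference, here is a small sketch that averages the wall-clock times copied from the transcripts above. The variable names are only for illustration, and a single run for the first ordering is of course a thin sample.

```julia
using Statistics  # standard library, provides mean

# Wall-clock seconds copied from the transcripts above.
csv_first  = [7.033658]                                # CSVFiles listed first
vega_first = [8.916748, 8.845061, 8.816272, 9.025282]  # VegaLite listed first

# Difference between the mean load times of the two orderings.
saving = mean(vega_first) - mean(csv_first)
println("mean saving: ", round(saving, digits = 2), " s")
```

This comes out at roughly 1.87 s, which is the "almost two sec." in the title.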

PallHaraldsson commented 4 years ago

[In the version I'm dev'ing, I have @time in two places for debugging.]

Very strange: I've never before seen a load that includes precompiling be faster than loading the (by then) precompiled code, and by almost 3 sec. at that:

julia> @time @time using Queryverse
[ Info: Precompiling Queryverse [612083be-0b0f-5412-89c1-4e7c75506a58]
  6.007613 seconds (11.89 M allocations: 712.061 MiB, 5.70% gc time)
  0.125396 seconds (203.22 k allocations: 14.252 MiB)
  8.077601 seconds (819.60 k allocations: 49.927 MiB, 0.38% gc time)
  8.098104 seconds (819.89 k allocations: 49.944 MiB, 0.38% gc time)

julia> 
pharaldsson_sym@SYMLINUX011:~/nowcasting/eagdpsmall_jl$ julia -q
julia> @time @time using Queryverse
 10.788762 seconds (15.31 M allocations: 924.085 MiB, 3.68% gc time)
 10.820495 seconds (15.34 M allocations: 926.251 MiB, 3.67% gc time)

julia> @time @time using Queryverse
  0.045575 seconds (31.53 k allocations: 1.784 MiB)
  0.045721 seconds (31.57 k allocations: 1.786 MiB)

julia> 
pharaldsson_sym@SYMLINUX011:~/nowcasting/eagdpsmall_jl$ julia -q
julia> @time @time using Queryverse
 10.610409 seconds (15.31 M allocations: 924.081 MiB, 3.66% gc time)
 10.641014 seconds (15.34 M allocations: 926.247 MiB, 3.65% gc time)

Probably because, earlier in the same session, I had first run:

$ julia -q
julia> using Reexport; @time @reexport using CSVFiles, ExcelFiles, StatFiles, ParquetFiles, DataFrames, VegaLite
 10.263021 seconds (14.32 M allocations: 857.337 MiB, 3.59% gc time)

shell> vi ~/.julia/dev/Queryverse/src/Queryverse.jl

where I believe I edited the file. I thought that would invalidate the precompile cache, which is why the precompiling happened, but apparently not all of it was invalidated.

PallHaraldsson commented 4 years ago

More strangeness: adding one more package (DataValues) makes using faster (and with fewer allocations):

$ julia -q -O0
julia> @time using DataValues, CSVFiles, ExcelFiles, StatFiles, ParquetFiles, DataFrames, VegaLite
  6.862661 seconds (14.07 M allocations: 842.613 MiB, 4.93% gc time)

$ julia -q -O0
julia> @time using DataValues, CSVFiles, ExcelFiles, StatFiles, ParquetFiles, DataFrames, VegaLite
  6.817513 seconds (14.07 M allocations: 842.667 MiB, 4.95% gc time)

$ julia -q -O0
julia> @time using CSVFiles, ExcelFiles, StatFiles, ParquetFiles, DataFrames, VegaLite
  7.072484 seconds (14.33 M allocations: 857.858 MiB, 4.91% gc time)

The same happens when using Reexport:

$ julia -q -O0
julia> using Reexport; @time @reexport using CSVFiles, ExcelFiles, StatFiles, ParquetFiles, DataFrames, VegaLite
  7.116889 seconds (14.32 M allocations: 857.302 MiB, 5.15% gc time)

$ julia -q -O0
julia> using Reexport; @time @reexport using CSVFiles, ExcelFiles, StatFiles, ParquetFiles, DataFrames, VegaLite
  7.138370 seconds (14.32 M allocations: 857.345 MiB, 5.41% gc time)

$ julia -q -O0
julia> using Reexport; @time @reexport using CSVFiles, ExcelFiles, StatFiles, ParquetFiles, DataFrames, VegaLite
  7.294075 seconds (14.32 M allocations: 857.356 MiB, 5.42% gc time)

$ julia -q -O0
julia> using Reexport; @time @reexport using DataValues, CSVFiles, ExcelFiles, StatFiles, ParquetFiles, DataFrames, VegaLite
  6.826384 seconds (14.07 M allocations: 842.087 MiB, 5.39% gc time)

$ julia -q -O0
julia> using Reexport; @time @reexport using DataValues, CSVFiles, ExcelFiles, StatFiles, ParquetFiles, DataFrames, VegaLite
  6.999893 seconds (14.07 M allocations: 842.102 MiB, 5.28% gc time)

$ julia -q -O0
julia> using Reexport; @time @reexport using CSVFiles, ExcelFiles, StatFiles, ParquetFiles, DataFrames, VegaLite
  7.203445 seconds (14.32 M allocations: 857.352 MiB, 5.34% gc time)

$ julia -q -O0
julia> using Reexport; @time @reexport using CSVFiles, ExcelFiles, StatFiles, ParquetFiles, DataFrames, VegaLite
  7.086503 seconds (14.32 M allocations: 857.347 MiB, 5.46% gc time)
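
To repeat these comparisons without retyping each command, something like the following sketch could be used. It starts a fresh Julia process per measurement, as in the transcripts above. The helper names `timing_code` and `time_using` are hypothetical (not part of Queryverse), and the sketch assumes the packages are installed in the environment the child process picks up and that none of them print to stdout while loading.

```julia
# Sketch: time `using` for a given package order, each run in a fresh Julia
# process so that earlier loads cannot affect later measurements.

# Build the one-liner the child process runs: time a top-level `using`
# with time_ns() and print the elapsed seconds to stdout.
timing_code(pkgs) =
    "t0 = time_ns(); using $(join(pkgs, ", ")); print((time_ns() - t0) / 1e9)"

# Hypothetical helper: run `n` fresh processes and return the seconds of each.
# `flags` mirrors the `julia -q -O0` invocations used in this thread.
function time_using(pkgs; n = 3, flags = ["-q", "-O0"])
    julia = Base.julia_cmd()   # same Julia binary as the current session
    code = timing_code(pkgs)
    return [parse(Float64, read(`$julia $flags -e $code`, String)) for _ in 1:n]
end
```

For example, with `a = ["CSVFiles", "ExcelFiles", "StatFiles", "ParquetFiles", "DataFrames", "VegaLite"]`, calling `time_using(a)` and `time_using(reverse(a))` would compare the two orderings from the first comment, and `time_using(vcat("DataValues", a))` the DataValues variant.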