randyzwitch opened this issue 4 years ago
Per https://randyzwitch.com/benchmarktools-julia-benchmarking/, it might just make sense to start directly with threading.
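A minimal sketch of what chunked, threaded loading could look like (hypothetical names: `load_threaded`, `chunk_ranges`, and the `loadchunk` callback are illustrations, not OmniSci.jl API; a real implementation would likely need one connection per task, since a single Thrift connection is unlikely to be safe for concurrent use):

```julia
using DataFrames

# Split row indices 1:n into ranges of at most `chunksize` rows.
chunk_ranges(n, chunksize) = [i:min(i + chunksize - 1, n) for i in 1:chunksize:n]

# Sketch only: load each chunk of `df` on its own thread. `loadchunk` stands
# in for a call like load_table_binary_columnar(conn, table, chunk).
function load_threaded(loadchunk, df::DataFrame; chunksize = 100_000)
    Threads.@threads for r in chunk_ranges(nrow(df), chunksize)
        loadchunk(df[r, :])
    end
end
```

The chunk size would presumably be tuned against the 100k-chunk numbers below.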
Seeing these levels of performance with OmniSci.jl v0.9.0:
```julia
# 100k chunks, 1 thread
julia> @benchmark load_table_binary_columnar($conn, "carmotion", $df)
BenchmarkTools.Trial:
  memory estimate:  6.48 GiB
  allocs estimate:  280468596
  --------------
  minimum time:     233.413 s (0.55% GC)
  median time:      233.413 s (0.55% GC)
  mean time:        233.413 s (0.55% GC)
  maximum time:     233.413 s (0.55% GC)
  --------------
  samples:          1
  evals/sample:     1

julia> @benchmark load_table($conn, "carmotion", $df)
BenchmarkTools.Trial:
  memory estimate:  76.67 GiB
  allocs estimate:  1715304092
  --------------
  minimum time:     1123.754 s (4.18% GC)
  median time:      1123.754 s (4.18% GC)
  mean time:        1123.754 s (4.18% GC)
  maximum time:     1123.754 s (4.18% GC)
  --------------
  samples:          1
  evals/sample:     1
```
Comparing with omnisql (which is row-wise):
```
omnisql> \timing
omnisql> copy carmotion from '/home/rzwitch/gtc_1mm.csv';
Result
Loaded: 1000000 recs, Rejected: 0 recs in 1.014000 secs
1 rows returned.
Execution time: 1015 ms, Total time: 1022 ms

omnisql> copy carmotion from '/home/rzwitch/gtc_1mm.csv' with (threads=1);
Result
Loaded: 1000000 recs, Rejected: 0 recs in 4.308000 secs
1 rows returned.
Execution time: 4308 ms, Total time: 4308 ms
```
Needs fixing before implementing threading: https://github.com/JuliaMath/DecFP.jl/issues/39
For `load_table_binary_columnar` and for parsing Thrift results into a DataFrame, evaluate any performance gotchas and improve them. For example, pre-allocating array sizes for `load_table` might improve performance.
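On the pre-allocation point, a generic sketch (not OmniSci.jl's actual code): when the row count is known up front, e.g. from the Thrift result, filling a pre-sized vector avoids the repeated growth and reallocation that `push!` incurs:

```julia
# Incremental growth: the array is reallocated as it grows.
function collect_pushed(vals)
    out = Int[]
    for v in vals
        push!(out, v)
    end
    return out
end

# Pre-allocated: one allocation up front, then indexed writes.
function collect_prealloc(vals)
    out = Vector{Int}(undef, length(vals))
    @inbounds for (i, v) in enumerate(vals)
        out[i] = v
    end
    return out
end
```

Given the ~280M allocations reported above for `load_table_binary_columnar`, this kind of change seems worth benchmarking.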