KristofferC / TimerOutputs.jl

Formatted output of timed sections in Julia
Other
659 stars 54 forks source link

RFC: Add Tables.jl interface #136

Open tkf opened 3 years ago

tkf commented 3 years ago

This is an RFC PR for adding Tables.jl interface for the TimerOutput type.

It would be nice if TimerOutputs.jl supports exporting to a common format like Tables.jl. We can then further analyze and transform the timing data in common data analysis tools like DataFrames.jl.

KristofferC commented 3 years ago

Seems like a good idea. Any impact on time to load the package?

tkf commented 3 years ago

With TimerOutputs v0.5.12

julia> for x in 1:5
           run(`julia -e '@time using TimerOutputs'`)
       end
  0.032261 seconds (62.16 k allocations: 5.446 MiB, 10.44% compilation time)
  0.031950 seconds (62.16 k allocations: 5.448 MiB, 10.53% compilation time)
  0.032072 seconds (62.16 k allocations: 5.446 MiB, 10.61% compilation time)
  0.032025 seconds (62.16 k allocations: 5.446 MiB, 10.46% compilation time)
  0.032081 seconds (62.16 k allocations: 5.446 MiB, 10.42% compilation time)

With this branch

julia> for x in 1:5
           run(`julia --project -e '@time using TimerOutputs'`)
       end
  0.051612 seconds (89.50 k allocations: 7.356 MiB, 6.49% compilation time)
  0.051275 seconds (89.50 k allocations: 7.354 MiB, 6.52% compilation time)
  0.051299 seconds (89.50 k allocations: 7.354 MiB, 6.60% compilation time)
  0.051392 seconds (89.50 k allocations: 7.354 MiB, 6.54% compilation time)
  0.051374 seconds (89.50 k allocations: 7.354 MiB, 6.53% compilation time)

So, yes, there is extra 20 ms, from just using Tables (but no apparent "multiplicative effect" due to method table merging)

julia> for x in 1:5
           run(`julia --project -e '@time using Tables'`)
       end
  0.022166 seconds (33.17 k allocations: 2.526 MiB, 15.14% compilation time)
  0.022211 seconds (33.17 k allocations: 2.526 MiB, 15.12% compilation time)
  0.022256 seconds (33.17 k allocations: 2.526 MiB, 15.10% compilation time)
  0.022220 seconds (33.17 k allocations: 2.526 MiB, 15.05% compilation time)
  0.022201 seconds (33.17 k allocations: 2.526 MiB, 15.02% compilation time)
KristofferC commented 3 years ago

Sounds good. Something needs to be added to the README to make people aware of the feature though.

ericphanson commented 3 years ago

Serialization of TimerOutputs seems really useful! I wonder if a nested format like JSON would map better to the structure of timers, though, rather than as flat tables?

e.g., adapting this PR's code as

function timer_namedtuple(to::TimerOutput, name="")
           ks = sort!(collect(keys(to.inner_timers)))
           return (;
               name,
               ncalls=TimerOutputs.ncalls(to),
               time=TimerOutputs.time(to),
               allocated=TimerOutputs.allocated(to),
               totallocated=TimerOutputs.totallocated(to),
               tottime=TimerOutputs.tottime(to),
               inner_timers=[timer_namedtuple(to.inner_timers[k], k) for k in ks]
           )
 end

we get

julia> to = TimerOutput()
 ──────────────────────────────────────────────────────────────────
                           Time                   Allocations      
                   ──────────────────────   ───────────────────────
 Tot / % measured:      174μs / 0.00%           2.31KiB / 0.00%    

 Section   ncalls     time   %tot     avg     alloc   %tot      avg
 ──────────────────────────────────────────────────────────────────
 ──────────────────────────────────────────────────────────────────

julia> time_test() # example from the REDME

julia> time_test()

julia> to
 ──────────────────────────────────────────────────────────────────────
                               Time                   Allocations      
                       ──────────────────────   ───────────────────────
   Tot / % measured:        9.51s / 23.1%            263KiB / 2.25%    

 Section       ncalls     time   %tot     avg     alloc   %tot      avg
 ──────────────────────────────────────────────────────────────────────
 nest 2             2    1.41s  64.0%   703ms   2.47KiB  41.8%  1.23KiB
   level 2.2        2    803ms  36.5%   402ms      352B  5.82%     176B
   level 2.1        2    603ms  27.4%   301ms      352B  5.82%     176B
 nest 1             2    792ms  36.0%   396ms   3.44KiB  58.2%  1.72KiB
   level 2.2        2    403ms  18.3%   201ms      352B  5.82%     176B
   level 2.1        6    187ms  8.50%  31.2ms   1.03KiB  17.5%     176B
 ──────────────────────────────────────────────────────────────────────

julia> JSON.json(timer_namedtuple(to))
"{\"name\":\"\",\"ncalls\":0,\"time\":0,\"allocated\":0,\"totallocated\":6048,\"tottime\":2198635310,\"inner_timers\":[{\"name\":\"nest 1\",\"ncalls\":2,\"time\":792498634,\"allocated\":3520,\"totallocated\":1408,\"tottime\":589575624,\"inner_timers\":[{\"name\":\"level 2.1\",\"ncalls\":6,\"time\":186935800,\"allocated\":1056,\"totallocated\":0,\"tottime\":0,\"inner_timers\":[]},{\"name\":\"level 2.2\",\"ncalls\":2,\"time\":402639824,\"allocated\":352,\"totallocated\":0,\"tottime\":0,\"inner_timers\":[]}]},{\"name\":\"nest 2\",\"ncalls\":2,\"time\":1406136676,\"allocated\":2528,\"totallocated\":704,\"tottime\":1405990730,\"inner_timers\":[{\"name\":\"level 2.1\",\"ncalls\":2,\"time\":602971951,\"allocated\":352,\"totallocated\":0,\"tottime\":0,\"inner_timers\":[]},{\"name\":\"level 2.2\",\"ncalls\":2,\"time\":803018779,\"allocated\":352,\"totallocated\":0,\"tottime\":0,\"inner_timers\":[]}]}]}"

edit:

I think a nicer version of this is to use StructTypes,

using StructTypes

StructTypes.StructType(::Type{TimerOutput}) = StructTypes.DictType()
StructTypes.keyvaluepairs(to::TimerOutput) = pairs(timer_dict(to))

function timer_dict(to::TimerOutput, name="")
    ks = sort!(collect(keys(to.inner_timers)))
    return Dict{String,Any}(
        "name" => name,
        "ncalls"=>TimerOutputs.ncalls(to),
        "time"=>TimerOutputs.time(to),
        "allocated"=>TimerOutputs.allocated(to),
        "totallocated"=>TimerOutputs.totallocated(to),
        "tottime"=>TimerOutputs.tottime(to),
        "inner_timers"=>[timer_dict(to.inner_timers[k], k) for k in ks]
    )
end

which gives

julia> JSON3.write(to)
"{\"name\":\"\",\"time\":0,\"ncalls\":0,\"totallocated\":21776,\"tottime\":2213375273,\"allocated\":0,\"inner_timers\":[{\"name\":\"nest 1\",\"time\":807252644,\"ncalls\":2,\"totallocated\":10160,\"tottime\":597909970,\"allocated\":18912,\"inner_timers\":[{\"name\":\"level 2.1\",\"time\":195278704,\"ncalls\":6,\"totallocated\":0,\"tottime\":0,\"allocated\":9472,\"inner_timers\":[]},{\"name\":\"level 2.2\",\"time\":402631266,\"ncalls\":2,\"totallocated\":0,\"tottime\":0,\"allocated\":688,\"inner_timers\":[]}]},{\"name\":\"nest 2\",\"time\":1406122629,\"ncalls\":2,\"totallocated\":1040,\"tottime\":1405968298,\"allocated\":2864,\"inner_timers\":[{\"name\":\"level 2.1\",\"time\":602844704,\"ncalls\":2,\"totallocated\":0,\"tottime\":0,\"allocated\":352,\"inner_timers\":[]},{\"name\":\"level 2.2\",\"time\":803123594,\"ncalls\":2,\"totallocated\":0,\"tottime\":0,\"allocated\":688,\"inner_timers\":[]}]}]}"

This is actually entirely orthogonal to a tabular interface; TimerOutputs could support both.

tkf commented 3 years ago

I'm thinking this PR is more for analysis and transformation than serialization. I think serialization would be nice to have but I think it's better to do it in a separate PR (to define the StructTypes.jl protocol).