JuliaActuary / MortalityTables.jl

Easily Reference and use Actuarial Mortality Tables

https://juliaactuary.github.io/MortalityTables.jl/stable

WIP: Xtbml optimizations #41

Closed alecloudenback closed 4 years ago

alecloudenback commented 4 years ago

Plan: Replace XMLDict with simplified, direct parsing of the XML to avoid the expensive conversion to a Dict as an intermediate step.

Current Status:

- See extbl for the hard part: traversing the table and getting the rates
  - it also needs to return the metadata info
- Then load the returned tuples of values into the Mortality Table structures
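
A minimal sketch of the direct-parsing idea, assuming EzXML.jl and a heavily trimmed XTbML-like fragment. The element names (`ContentClassification`, `TableName`, `Values`, `Axis`, `Y` with a `t` attribute) and the variable names are illustrative assumptions and not the code in this PR. It pulls the metadata and the rates straight from the parsed document with XPath queries, with no intermediate Dict:

```julia
using EzXML

# Hypothetical, trimmed XTbML-like fragment; real SOA files carry far more metadata.
xml = """
<XTbML>
  <ContentClassification>
    <TableName>Example table</TableName>
  </ContentClassification>
  <Table>
    <Values>
      <Axis>
        <Y t="0">0.002</Y>
        <Y t="1">0.001</Y>
      </Axis>
    </Values>
  </Table>
</XTbML>
"""

doc = parsexml(xml)

# Metadata: read one field directly via XPath.
table_name = nodecontent(findfirst("//ContentClassification/TableName", doc))

# Rates: each <Y> node holds a rate, and its `t` attribute holds the age.
ys = findall("//Table/Values/Axis/Y", doc)
ages = [parse(Int, y["t"]) for y in ys]
rates = [parse(Float64, nodecontent(y)) for y in ys]

# Tuple of values that could then be loaded into a table structure.
result = (name = table_name, ages = ages, rates = rates)
```

Select and ultimate tables nest axes more deeply, so a real traversal would need to handle more than this single flat axis.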

alecloudenback commented 4 years ago

This is actually much slower :/ (median of 1.097 s vs. 21.689 ms, so roughly 50x slower, with about 4x the memory)

Old way:

julia> @benchmark MortalityTables.parseXTbMLTable(MortalityTables.getXML(MortalityTables.open_and_read("src/tables/SOA/t3344.xml")),"t3344.xml")
BenchmarkTools.Trial: 
  memory estimate:  9.02 MiB
  allocs estimate:  213310
  --------------
  minimum time:     20.437 ms (0.00% GC)
  median time:      21.689 ms (0.00% GC)
  mean time:        26.732 ms (6.65% GC)
  maximum time:     61.448 ms (18.10% GC)
  --------------
  samples:          187
  evals/sample:     1

vs EzXML way:

julia> @benchmark eztbl("src/tables/SOA/t3344.xml")
BenchmarkTools.Trial: 
  memory estimate:  37.21 MiB
  allocs estimate:  745073
  --------------
  minimum time:     1.093 s (0.00% GC)
  median time:      1.097 s (0.00% GC)
  mean time:        1.097 s (0.00% GC)
  maximum time:     1.104 s (0.00% GC)
  --------------
  samples:          5
  evals/sample:     1

alecloudenback commented 4 years ago

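CSV does indeed look like the better way to go (median of 334.399 μs vs. 21.689 ms for the current XML parsing, so roughly 65x faster, and 71.12 KiB vs. 9.02 MiB of memory):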

julia> @benchmark CSV.File("src/tables/SOA/t3344.csv",skipto=25,footerskip=117)
BenchmarkTools.Trial: 
  memory estimate:  71.12 KiB
  allocs estimate:  413
  --------------
  minimum time:     325.501 μs (0.00% GC)
  median time:      334.399 μs (0.00% GC)
  mean time:        354.782 μs (1.41% GC)
  maximum time:     11.254 ms (42.08% GC)
  --------------
  samples:          10000
  evals/sample:     1

alecloudenback commented 4 years ago

The simplifications being worked on in #55 haven't reduced the load times (median of 21.637 ms vs. 21.689 ms before, with memory only down from 9.02 MiB to 8.93 MiB):

julia> @benchmark MortalityTables.parseXTbMLTable(MortalityTables.getXML(MortalityTables.open_and_read("src/tables/SOA/t3344.xml")),"t3344.xml")
BenchmarkTools.Trial: 
  memory estimate:  8.93 MiB
  allocs estimate:  212258
  --------------
  minimum time:     20.807 ms (0.00% GC)
  median time:      21.637 ms (0.00% GC)
  mean time:        25.016 ms (5.51% GC)
  maximum time:     42.924 ms (17.30% GC)
  --------------
  samples:          200
  evals/sample:     1