Closed: alecloudenback closed this issue 4 years ago

The EzXML way is actually much slower :/
Old way:
```
@benchmark MortalityTables.parseXTbMLTable(MortalityTables.getXML(MortalityTables.open_and_read("src/tables/SOA/t3344.xml")),"t3344.xml")

BenchmarkTools.Trial:
  memory estimate:  9.02 MiB
  allocs estimate:  213310
  --------------
  minimum time:     20.437 ms (0.00% GC)
  median time:      21.689 ms (0.00% GC)
  mean time:        26.732 ms (6.65% GC)
  maximum time:     61.448 ms (18.10% GC)
  --------------
  samples:          187
  evals/sample:     1
```
vs EzXML way:
```
julia> @benchmark eztbl("src/tables/SOA/t3344.xml")
BenchmarkTools.Trial:
  memory estimate:  37.21 MiB
  allocs estimate:  745073
  --------------
  minimum time:     1.093 s (0.00% GC)
  median time:      1.097 s (0.00% GC)
  mean time:        1.097 s (0.00% GC)
  maximum time:     1.104 s (0.00% GC)
  --------------
  samples:          5
  evals/sample:     1
```
CSV does indeed look like the better way to go:
```
@benchmark CSV.File("src/tables/SOA/t3344.csv",skipto=25,footerskip=117)

BenchmarkTools.Trial:
  memory estimate:  71.12 KiB
  allocs estimate:  413
  --------------
  minimum time:     325.501 μs (0.00% GC)
  median time:      334.399 μs (0.00% GC)
  mean time:        354.782 μs (1.41% GC)
  maximum time:     11.254 ms (42.08% GC)
  --------------
  samples:          10000
  evals/sample:     1
```
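For reference, a minimal sketch of what a CSV-based loader along those lines could look like. The `csv_rates` helper is hypothetical (not part of MortalityTables), and it assumes the first column is the age/duration and the second is the rate; the `skipto`/`footerskip` offsets vary by table, as in the benchmark above.

```julia
using CSV

# Hypothetical helper: read the rate rows of an SOA-style CSV table,
# skipping the metadata header and any footer rows. Assumes two columns:
# age/duration, then rate. Offsets differ per table.
function csv_rates(path; skipto, footerskip=0)
    f = CSV.File(path; skipto=skipto, footerskip=footerskip, header=false)
    ages = [row[1] for row in f]
    rates = [row[2] for row in f]
    return ages, rates
end
```

This avoids XML entirely, which is where the ~60x speedup over the XTbML path comes from.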
The simplifications being worked on in #55 haven't reduced the load times:
```
@benchmark MortalityTables.parseXTbMLTable(MortalityTables.getXML(MortalityTables.open_and_read("src/tables/SOA/t3344.xml")),"t3344.xml")

BenchmarkTools.Trial:
  memory estimate:  8.93 MiB
  allocs estimate:  212258
  --------------
  minimum time:     20.807 ms (0.00% GC)
  median time:      21.637 ms (0.00% GC)
  mean time:        25.016 ms (5.51% GC)
  maximum time:     42.924 ms (17.30% GC)
  --------------
  samples:          200
  evals/sample:     1
```
Plan: replace XMLDict with simplified parsing of the XML, avoiding the expensive conversion to a Dict as an intermediate step.
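One possible shape of that simplified parsing, sketched with EzXML's XPath interface: pull the rate nodes straight out of the document instead of materializing a Dict for the whole tree. The element/attribute names here (`Y` nodes keyed by a `t` attribute) are assumptions about the XTbML layout and would need to match the actual schema.

```julia
using EzXML

# Hypothetical sketch: extract (duration, rate) pairs directly from an
# XTbML-style document via XPath, with no intermediate Dict.
# Assumes rate values live in <Y t="..."> elements -- adjust to the real schema.
function direct_rates(path)
    doc = readxml(path)
    [(parse(Int, y["t"]), parse(Float64, nodecontent(y)))
     for y in findall("//Y[@t]", doc)]
end
```

Whether this beats the XMLDict path would need benchmarking; the `eztbl` numbers above show that an EzXML approach is not automatically faster.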
Current status:
- `eztbl` handles the hard part of traversing the table and getting the rates