JuliaTime / TimeZones.jl

IANA time zone database access for the Julia programming language
Other
86 stars 51 forks source link

Minimize allocations when unpacking `TimeZone`s from cache (updated) #451

Closed tpgillam closed 1 month ago

tpgillam commented 7 months ago

Closes https://github.com/JuliaTime/TimeZones.jl/issues/422

This is a rebase/reimplementation of #423 onto master . (FYI @nickrobinson251, I'm doing this after seeing your comment!)

I started by trying to rebase your PR properly, but in the end it was easier to cherry-pick your commits that still applied (just the tests), then re-implement the rest. I'm sorry that this results in this new PR, but wasn't sure how else to proceed.

One note -- I find that it now takes two allocations to retrieve a VariableTimeZone from the cache, rather than 1. I spent some time with the alllocation tracker trying to understand this with no luck, so I've updated the test expectation accordingly. This isn't any worse than the current behaviour on master, and in any case the main thing is that retrieving a FixedTimeZone is now allocation free :)

tpgillam commented 7 months ago

Also, I've checked this against julia 1.10-rc1, and still see 2 allocs for constructing a VariableTimeZone unfortunately.

tpgillam commented 7 months ago

Also, I've checked this against julia 1.10-rc1, and still see 2 allocs for constructing a VariableTimeZone unfortunately.

I tried an alternative refactor, where I had a separate cache for all the classes (taking us to a total of 3) -- see on this branch. Whilst this reduces the number of allocations (down to 1 again for the VariableTimeZone), overall the timings are significantly worse, presumably due to the additional hash-table lookup.

Microbenchmarks on my machine with Julia 1.10-rc1 for

@btime TimeZone("UTC")  # FTZ case
@btime TimeZone("America/Winnipeg")  # VTZ case

this PR

FTZ:  18.727 ns (0 allocations: 0 bytes)
VTZ:  37.680 ns (2 allocations: 96 bytes)

separate class cache

FTZ:  31.096 ns (0 allocations: 0 bytes)
VTZ:  46.151 ns (1 allocation: 48 bytes)

current master

FTZ:  48.142 ns (2 allocations: 64 bytes)
VTZ:  48.238 ns (2 allocations: 64 bytes)
tpgillam commented 7 months ago

LGTM. I want to sit with this for a little bit

Thanks - let me know if there’s anything else from my end that would be helpful.

tpgillam commented 7 months ago
PkgBenchmark ``` > TZDATA_VERSION=2016j julia --project=benchmark/ -e 'using PkgBenchmark, TimeZones; export_markdown(stdout, judge(TimeZones, "origin/HEAD", verbose=false))' PkgBenchmark: Running benchmarks... PkgBenchmark: using benchmark tuning data in /home/tom/.julia/dev/TimeZones/benchmark/tune.json Benchmarking 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| Time: 0:00:45 PkgBenchmark: Running benchmarks... PkgBenchmark: using benchmark tuning data in /home/tom/.julia/dev/TimeZones/benchmark/tune.json Benchmarking 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| Time: 0:00:47 ``` # Benchmark Report for */home/tom/.julia/dev/TimeZones* ## Job Properties * Time of benchmarks: - Target: 30 Nov 2023 - 21:39 - Baseline: 30 Nov 2023 - 21:40 * Package commits: - Target: 2b0b76 - Baseline: f9d0dc * Julia commits: - Target: 8e5136 - Baseline: 8e5136 * Julia command flags: - Target: None - Baseline: None * Environment variables: - Target: None - Baseline: None ## Results A ratio greater than `1.0` denotes a possible regression (marked with :x:), while a ratio less than `1.0` denotes a possible improvement (marked with :white_check_mark:). Only significant results - results that indicate possible regressions or improvements - are shown below (thus, an empty table means that all benchmark results remained invariant between builds). | ID | time ratio | memory ratio | |--------------------------------------------------------------|------------------------------|------------------------------| | `["ZonedDateTime", "local", "ambiguous"]` | 1.08 (5%) :x: | 1.00 (1%) | | `["ZonedDateTime", "local", "standard"]` | 0.92 (5%) :white_check_mark: | 1.00 (1%) | | `["ZonedDateTime", "range", "VariableTimeZone/DatePeriod"]` | 1.09 (5%) :x: | 1.00 (1%) | | `["interpret", "local", "ambiguous"]` | 0.92 (5%) :white_check_mark: | 1.00 (1%) | | `["interpret", "local", "non-existent"]` | 0.91 (5%) :white_check_mark: | 1.00 (1%) | | `["interpret", "local", "standard"]` | 0.94 (5%) :white_check_mark: | 1.00 (1%) | | `["interpret", "utc"]` | 0.90 (5%) :white_check_mark: | 1.00 (1%) | | `["parse", "ISOZonedDateTimeFormat"]` | 1.00 (5%) | 0.94 (1%) :white_check_mark: | | `["parse", "issue#25"]` | 1.01 (5%) | 1.05 (1%) :x: | | `["transition_range", "utc"]` | 1.88 (5%) :x: | 1.00 (1%) | | `["tryparsenext_tz", "America/Argentina/ComodRivadavia"]` | 0.95 (5%) :white_check_mark: | 1.00 (1%) | | `["tryparsenext_tz", "UTC"]` | 0.94 (5%) :white_check_mark: | 1.00 (1%) | | `["tz_data", "parse_components"]` | 0.98 (5%) | 1.07 (1%) :x: | ## Benchmark Group List Here's a list of all the benchmark groups executed by this job: - `["ZonedDateTime"]` - `["ZonedDateTime", "local"]` - `["ZonedDateTime", "range"]` - `["arithmetic"]` - `["arithmetic", "broadcast"]` - `["interpret", "local"]` - `["interpret"]` - `["parse"]` - `["transition_range", "local"]` - `["transition_range"]` - `["tryparsenext_fixedtz"]` - `["tryparsenext_tz"]` - `["tz_data"]` ## Julia versioninfo ### Target ``` Julia Version 1.9.4 Commit 8e5136fa297 (2023-11-14 08:46 UTC) Build Info: Official https://julialang.org/ release Platform Info: OS: Linux (x86_64-linux-gnu) "Fedora release 39 (Thirty Nine)" uname: Linux 6.6.2-201.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Nov 22 21:31:42 UTC 2023 x86_64 unknown CPU: Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz: speed user nice sys idle irq #1-12 4298 MHz 290496 s 1686 s 90879 s 443666 s 20652 s Memory: 31.26819610595703 GB (19133.734375 MB free) Uptime: 46609.8 sec Load Avg: 1.26 0.93 0.72 WORD_SIZE: 64 LIBM: libopenlibm LLVM: libLLVM-14.0.6 (ORCJIT, skylake) Threads: 1 on 12 virtual cores ``` ### Baseline ``` Julia Version 1.9.4 Commit 8e5136fa297 (2023-11-14 08:46 UTC) Build Info: Official https://julialang.org/ release Platform Info: OS: Linux (x86_64-linux-gnu) "Fedora release 39 (Thirty Nine)" uname: Linux 6.6.2-201.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Nov 22 21:31:42 UTC 2023 x86_64 unknown CPU: Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz: speed user nice sys idle irq #1-12 4497 MHz 291253 s 1686 s 90977 s 450010 s 20669 s Memory: 31.26819610595703 GB (19001.28125 MB free) Uptime: 46670.1 sec Load Avg: 1.58 1.06 0.77 WORD_SIZE: 64 LIBM: libopenlibm LLVM: libLLVM-14.0.6 (ORCJIT, skylake) Threads: 1 on 12 virtual cores ``` For good measure I did a second run, and stuff moved around quite a bit. So I'm not reading too much into any of these numbers: | ID | time ratio | memory ratio | |--------------------------------------------------------------|------------------------------|------------------------------| | `["ZonedDateTime", "local", "ambiguous"]` | 0.91 (5%) :white_check_mark: | 1.00 (1%) | | `["interpret", "local", "ambiguous"]` | 1.29 (5%) :x: | 1.00 (1%) | | `["interpret", "local", "non-existent"]` | 1.37 (5%) :x: | 1.00 (1%) | | `["interpret", "local", "standard"]` | 1.21 (5%) :x: | 1.00 (1%) | | `["parse", "ISOZonedDateTimeFormat"]` | 1.01 (5%) | 0.94 (1%) :white_check_mark: | | `["parse", "issue#25"]` | 0.97 (5%) | 1.05 (1%) :x: | | `["transition_range", "utc"]` | 0.75 (5%) :white_check_mark: | 1.00 (1%) | | `["tryparsenext_fixedtz", "UTC"]` | 1.09 (5%) :x: | 1.00 (1%) | | `["tz_data", "parse_components"]` | 0.99 (5%) | 1.07 (1%) :x: |
tpgillam commented 7 months ago

Thanks both for the discussions so far! @omus are you happy to move forward with this, or are there are other thoughts/concerns we should discuss?

nickrobinson251 commented 3 months ago

Bump! @omus please could you merge this and make a new release?

omus commented 1 month ago

Baseline timings for my machine. I'll need to resolve the conflict yet.

Julia 1.10.0

Current master (ac06a91)

julia> @btime TimeZone("America/Winnipeg");
  41.246 ns (2 allocations: 64 bytes)

julia> @btime TimeZone("UTC");
  40.868 ns (2 allocations: 64 bytes)

This PR (2b0b76e)

julia> @btime TimeZone("UTC");
  17.994 ns (0 allocations: 0 bytes)

julia> @btime TimeZone("America/Winnipeg");
  28.475 ns (2 allocations: 96 bytes)
codecov-commenter commented 1 month ago

Codecov Report

Attention: Patch coverage is 97.22222% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 92.73%. Comparing base (2bc8f50) to head (1237173). Report is 7 commits behind head on master.

Files Patch % Lines
src/types/timezone.jl 97.22% 1 Missing :warning:

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #451 +/- ## ========================================== - Coverage 92.79% 92.73% -0.07% ========================================== Files 39 39 Lines 1818 1844 +26 ========================================== + Hits 1687 1710 +23 - Misses 131 134 +3 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

omus commented 1 month ago

This PR 511d2cc

julia> @time_imports using TimeZones
      0.4 ms  Scratch
     10.2 ms  Preferences
      0.4 ms  PrecompileTools
               ┌ 0.0 ms Parsers.__init__()
     45.3 ms  Parsers 40.31% compilation time
      4.2 ms  InlineStrings
      0.4 ms  TZJData
      0.5 ms  Compat
      0.2 ms  Compat → CompatLinearAlgebraExt
      0.4 ms  ExprTools
      0.6 ms  Mocking
               ┌ 1.1 ms TimeZones.TZData.__init__()
               ├ 0.0 ms TimeZones.__init__()
     22.6 ms  TimeZones

julia> using BenchmarkTools

julia> @btime TimeZone("UTC");
  18.203 ns (0 allocations: 0 bytes)

julia> @btime TimeZone("America/Winnipeg");
  29.104 ns (2 allocations: 96 bytes)

julia> @btime istimezone("Europe/Warsaw");
  76.418 ns (1 allocation: 48 bytes)

I slight increase in allocated bytes with istimezone from this: https://github.com/JuliaTime/TimeZones.jl/pull/457#issuecomment-2118428992

omus commented 1 month ago

The cache code is definitely becoming unwieldy. I've created a TimeZoneCache struct which doesn't affect the performance:

This PR 67f2c3f


julia> @time_imports using TimeZones
      0.4 ms  Scratch
      9.1 ms  Preferences
      0.4 ms  PrecompileTools
               ┌ 0.0 ms Parsers.__init__()
     46.7 ms  Parsers 38.23% compilation time
      4.1 ms  InlineStrings
      0.3 ms  TZJData
      0.4 ms  Compat
      0.2 ms  Compat → CompatLinearAlgebraExt
      0.4 ms  ExprTools
      0.6 ms  Mocking
               ┌ 1.0 ms TimeZones.TZData.__init__()
               ├ 0.0 ms TimeZones.__init__()
     22.9 ms  TimeZones

julia> using BenchmarkTools

julia> @btime TimeZone("UTC");
  18.597 ns (0 allocations: 0 bytes)

julia> @btime TimeZone("America/Winnipeg");
  28.894 ns (2 allocations: 96 bytes)

julia> @btime istimezone("Europe/Warsaw");
  74.802 ns (1 allocation: 48 bytes)
omus commented 1 month ago

These changes are also a step in the right direction for reducing the initial cache loading time

This PR 67f2c3f

julia> @btime TimeZones._reload_tz_cache(TimeZones._COMPILED_DIR[]);
  20.795 ms (313574 allocations: 12.52 MiB)

Current master ac06a91

julia> @btime TimeZones._reload_tz_cache(TimeZones._COMPILED_DIR[]);
  22.808 ms (313621 allocations: 12.52 MiB)
omus commented 1 month ago

I'm planning on rolling some other changes in before making a release here. I expect TimeZones.jl 1.17 should be out by 2024-05-27 (Monday).