Closed nickrobinson251 closed 1 month ago
~/r/TimeZones.jl> TZDATA_VERSION=2016j julia --project=benchmark/ -e 'using PkgBenchmark, TimeZones; export_markdown(stdout, judge(TimeZones, "origin/HEAD", verbose=false))'
PkgBenchmark: Running benchmarks...
[ Info: Installing 2016j tzdata region data
[ Info: Converting tz source files into TimeZone data
PkgBenchmark: creating benchmark tuning file /Users/nickr/repos/TimeZones.jl/benchmark/tune.json...
Tuning 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| Time: 0:00:36
Benchmarking 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| Time: 0:00:43
PkgBenchmark: Running benchmarks...
[ Info: Installing 2016j tzdata region data
[ Info: Converting tz source files into TimeZone data
PkgBenchmark: using benchmark tuning data in /Users/nickr/repos/TimeZones.jl/benchmark/tune.json
Benchmarking 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| Time: 0:00:48
A ratio greater than 1.0
denotes a possible regression (marked with :x:), while a ratio less
than 1.0
denotes a possible improvement (marked with :white_check_mark:). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).
ID | time ratio | memory ratio |
---|---|---|
["ZonedDateTime", "local", "standard"] |
0.84 (5%) :white_check_mark: | 1.00 (1%) |
["ZonedDateTime", "range", "VariableTimeZone/DatePeriod"] |
0.91 (5%) :white_check_mark: | 1.00 (1%) |
["arithmetic", "DatePeriod"] |
0.93 (5%) :white_check_mark: | 1.00 (1%) |
["arithmetic", "TimePeriod"] |
1.09 (5%) :x: | 1.00 (1%) |
["arithmetic", "broadcast", "FixedTimeZone/TimePeriod"] |
0.94 (5%) :white_check_mark: | 1.00 (1%) |
["arithmetic", "broadcast", "VariableTimeZone/DatePeriod"] |
0.93 (5%) :white_check_mark: | 1.00 (1%) |
["interpret", "local", "ambiguous"] |
0.87 (5%) :white_check_mark: | 1.00 (1%) |
["interpret", "local", "standard"] |
0.89 (5%) :white_check_mark: | 1.00 (1%) |
["parse", "ISOZonedDateTimeFormat"] |
0.82 (5%) :white_check_mark: | 0.87 (1%) :white_check_mark: |
["parse", "issue#25"] |
0.92 (5%) :white_check_mark: | 0.88 (1%) :white_check_mark: |
["transition_range", "local", "ambiguous"] |
0.83 (5%) :white_check_mark: | 1.00 (1%) |
["transition_range", "local", "non-existent"] |
0.79 (5%) :white_check_mark: | 1.00 (1%) |
["transition_range", "local", "standard"] |
0.81 (5%) :white_check_mark: | 1.00 (1%) |
["transition_range", "utc"] |
1.43 (5%) :x: | 1.00 (1%) |
["tryparsenext_fixedtz", "+06"] |
0.94 (5%) :white_check_mark: | 1.00 (1%) |
["tryparsenext_fixedtz", "+0600"] |
0.94 (5%) :white_check_mark: | 1.00 (1%) |
["tryparsenext_fixedtz", "+06:00"] |
0.95 (5%) :white_check_mark: | 1.00 (1%) |
["tryparsenext_fixedtz", "-06"] |
0.93 (5%) :white_check_mark: | 1.00 (1%) |
["tryparsenext_fixedtz", "-0600"] |
0.95 (5%) :white_check_mark: | 1.00 (1%) |
["tryparsenext_fixedtz", "UTC"] |
0.73 (5%) :white_check_mark: | 1.00 (1%) |
["tryparsenext_fixedtz", "Z"] |
0.56 (5%) :white_check_mark: | 1.00 (1%) |
["tz_data", "parse_components"] |
0.89 (5%) :white_check_mark: | 0.84 (1%) :white_check_mark: |
Here's a list of all the benchmark groups executed by this job:
["ZonedDateTime"]
["ZonedDateTime", "local"]
["ZonedDateTime", "range"]
["arithmetic"]
["arithmetic", "broadcast"]
["interpret", "local"]
["interpret"]
["parse"]
["transition_range", "local"]
["transition_range"]
["tryparsenext_fixedtz"]
["tryparsenext_tz"]
["tz_data"]
Julia Version 1.8.2
Commit 36034abf260 (2022-09-29 15:21 UTC)
Platform Info:
OS: macOS (x86_64-apple-darwin21.4.0)
uname: Darwin 21.4.0 Darwin Kernel Version 21.4.0: Mon Feb 21 20:35:58 PST 2022; root:xnu-8020.101.4~2/RELEASE_ARM64_T6000 x86_64 i386
CPU: Apple M1 Max:
speed user nice sys idle irq
#1-10 2400 MHz 1580681 s 0 s 1469385 s 30832641 s 0 s
Memory: 64.0 GB (2506.71875 MB free)
Uptime: 446789.0 sec
Load Avg: 2.73779296875 2.6376953125 2.58837890625
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-13.0.1 (ORCJIT, westmere)
Threads: 1 on 10 virtual cores
Julia Version 1.8.2
Commit 36034abf260 (2022-09-29 15:21 UTC)
Platform Info:
OS: macOS (x86_64-apple-darwin21.4.0)
uname: Darwin 21.4.0 Darwin Kernel Version 21.4.0: Mon Feb 21 20:35:58 PST 2022; root:xnu-8020.101.4~2/RELEASE_ARM64_T6000 x86_64 i386
CPU: Apple M1 Max:
speed user nice sys idle irq
#1-10 2400 MHz 1581658 s 0 s 1469753 s 30837878 s 0 s
Memory: 64.0 GB (2527.17578125 MB free)
Uptime: 446854.0 sec
Load Avg: 2.205078125 2.5078125 2.54345703125
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-13.0.1 (ORCJIT, westmere)
Threads: 1 on 10 virtual cores
Re-ran the benchmarks and the ones that are ❌ above are :white_check_mark: on re-run (and others are now not significant), so i think they're very noisy? Especially as most of these don't look like they even hit the codepaths changed in this PR?
Anyway, from this doesn't seem like we can say the two caches cause any performance issue
I'm not sure i understand why the VariableTimeZones still alloc when you look them up. I agree with you it is unexpected.. I'm not sure the isbits thing should matter, since we're pointing to the existing vector, not allocating a new one, and Julia is supposed to be able to stack-allocate an immutable struct that contains mutable fields, since like 1.4 or something. So i'm not sure i understand yet; it might be good to look through a profile together, once the rest of the issues in the PR are addressed.
Okay, i've spent a couple days playing around with different options but i can't yet see how to remove the remaining VariableTimeZones allocation, so i propose leaving that as a follow-up. And likewise i don't think a switch to MultiThreadedCaches.jl is straight forward, so would prefer to get this perf improvement in and leave that refactor to a follow-up.
@NHDaly and @omus please could you take another look at this PR? Thanks!
I'll try to take a look soon. I'll call out that I also want to get #382 merged soon as well.
Merging #423 (3218a4e) into master (804864c) will decrease coverage by
0.47%
. The diff coverage is98.41%
.:exclamation: Current head 3218a4e differs from pull request most recent head b158178. Consider uploading reports for the commit b158178 to get more accurate results
:mega: This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more
@@ Coverage Diff @@
## master #423 +/- ##
==========================================
- Coverage 95.58% 95.11% -0.47%
==========================================
Files 38 36 -2
Lines 1766 1803 +37
==========================================
+ Hits 1688 1715 +27
- Misses 78 88 +10
Impacted Files | Coverage Δ | |
---|---|---|
src/TimeZones.jl | 100.00% <ø> (ø) |
|
src/types/timezone.jl | 98.64% <98.36%> (+1.50%) |
:arrow_up: |
src/tzdata/compile.jl | 94.78% <100.00%> (-0.97%) |
:arrow_down: |
I've been a bit swamped this week. Overall, I'm in favour of this change and will definitely merge this before #382
Just following up to make sure this doesn't drop off your radar. Let me know if i can do anything to help, @omus :)
It turned out #382 became urgent to merge for Julia 1.9 so the original plan had changed. If @nickrobinson251 you have time to rebase this against the latest version of the code that would be great. If not, I'll try to adapt this code myself sometime.
Is this just a (simple) rebase away? It seems good as is, down to 0 allocations is great, already an improvement. Getting rid of the 1 remaining alloc for VariableTimeZone would be ideal, but shouldn't stop from merging as is? I didn't look closely into why it's there, maybe fixed already on 1.10?
I think this is just a (not-very-simple) rebase away. And I would love, love, love someone to take that on, as I don't have capacity at the moment -- please feel encouraged to do so!
Superseded by #451
Even though we're careful about caching the creation of TimeZones and made
FixedTimeZone
anisbitstype
/allocatedinline
, we still allocate everytimeTimeZone
is called, even if just returning aFixedTimeZone
from the cache!I think this happens because the values in the cache are of type
TimeZone
, i.e. it's not known if we're returning aFixedTimeZone
or aVariableTimeZone
.This PR proposes to instead maintain separate caches (per thread) for
FixedTimeZone
s andVariableTimeZone
s, rather than a single one (per thread) for allTimeZone
sEffectively this is some code duplication for a performance improvement.
VariableTimeZone
still allocates once, presumably since it's notisbitstype
(#271).Micro benchmarks
FixedTimeZone
This PR
master
(v1.9.1)VariableTimeZone
This PR
master
(v1.9.1)