Decode TIMESTAMPTZ using connection's time zone

jgaskins commented 1 year ago

This is an idea for how #265 could be implemented, mostly implemented on my lunch break today. 🙃

Closes #265

jgaskins commented 1 year ago

One notable thing about this PR is that I didn't do anything with SET TIME ZONE or SHOW TIME ZONE. It's all being set in memory on the connection object. No idea if that's a good thing or a bad thing.

jgaskins commented 1 year ago

That's probably a good idea. I deal mainly with ephemeral infrastructure and sometimes forget that the environment can change around a running program. 🤔

jgaskins commented 1 year ago

I was trying to see what the performance penalty of decoding the TSTZ with the Time::Location is and it looks like it's about ~20µs of CPU time for 1k TSTZ values, or 20ns each. Here's the benchmark I used, where the only difference is whether it executes time.in location (the time calculation itself is identical between TS and TSTZ), fetching 1k time values from the DB on each query to amortize the latency cost:

Benchmark code

```crystal require "benchmark" require "../src/pg" db = DB.open("postgres:///") Benchmark.ips do |x| x.report "timestamptz" { db.query_all "SELECT '2023-03-17 15:30 America/New_York'::timestamptz FROM generate_series(1, 1000)", as: Time } x.report "timestamp" { db.query_all "SELECT '2023-03-17 15:30 America/New_York'::timestamp FROM generate_series(1, 1000)", as: Time } end # This monkeypatch is taken from https://github.com/crystal-lang/crystal/pull/9846 # to measure CPU time used per iteration. module Benchmark # Benchmark IPS calculates the number of iterations per second for a given # block of code. The strategy is to use two stages: a warmup stage and a # calculation stage. # # The warmup phase defaults to 2 seconds. During this stage we figure out how # many cycles are needed to run the block for roughly 100ms, and record it. # # The calculation defaults to 5 seconds. During this stage we run the block # in sets of the size calculated in the warmup stage. The measurements for # those sets are then used to calculate the mean and standard deviation, # which are then reported. Additionally we compare the means to that of the # fastest. module IPS class Job # List of all entries in the benchmark. # After #execute, these are populated with the resulting statistics. property items : Array(Entry) @warmup_time : Time::Span @calculation_time : Time::Span def initialize(calculation = 5, warmup = 2, interactive = STDOUT.tty?) @interactive = !!interactive @warmup_time = warmup.seconds @calculation_time = calculation.seconds @items = [] of Entry end # Adds code to be benchmarked def report(label = "", &action) item = Entry.new(label, action) @items << item item end def execute run_warmup run_calculation run_comparison end def report max_label = ran_items.max_of &.label.size max_compare = ran_items.max_of &.human_compare.size max_bytes_per_op = ran_items.max_of &.bytes_per_op.humanize(base: 1024).size ran_items.each do |item| printf "%s %s (%s real, %s user, %s sys) (±%5.2f%%) %sB/op %s\n", item.label.rjust(max_label), item.human_mean, item.human_iteration_time, item.human_cpu_user_time, item.human_cpu_system_time, item.relative_stddev, item.bytes_per_op.humanize(base: 1024).rjust(max_bytes_per_op), item.human_compare.rjust(max_compare) end end # The warmup stage gathers information about the items that is later used # in the calculation stage private def run_warmup @items.each do |item| GC.collect count = 0 elapsed = Time.measure do target = Time.monotonic + @warmup_time while Time.monotonic < target item.call count += 1 end end item.set_cycles(elapsed, count) end end private def run_calculation @items.each do |item| GC.collect measurements = [] of Time::Span bytes = 0_i64 cycles = 0_i64 cpu_user_seconds = 0.0_f64 cpu_system_seconds = 0.0_f64 target = Time.monotonic + @calculation_time loop do elapsed = nil before_cpu_time = nil after_cpu_time = nil bytes_taken = Benchmark.memory do before_cpu_time = Process.times elapsed = Time.measure { item.call_for_100ms } after_cpu_time = Process.times cpu_user_seconds += (after_cpu_time.utime + after_cpu_time.cutime) - (before_cpu_time.utime + before_cpu_time.cutime) cpu_system_seconds += (after_cpu_time.stime + after_cpu_time.cstime) - (before_cpu_time.stime + before_cpu_time.cstime) end bytes += bytes_taken cycles += item.cycles measurements << elapsed.not_nil! break if Time.monotonic >= target end ips = measurements.map { |m| item.cycles.to_f / m.total_seconds } item.calculate_stats(ips) item.bytes_per_op = (bytes.to_f / cycles.to_f).round.to_u64 item.cpu_user_seconds = cpu_user_seconds / cycles item.cpu_system_seconds = cpu_system_seconds / cycles if @interactive run_comparison report print "\e[#{ran_items.size}A" end end end private def ran_items @items.select(&.ran?) end private def run_comparison fastest = ran_items.max_by { |i| i.mean } ran_items.each do |item| item.slower = (fastest.mean / item.mean).to_f end end end class Entry # Label of the benchmark property label : String # Code to be benchmarked property action : -> # Number of cycles needed to run for approx 100ms # Calculated during the warmup stage property! cycles : Int32 # Number of 100ms runs during the calculation stage property! size : Int32 # Statistical mean from calculation stage property! mean : Float64 # Statistical variance from calculation stage property! variance : Float64 # Statistical standard deviation from calculation stage property! stddev : Float64 # Relative standard deviation as a percentage property! relative_stddev : Float64 # Multiple slower than the fastest entry property! slower : Float64 # Number of bytes allocated per operation property! bytes_per_op : UInt64 # Number of seconds the CPU spent in executing userland code property! cpu_user_seconds : Float64 # Number of seconds the CPU spent in system calls property! cpu_system_seconds : Float64 @ran : Bool @ran = false def initialize(@label : String, @action : ->) end def ran? @ran end def call action.call end def call_for_100ms cycles.times { action.call } end def set_cycles(duration, iterations) @cycles = (iterations / duration.total_milliseconds * 100).to_i @cycles = 1 if cycles <= 0 end def calculate_stats(samples) @ran = true @size = samples.size @mean = samples.sum.to_f / size.to_f @variance = (samples.reduce(0) { |acc, i| acc + ((i - mean) ** 2) }).to_f / size.to_f @stddev = Math.sqrt(variance) @relative_stddev = 100.0 * (stddev / mean) end def human_mean mean.humanize(precision: 2, significant: false, prefixes: Number::SI_PREFIXES_PADDED).rjust(7) end def human_iteration_time iteration_time = 1.0 / mean iteration_time.humanize(precision: 2, significant: false) do |magnitude, _| magnitude = Number.prefix_index(magnitude).clamp(-9..0) {magnitude, magnitude == 0 ? "s " : "#{Number.si_prefix(magnitude)}s"} end.rjust(8) end def human_cpu_user_time "#{cpu_user_seconds.humanize(precision: 2, significant: false)}s".rjust(8) end def human_cpu_system_time "#{cpu_system_seconds.humanize(precision: 2, significant: false)}s".rjust(8) end def human_compare if slower == 1.0 "fastest" else sprintf "%5.2f× slower", slower end end end end end ```

➜  crystal-pg git:(decode-tstz-with-timezone) crystal run --release examples/bench_time_zone_check.cr
timestamptz   5.81k (172.01µs real,  77.47µs user,   5.83µs sys) (± 0.81%)  58.7kB/op   1.11× slower
  timestamp   6.47k (154.62µs real,  55.48µs user,   7.01µs sys) (± 1.83%)  58.7kB/op        fastest

will commented 1 year ago

Just got a chance actually dive in and explore this, thanks @jgaskins! I think everything makes sense. I'll add the cache clear method that @straight-shoota suggested and a changelog right now.

will / crystal-pg

Decode TIMESTAMPTZ using connection's time zone #266