Closed iangow closed 4 years ago
@Yvonne-Han I'm pretty sure we've solved this one before somewhere else. It's hard to keep track of all these things!
In any case, I think dtype = {'last_update': DateTime(timezone=True)
might be the approach that @bdcallen has used.
Sys.setenv(PGHOST = "10.101.13.99", PGDATABASE="crsp")
library(dplyr, warn.conflicts = FALSE)
library(DBI)
pg <- dbConnect(RPostgres::Postgres())
rs <- dbExecute(pg, "SET search_path TO streetevents, public")
timestamp_test <- tbl(pg, "timestamp_test")
timestamp_test_alt <- tbl(pg, "timestamp_test_alt")
calls <- tbl(pg, "calls")
total <-
timestamp_test %>%
count() %>%
pull()
matched <-
timestamp_test %>%
semi_join(calls) %>%
count() %>%
pull()
#> Joining, by = c("file_name", "last_update")
sprintf("Test passed: %s", total==matched)
#> [1] "Test passed: TRUE"
total_alt <-
timestamp_test_alt %>%
count() %>%
pull()
matched_alt <-
timestamp_test_alt %>%
semi_join(calls) %>%
count() %>%
pull()
#> Joining, by = c("file_name", "last_update")
sprintf("Alt test passed: %s", total_alt==matched_alt)
#> [1] "Alt test passed: FALSE"
temp <-
timestamp_test %>%
select(last_update) %>%
collect(n=10) %>%
pull()
temp_alt <-
timestamp_test_alt %>%
select(last_update) %>%
collect(n=10) %>%
pull()
# Sys.timezone()
Sys.setenv(TZ="America/New_York")
temp
#> [1] "2005-07-29 15:37:34 EDT" "2005-10-27 16:32:58 EDT"
#> [3] "2005-11-04 07:25:42 EST" "2005-07-21 14:41:18 EDT"
#> [5] "2005-03-08 13:42:52 EST" "2005-10-20 17:12:32 EDT"
#> [7] "2006-01-23 16:03:54 EST" "2005-10-05 13:38:21 EDT"
#> [9] "2005-02-02 13:43:45 EST" "2005-03-07 15:26:47 EST"
temp_alt
#> [1] "2005-07-30 01:37:34 EDT" "2005-10-28 02:32:58 EDT"
#> [3] "2005-11-04 18:25:42 EST" "2005-07-22 00:41:18 EDT"
#> [5] "2005-03-09 00:42:52 EST" "2005-10-21 03:12:32 EDT"
#> [7] "2006-01-24 03:03:54 EST" "2005-10-05 23:38:21 EDT"
#> [9] "2005-02-03 00:43:45 EST" "2005-03-08 02:26:47 EST"
Sys.setenv(TZ="Australia/Sydney")
temp
#> [1] "2005-07-30 05:37:34 AEST" "2005-10-28 06:32:58 AEST"
#> [3] "2005-11-04 23:25:42 AEDT" "2005-07-22 04:41:18 AEST"
#> [5] "2005-03-09 05:42:52 AEDT" "2005-10-21 07:12:32 AEST"
#> [7] "2006-01-24 08:03:54 AEDT" "2005-10-06 03:38:21 AEST"
#> [9] "2005-02-03 05:43:45 AEDT" "2005-03-08 07:26:47 AEDT"
temp_alt
#> [1] "2005-07-30 15:37:34 AEST" "2005-10-28 16:32:58 AEST"
#> [3] "2005-11-05 10:25:42 AEDT" "2005-07-22 14:41:18 AEST"
#> [5] "2005-03-09 16:42:52 AEDT" "2005-10-21 17:12:32 AEST"
#> [7] "2006-01-24 19:03:54 AEDT" "2005-10-06 13:38:21 AEST"
#> [9] "2005-02-03 16:43:45 AEDT" "2005-03-08 18:26:47 AEDT"
Sys.setenv(TZ="UTC")
temp
#> [1] "2005-07-29 19:37:34 UTC" "2005-10-27 20:32:58 UTC"
#> [3] "2005-11-04 12:25:42 UTC" "2005-07-21 18:41:18 UTC"
#> [5] "2005-03-08 18:42:52 UTC" "2005-10-20 21:12:32 UTC"
#> [7] "2006-01-23 21:03:54 UTC" "2005-10-05 17:38:21 UTC"
#> [9] "2005-02-02 18:43:45 UTC" "2005-03-07 20:26:47 UTC"
temp_alt
#> [1] "2005-07-30 05:37:34 UTC" "2005-10-28 06:32:58 UTC"
#> [3] "2005-11-04 23:25:42 UTC" "2005-07-22 04:41:18 UTC"
#> [5] "2005-03-09 05:42:52 UTC" "2005-10-21 07:12:32 UTC"
#> [7] "2006-01-24 08:03:54 UTC" "2005-10-06 03:38:21 UTC"
#> [9] "2005-02-03 05:43:45 UTC" "2005-03-08 07:26:47 UTC"
Created on 2020-04-16 by the reprex package (v0.3.0)
@iangow Yes, I know how hard it is to keep track of everything. Sometimes I confuse myself too... (I shouldn't because it's just been ≤ a year). Probably you want to try Quiver?
I just went through your reprex. So does that mean I can use dtype = {'last_update': DateTime(timezone=True)}
in your NER code now?
@Yvonne-Han
@iangow Yes, I know how hard it is to keep track of everything. Sometimes I confuse myself too... (I shouldn't because it's just been ≤ a year). Probably you want to try Quiver?
I just went through your reprex. So does that mean I can use
dtype = {'last_update': DateTime(timezone=True)}
in your NER code now?
I think so.
I just ran the code from here against the MCCGR server see updated comment above (the earlier code was running against a PostgreSQL server downstairs from me). From the changes, you can see that nothing changes when I use type = {'last_update': DateTime(timezone=True)}
, but everything falls over with type = {'last_update': DateTime(timezone=False)}
.
The problem with remembering the solution is that I have remember who did it, which project, etc. I have some stuff in gist.github.com/iangow
.
Code I ran (where tz_test.py
contains the code here):
igowmbp15:~ igow$ export PGHOST=10.101.13.99
igowmbp15:~ igow$ python3 tz_test.py
igowmbp15:~ igow$ # Ran R code above here.
igowmbp15:~ igow$ psql
psql (12.2, server 11.5 (Ubuntu 11.5-1.pgdg18.04+1))
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
Type "help" for help.
crsp=# DROP TABLE timestamp_test;
DROP TABLE
crsp=# DROP TABLE timestamp_test_alt;
DROP TABLE
crsp=# \q
igowmbp15:~ igow$
Builds on discussion here.
Trying to fix code for preserving
timestamp with time zone
:file_name
withlast_update
from the server