Closed PMassicotte closed 6 years ago
This is likely due to the timezone attribute on the date object and your local timezone setting.
readr always writes datetimes in ISO8601 format with a UTC timezone, so if your datetime object is not actually in UTC it will be converted to UTC before writing. Running attributes(df[[2]]) should tell you the timezone of the original data.
However, your example is not reproducible because I do not have the original dataset.
@jimhester is there a way to attach binary data to an issue? Meanwhile, I updated my question with your suggestion.
A tzone attribute of "" means that the object uses whatever timezone your R session is set to use.
However, as I said, readr always writes files explicitly in the UTC timezone, so your data is converted to UTC before being written. When the file is read back, the values are parsed as UTC, so you see the converted clock time.
See the difference between x and y here. x is using the system timezone (in my case eastern US timezone), y is explicitly set to UTC.
Sys.timezone()
#> [1] "America/New_York"
x <- as.POSIXct("2015-04-03 21:26:01 UTC")
format(x, tz = "UTC")
#> [1] "2015-04-04 01:26:01"
attr(x, "tzone")
#> [1] ""
y <- strptime("2015-04-03 21:26:01", format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
format(y, tz = "UTC")
#> [1] "2015-04-03 21:26:01"
attr(y, "tzone")
#> [1] "UTC"
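If the stored clock times really are meant to be UTC, one possible fix is to state the timezone when parsing, or to re-label an existing local-time object. The following is only a minimal base R sketch: the variable names are illustrative, and Sys.setenv(TZ = ...) is there just to mimic the session above (it assumes the platform honours the TZ variable).

```r
# Mimic the session above; assumes the platform honours the TZ variable.
Sys.setenv(TZ = "America/New_York")

# Parsed without tz: the text is interpreted as local (New York) time.
x <- as.POSIXct("2015-04-03 21:26:01")

# Option 1: state the timezone when parsing.
x_utc <- as.POSIXct("2015-04-03 21:26:01", tz = "UTC")

# Option 2: keep the clock time of an existing object but re-label it as UTC,
# by formatting it in its own timezone and parsing that text again.
x_relabelled <- as.POSIXct(format(x, "%Y-%m-%d %H:%M:%S"), tz = "UTC")

format(x, tz = "UTC")            # shifted by the local UTC offset
format(x_utc, tz = "UTC")        # clock time unchanged
format(x_relabelled, tz = "UTC") # clock time unchanged
```

Option 2 changes the instant the object refers to, so it is only appropriate when the original clock times were UTC all along and were merely parsed as local time.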
Thank you @jimhester. Is it not a bit dangerous that this happens silently? Is this documented? I could not find any information with a quick Google search.
In ?write_csv it states:
POSIXct's are formatted as ISO8601
Perhaps that should be extended to:
POSIXct's are formatted as ISO8601 in UTC timezone
Also, if you look at the data being written to the file, you will see it looks like '2015-04-04T01:26:01Z'. The Z indicates a zero timezone offset, i.e. the UTC timezone.
The real issue here is how you got the data into R originally. While the column name claims these are UTC datetimes, they are actually local times.
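To make the meaning of the trailing Z concrete, here is a small base R sketch that renders a UTC datetime in that ISO8601 shape and parses it back. It uses format() and as.POSIXct() only; it is not readr's own implementation, just an illustration of the string format.

```r
# A datetime that is explicitly in UTC, like y in the example above.
y <- as.POSIXct("2015-04-03 21:26:01", tz = "UTC")

# Render as ISO8601 with a literal "Z" suffix. The Z is only truthful
# because the value is formatted in UTC (tz = "UTC").
iso <- format(y, "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
iso

# Parse the string back, again stating UTC explicitly.
back <- as.POSIXct(iso, format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
identical(as.numeric(back), as.numeric(y))
```

Because the Z pins the value to UTC, any reader of the file recovers the same instant regardless of its own session timezone.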
Thank you very much @jimhester for your time. You can close this issue if you think no further action is needed. Regards, Phil
Thank you for opening the issue, I have added additional text to clarify what happens if you supply write functions with non-UTC datetime objects.
this was a helpful discussion, i ran into the same problem just now. thanks!
I am facing strange behaviour when saving data containing dates with write_csv(). The following example shows that the date changes after saving the data with write_csv.
Here the first date is set to 2015-04-03 21:26:01
If I save the data using write_csv and re-read it, we see that the date time has changed. 2015-04-03 21:26:01 -> 2015-04-04 01:26:01
The problem is avoided when using write.csv instead of write_csv. 2015-04-03 21:26:01 stays 2015-04-03 21:26:01.
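The original dataset is not available in this thread, so the following is only a hedged sketch of the round trip described above, not the poster's code. The data frame df, its column names, and the New York session timezone are assumptions chosen to match the discussion; it requires the readr package.

```r
library(readr)

# Assumed session timezone, matching the example discussed in the thread.
Sys.setenv(TZ = "America/New_York")

# Hypothetical one-row dataset: the datetime is parsed as local time.
df <- data.frame(
  id = 1L,
  datetime = as.POSIXct("2015-04-03 21:26:01")
)

path <- tempfile(fileext = ".csv")
write_csv(df, path)

# Inspect the raw text readr wrote (ISO8601, converted to UTC).
readLines(path)

# Read it back with explicit column types; the datetime comes back in UTC.
round_trip <- read_csv(
  path,
  col_types = cols(id = col_integer(), datetime = col_datetime())
)
attr(round_trip$datetime, "tzone")
format(round_trip$datetime)

# The instant itself is preserved; only the displayed clock time differs.
as.numeric(round_trip$datetime) == as.numeric(df$datetime)
```

The last comparison is the key point: the file stores the same moment in time, and the apparent change comes from displaying it in UTC rather than in the session's local timezone.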