Closed butterflyology closed 5 years ago
Can you please provide a minimal example? I'm not likely to install 40+ packages in order to run your example. I'm also curious whether the issue exists when all those other packages aren't loaded and/or attached.
It looks like there may be an issue with the time-of-day subsetting, since regularized shouldn't have any columns.
Also, it's bad practice to call methods (e.g. merge.xts()) explicitly. There's no guarantee of what it will do if you call merge.xts() with non-xts objects.
Thanks for the reply. You will note that I loaded two packages, tidyverse (which is admittedly rather large and cumbersome) and xts. I guess I thought that two packages was closer to minimal than loading my entire workflow, but I can report that if you just load readr the same result exists.
I'll adjust the merge.xts code to reflect best practices. Thanks again.
tidyverse is a meta-package that loads 40+ other packages. xts only loads itself and zoo.
I think you might be trying to merge the xts with just the index of regularized. Also, try changing the join argument to outer/inner/left/right depending on what you want.
dim(merge.xts(PotC_xts, index(regularized)))
dim(merge.xts(PotC_xts, index(regularized), join='inner'))
dim(merge.xts(PotC_xts, index(regularized), join='right'))
Thank you, @harvey131. If I understand you correctly I still get an error. When I enter the following code:
merged <- merge(PotC_xts, index(regularized), join = "right")
I do get the correct dimensions (399009 1), but there is a time shift (the first index in the xts object is shifted to 16:00:00) and the data are all filled in with NAs. The shift in time happens regardless of whether I "trim" the indices to only operating hours.
If you use join = 'right' it will produce an NA if the time doesn't exist in the 'x' argument PotC_xts. Are you trying to get the prevailing value if the 'y' time doesn't exist in 'x'?
As a separate issue, maybe merge.xts should have documentation added that 'y' can be a vector of POSIXct.
> merged <- merge(PotC_xts, index(regularized), join = "right")
> head(merged, n = 15)
TIME
2012-01-01 13:00:00 NA
2012-01-01 13:05:00 NA
2012-01-01 13:10:00 NA
2012-01-01 13:15:00 NA
2012-01-01 13:20:00 NA
2012-01-01 13:25:00 NA
2012-01-01 13:30:00 NA
2012-01-01 13:35:00 NA
2012-01-01 13:40:00 NA
2012-01-01 13:45:00 NA
2012-01-01 13:50:00 NA
2012-01-01 13:55:00 10
2012-01-01 14:00:00 NA
2012-01-01 14:05:00 NA
2012-01-01 14:10:00 NA
# in this example the timestamps are printed as GMT
> format(index(merged)[1], tz='GMT')
[1] "2012-01-01 13:00:00"
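The "prevailing value" question above can be illustrated with a small sketch. The toy data, the object names (x, grid, merged, filled), and the explicit UTC time zone are assumptions made for reproducibility and are not taken from the original data set. The sketch merges with a zero-width xts grid, which leaves NA wherever no observation exists, and then uses na.locf() as one possible way to carry the last observed value forward.

```r
library(xts)

# Hypothetical toy data: three observations that already sit on a 5-minute grid
obs_times <- as.POSIXct(c("2012-01-01 13:05:00",
                          "2012-01-01 13:20:00",
                          "2012-01-01 13:25:00"), tz = "UTC")
x <- xts(c(10, 20, 30), obs_times)
colnames(x) <- "TIME"

# Zero-width xts object holding the regular 5-minute grid
grid <- xts(, seq(from = as.POSIXct("2012-01-01 13:00:00", tz = "UTC"),
                  to = as.POSIXct("2012-01-01 13:30:00", tz = "UTC"),
                  by = "5 min"))

# Default (outer) merge: grid times without an observation are filled with NA
merged <- merge(x, grid)

# Carry the last observation forward to get the prevailing value
filled <- na.locf(merged)
```

Whether carrying values forward is appropriate depends on the data; for wait times it may be preferable to leave the gaps as NA.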
Thanks @harvey131.
I think that I am doing a poor job of explaining. Thank you for your patience.
Your example looks spot on and just what I want, but I am not getting the same result. I use all of the code in the example above and then change my merge to match yours:
merged <- merge(PotC_xts, index(regularized), join = "right")
head(merged, n = 15)
TIME
2012-01-01 16:00:00 NA
2012-01-01 16:05:00 NA
2012-01-01 16:10:00 NA
2012-01-01 16:15:00 NA
2012-01-01 16:20:00 NA
2012-01-01 16:25:00 NA
2012-01-01 16:30:00 NA
2012-01-01 16:35:00 NA
2012-01-01 16:40:00 NA
2012-01-01 16:45:00 NA
2012-01-01 16:50:00 NA
2012-01-01 16:55:00 NA
2012-01-01 17:00:00 NA
2012-01-01 17:05:00 NA
2012-01-01 17:10:00 NA
Warning message:
timezone of object (UTC) is different than current timezone ().
@butterflyology I think your different result is because the index of PotC_xts and/or regularized has a UTC timezone in your last comment. No timezone was specified in the examples in your other comments, so they were using your local time (US/Pacific).
Maybe your subsequent runs added a non-local timezone to one of the objects?
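A minimal sketch of the timezone effect described above, assuming a single hypothetical observation and the zone name America/Los_Angeles for US Pacific time. It shows that one instant prints as different clock times depending on the time zone used for display, which is the kind of apparent "shift" seen in the two outputs.

```r
library(xts)

# One hypothetical observation, with the time zone stated explicitly
t_utc <- as.POSIXct("2012-01-01 16:00:00", tz = "UTC")
x <- xts(10, t_utc)

# The same instant rendered in two time zones
utc_str <- format(index(x), tz = "UTC")
pacific_str <- format(index(x), tz = "America/Los_Angeles")

utc_str      # "2012-01-01 16:00:00"
pacific_str  # "2012-01-01 08:00:00"
```

Passing tz explicitly to as.POSIXct() for every object being merged is one way to keep the result from depending on the session's local time zone.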
For what it's worth, here's my minimal reproducible example:
library("xts")
PotC <- read.csv("https://raw.githubusercontent.com/butterflyology/TouringPlans_data/master/data/pirates_of_caribbean.csv", as.is = TRUE)
PotC_xts <- xts(PotC$SPOSTMIN, as.POSIXct(PotC$datetime))
colnames(PotC_xts) <- "TIME"
PotC_xts <- align.time(PotC_xts, n = 5 * 60)
# create a zero-width xts object
regularized <- xts(, seq(from = as.POSIXct("2012-01-01 08:00:00"),
to = as.POSIXct("2018-06-18 22:00:00"), by = "5 min"))
regularized <- regularized["T08:00/T22:00"] # constrict to "business hours"
merged <- merge(PotC_xts, regularized)
@harvey131 Thanks for your help with this confusion! I think your suggestion to use join = "right" is correct. Regarding y being POSIXct, I would prefer that behavior remain undocumented. I don't particularly like it, since it's easy to construct an empty xts object instead.
Thank you, @joshuaulrich and @harvey131.
I think that lubridate or something else in the tidyverse is introducing an error into the mix. It seems to assume all dates/times to be UTC regardless of what we tell xts.
@butterflyology, do you think it's safe to say this isn't a bug in xts? Please note that I've investigated the behavior that subsetting a zero-width xts object returns an object with a column of NA and found that it is consistent with zoo... and therefore not a bug.
@joshuaulrich I think that you are correct. Thanks for your assistance.
Thanks @butterflyology, I appreciate that you followed up!
Description
I want to merge two xts objects: one a zero-length object and the second the one that contains the data I want to regularize. When I investigate the merged object, many of the indices from the zero-length object are gone and values are shifted forward to odd times.
Expected behavior
I expect the merged object to contain all of the indices of the regularized object and fill in NAs with data if there is a corresponding value, and otherwise leave NAs in place if no datum with the same index exists.
Minimal, reproducible example
Session Info