joshuaulrich / xts

Extensible time series class that provides uniform handling of many R time series classes by extending zoo.
http://joshuaulrich.github.io/xts/
GNU General Public License v2.0
220 stars 71 forks source link

Ops inexplicably renames column #351

Closed minimenchmuncher closed 1 year ago

minimenchmuncher commented 3 years ago

Description

When multiplying two xts objects together, whose column names were both "0", and the indices were not equal, the result had a different column name, "X0"

Expected behavior

I would expect the column name to stay "0", even though the indexes are different. This would make it consistent with non-numeric-able (ie strings that don't represent numbers) column names.

Minimal, reproducible example

## THIS WORKS AS EXPECTED (equal indices, numeric column names)
a1 <- xts(matrix(c(1,2,3), dimnames = list(NULL, "0")), order.by = as.yearmon(c(2020, 2021, 2022)))
a2 <- xts(matrix(c(4,5,6), dimnames = list(NULL, "0")), order.by = as.yearmon(c(2020, 2021, 2022)))
a2 * a1
#          0
#Jan 2020  4
#Jan 2021 10
#Jan 2022 18

## THIS ALSO WORKS AS EXPECTED (character column names)
a1 <- xts(matrix(c(1,2,3), dimnames = list(NULL, "abc")), order.by = as.yearmon(c(2020, 2021, 2022)))
a2 <- xts(matrix(c(4,5,6), dimnames = list(NULL, "abc")), order.by = as.yearmon(c(2021, 2022, 2023)))
a1 * a2
#         abc
#Jan 2021   8
#Jan 2022  15

## THIS HOWEVER DOES NOT (different indices, numeric column names)
a1 <- xts(matrix(c(1,2,3), dimnames = list(NULL, "0")), order.by = as.yearmon(c(2020, 2021, 2022)))
a2 <- xts(matrix(c(4,5,6), dimnames = list(NULL, "0")), order.by = as.yearmon(c(2021, 2022, 2023)))
a1 * a2
#           X0
#Jan 2021    8
#Jan 2022   15

Best as I can tell, the same is true for any column that can be made into a number (including decimals)

a2 <- xts(matrix(c(4,5,6), dimnames = list(NULL, "1.3")), order.by = as.yearmon(c(2021, 2022, 2023)))
a1 <- xts(matrix(c(1,2,3), dimnames = list(NULL, "1.3")), order.by = as.yearmon(c(2020, 2021, 2022)))
a1 * a2
#         X1.3
#Jan 2021    8
#Jan 2022   15

Session Info

R version 4.0.5 (2021-03-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 21.04

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] xts_0.12.1 zoo_1.8-9 

loaded via a namespace (and not attached):
[1] compiler_4.0.5  tools_4.0.5     grid_4.0.5      lattice_0.20-41
joshuaulrich commented 3 years ago

Thanks for the report! I wouldn't say this is inexplicable. The column names are changed because 0 isn't a valid R object name. This behavior is consistent with data.frame. That said, it's inconsistent with zoo so it should probably be changed in xts.

a1 <- zoo(matrix(c(1,2,3), dimnames = list(NULL, "1.3")), order.by = as.yearmon(c(2020, 2021, 2022)))
a2 <- zoo(matrix(c(4,5,6), dimnames = list(NULL, "1.3")), order.by = as.yearmon(c(2021, 2022, 2023)))
a1*a2
##          1.3
## Jan 2021   8
## Jan 2022  15

I'll have to think about it and do reverse dependency checks though because changing it could very well break existing code.

PS: Please make sure you're using the latest development version of xts before submitting an issue. It's possible the issue has already been fixed. That isn't true in this case, but could be in the future. I don't want you to spend time hunting for something that's already been fixed.

minimenchmuncher commented 3 years ago

You're right on two counts - when you mention about data.frame, this behavior makes a lot more sense. My bad on not sending this report with the development version, will do that for next time.

For my end, I'm going to add an extra layer using make.names() so this doesn't happen in my code.

This actually did come about in my code when I had an S4 class that had zoo as a slot, and I tested using zoo objects and not xts objects. Now as for what zoo is doing, there are other inconsistencies with column naming other than this one it seems. like in the above examples if a1 and a2 are xts as above, cbind(a1, a2) yields column names X0 and X0.1 which again is consistent with how it would be done if they were data.frames, whereas zoo would yield 0.a1 and 0.a2.

joshuaulrich commented 3 years ago

Now as for what zoo is doing, there are other inconsistencies with column naming other than this one it seems.

It would be great if you could list the inconsistencies you find. I can't guarantee anything will change, but at least we'll know where the inconsistencies are and can document them in the package for other users.

minimenchmuncher commented 3 years ago

I can start assembling that list - where would you like it to go? Maybe a new vignette?

Last thing I'll add is part of the issue is inconsistency within xts itself - doing ops on two xtss yielded different results depending on the index- it would change them only if the indices were different. Cursory glance at the code, it looks like the culprit here is lines 35-45 of Ops.xts.R - if the indices are identical, there's a call to NextMethod; if not, it goes to merge.xts, and from there mergeXts (in C).

Perhaps at least this behavior could be made internally consistent?

joshuaulrich commented 3 years ago

I can start assembling that list - where would you like it to go?

Add it as a comment to this issue, to start. We can decide what to do with it from there, once we know what you find.

Perhaps at least this behavior could be made internally consistent?

Agreed. That's what I meant by saying, "it's inconsistent with zoo so it should probably be changed in xts." But I have to see how much it would break existing code before determining how/if to make the change.

minimenchmuncher commented 3 years ago

Got it, will do; will comment here with what I find.

joshuaulrich commented 1 year ago

I am closing because this is a duplicate of #114 (which is now fixed). So your examples now provide the output you expected.