tidyverts / tsibble

Tidy Temporal Data Frames and Tools
https://tsibble.tidyverts.org
GNU General Public License v3.0

Select drops key columns when key is unique #249

Closed Fuco1 closed 3 years ago

Fuco1 commented 3 years ago

If a tsibble has two series (key combination results in two series) and I select a data column, the key column is also selected.

However, if I first filter and only one series is present in the result, the key column is dropped.

It would be nice to have an argument to always select key columns such that generic code can be written against the filtered (or non-filtered) subset.

library(dplyr)
library(tsibble)

bind_rows(
  tibble(date = 1:12, group = "a", value = 10, other = 1),
  tibble(date = 1:12, group = "b", value = 12, other = 4)
) %>%
  as_tsibble(index = date, key = group) %>%
  select(value)
# A tsibble: 24 x 3 [1]
# Key:       group [2]
   value  date group
   <dbl> <int> <chr>
 1    10     1 a    
 2    10     2 a    
 3    10     3 a    
 4    10     4 a    
 5    10     5 a    
 6    10     6 a    
 7    10     7 a    
 8    10     8 a    
 9    10     9 a    
10    10    10 a

Filtered case:

bind_rows(
  tibble(date = 1:12, group = "a", value = 10, other = 1),
  tibble(date = 1:12, group = "b", value = 12, other = 4)
) %>%
  as_tsibble(index = date, key = group) %>%
  filter(group == "a") %>%
  select(value)
# A tsibble: 12 x 2 [1]
   value  date
   <dbl> <int>
 1    10     1
 2    10     2
 3    10     3
 4    10     4
 5    10     5
 6    10     6
 7    10     7
 8    10     8
 9    10     9
10    10    10
11    10    11
12    10    12

earowang commented 3 years ago

hmm, it would be a breaking change. For robustness, I'll select() both key and value columns for either case.
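
One way to read this suggestion is to name the key column explicitly in select(), so the result has the same columns whether or not the data was filtered first. A minimal sketch using the data from the report (the names ts, both_series and one_series are only for illustration):

```r
library(dplyr)
library(tsibble)

# Same example data as in the original report.
ts <- bind_rows(
  tibble(date = 1:12, group = "a", value = 10, other = 1),
  tibble(date = 1:12, group = "b", value = 12, other = 4)
) %>%
  as_tsibble(index = date, key = group)

# Ask for the key column explicitly in both cases.
both_series <- ts %>% select(group, value)
one_series  <- ts %>% filter(group == "a") %>% select(group, value)

# Both results now contain the key column.
"group" %in% names(both_series)
"group" %in% names(one_series)
```

Because group is requested by name, it does not depend on whether select() decides the key is still needed.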

Fuco1 commented 3 years ago

What I had in mind with "It would be nice to have an argument to always select key columns [...]" was something like select(data, .with_keys = TRUE). It can default to FALSE so the current behaviour is preserved. I suppose it's not a difficult change, since there must already be some logic that drops the column. Should I have a stab at implementing this?
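
For illustration only, the proposed behaviour can be approximated in user code without changing tsibble. The sketch below assumes a hypothetical wrapper named select_with_keys(); neither it nor the .with_keys argument exists in tsibble. It relies on key_vars() to look up the key column names.

```r
library(dplyr)
library(tsibble)

# Hypothetical helper, not part of tsibble: mimics the proposed
# `.with_keys` argument by adding the key columns to the selection.
select_with_keys <- function(data, ..., .with_keys = FALSE) {
  if (.with_keys) {
    # key_vars() returns the key column names as a character vector.
    select(data, all_of(key_vars(data)), ...)
  } else {
    select(data, ...)
  }
}

ts <- bind_rows(
  tibble(date = 1:12, group = "a", value = 10, other = 1),
  tibble(date = 1:12, group = "b", value = 12, other = 4)
) %>%
  as_tsibble(index = date, key = group)

# Filtered case from the report, now with the key column requested.
filtered <- ts %>%
  filter(group == "a") %>%
  select_with_keys(value, .with_keys = TRUE)

names(filtered)
```

With .with_keys left at FALSE the wrapper simply forwards to select(), so existing behaviour is untouched.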

earowang commented 3 years ago

The issue with this new argument is that it will be ignored when selecting multiple keys. It will only take effect on a single key.