jmsigner / amt


unnest producing a list of class NULL #80

Closed VirginiaMorera closed 1 year ago

VirginiaMorera commented 1 year ago

I am not sure whether this is an amt issue or a tidyr issue, but I have only encountered it with tracks, so let's start here. When working with nested tracks, any calculation (e.g. step_lengths) mapped over the nested tracks creates a new list column alongside the "data" column, with the same number of rows. I am not sure that is the desired outcome; I would rather have the result added to the nested dataset inside the "data" column. That would be fine on its own, since in theory it can be solved by unnesting and nesting again. However, unnest produces a strange list-like object of class NULL. I have found a workaround to turn it back into a data frame, but I then need to call make_track and nest again to get back to the starting point, which is annoying. A reprex follows.

First make the dataset

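library(amt)     # make_track, step_lengths
library(dplyr)   # mutate, %>%
library(tidyr)   # nest, unnest
library(purrr)   # map
library(random)  # randomStrings
test_df <- data.frame(id = rep(c("A", "B", "C"), each = 100), 
                  lon = runif(min = -9, max = -6, n = 300),
                  lat = runif(min = 51, max = 55, n = 300), 
                  covar1 = runif(min = 0, max = 100, n = 300), 
                  covar2 = c(randomStrings(n = 300, len = 3, unique = FALSE, 
                                         upperalpha = TRUE, loweralpha = FALSE, 
                                         digits = FALSE)))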

Now make it a nested track

test_tracks <- test_df %>% 
  make_track(lon, lat, 
             all_cols = TRUE,
             crs = 4326, 
             check_duplicates = FALSE, 
             verbose = TRUE) %>% 
  nest(data = c(-id))

Calculate step lengths

test_tracks2 <- test_tracks %>% 
  mutate(sl = map(data, step_lengths))

This produces a nested df that has an additional "list" column, sl

> str(test_tracks2, 2)
nested_track [3 × 3] (S3: nested_track/tbl_df/tbl/data.frame)
 $ id  : chr [1:3] "A" "B" "C"
 $ data:List of 3
 $ sl  :List of 3

Unnest

test_tracks3 <- test_tracks2 %>% 
  unnest(cols = c(data, sl))

This produces a list, but of class NULL

> class(test_tracks3)
[1] "NULL"
> str(test_tracks3)
List of 6
 $ id    : chr [1:300] "A" "A" "A" "A" ...
 $ x_    : num [1:300] -7.67 -7.96 -8.39 -7.26 -8.49 ...
 $ y_    : num [1:300] 54.3 53.6 54.8 51.9 52.2 ...
 $ covar1: num [1:300] 55.028 0.772 76.037 64.362 34.512 ...
 $ covar2: chr [1:300] "KRU" "RJL" "RVG" "BUW" ...
 $ sl    : num [1:300] 0.718 1.28 3.087 1.261 0.648 ...
 - attr(*, "class")= chr "NULL"
 - attr(*, "row.names")= int [1:300] 1 2 3 4 5 6 7 8 9 10 ...

I can turn this list back into a regular data frame with do.call, but dplyr and tibble methods do not work on it:

> x <- as.data.frame(do.call(cbind, test_tracks3))
> str(x)
'data.frame':   300 obs. of  6 variables:
 $ id    : chr  "A" "A" "A" "A" ...
 $ x_    : chr  "-7.67035690904595" "-7.96027435711585" "-8.38539077946916" "-7.25699410191737" ...
 $ y_    : chr  "54.2703927513212" "53.6132442755625" "54.8205336350948" "51.9466292150319" ...
 $ covar1: chr  "55.0277820788324" "0.772069441154599" "76.0366528760642" "64.3617564812303" ...
 $ covar2: chr  "KRU" "RJL" "RVG" "BUW" ...
 $ sl    : chr  "0.718259177377767" "1.27994983112401" "3.08749181012144" "1.26120128657495" ...

However, it is now a plain data frame, and every variable has been coerced to character (chr), which is not ideal.
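The character coercion comes from cbind rather than from the track itself: cbind builds a matrix, and a matrix can only hold one type. The sketch below reproduces this on a small hand-built stand-in object (the name broken and its values are made up for illustration, and it is only an assumption that the real unnested object behaves the same way). Dropping the class attribute with unclass before converting keeps the column types.

```r
# Toy stand-in for the unnested object: a named list of equal-length
# columns whose class attribute is the string "NULL" (built by hand here).
broken <- structure(
  list(id = c("A", "A", "B"),
       x_ = c(-7.67, -7.96, -8.39),
       sl = c(0.718, 1.280, NA)),
  class = "NULL",
  row.names = 1:3
)

# cbind() returns a matrix, which holds a single type, so mixing
# character and numeric columns coerces everything to character.
via_cbind <- as.data.frame(do.call(cbind, broken))
sapply(via_cbind, class)

# unclass() removes the bogus class and leaves a plain list;
# as.data.frame() then converts column by column and keeps each type.
via_unclass <- as.data.frame(unclass(broken))
sapply(via_unclass, class)
```

This only recovers a plain data frame with correct types; it does not restore the track class.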

Any idea why this is happening and how to fix it?

jmsigner commented 1 year ago

I usually add a column in the map call. Something like this:

test_tracks %>% 
  mutate(data = map(data, ~ .x %>% mutate(sl = step_lengths(.)))) %>% 
  unnest(cols = data)

# Or with native pipes

test_tracks |> 
  mutate(data = map(data, \(x) mutate(x, sl = step_lengths(x)))) |> 
  unnest(cols = data)

A function like add_sl() would be nice, but I have not found time to implement it.
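For illustration, a minimal sketch of what such a helper could look like. add_sl() here is hypothetical and not part of amt; it just wraps the mutate call from the snippets above. The small test_df is made up so that the example does not depend on the random package.

```r
library(amt)
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical helper (not an amt function): append step lengths as a
# column `sl`, passing any extra arguments on to step_lengths().
add_sl <- function(trk, ...) {
  mutate(trk, sl = step_lengths(trk, ...))
}

# Small made-up dataset with two individuals.
set.seed(1)
test_df <- data.frame(id  = rep(c("A", "B"), each = 5),
                      lon = runif(10, min = -9, max = -6),
                      lat = runif(10, min = 51, max = 55))

test_tracks <- test_df %>%
  make_track(lon, lat, all_cols = TRUE, crs = 4326) %>%
  nest(data = -id)

# Add `sl` inside each nested track, so only `data` needs unnesting.
with_sl <- test_tracks %>%
  mutate(data = map(data, add_sl))

result <- with_sl %>%
  unnest(cols = data)
```

Because sl lives inside the data column, the unnest call only has to handle a single list column.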