mablab / sftrack

sftrack: Modern classes for tracking and movement data
https://mablab.org/sftrack/

[Use Case] Major functionality issue: converting data to `sftrack` breaks functionality of parent class `sf` or `sfc` #39

Closed rgzn closed 2 years ago

rgzn commented 2 years ago

I process animal tracking data using sf and just have an extra column for time with entries of type <dttm>. This gives me a ton of functions associated with spatial data, and tidyverse data processing methods.

I am looking at sftrack because it would be convenient to have built-in track metrics that treat the time as integral to the data, much as sf treats the geometry.

However, I've found that it is not worth it because converting objects to sftrack breaks almost all the functions associated with sf objects, including dplyr verbs and mapview.

Because sftrack inherits from sf, I don't see why it shouldn't work with all of these, but it appears that it has something to do with the sftrack idea of "groups". I don't really understand this or how it's advantageous over dplyr groups.

Examples:

> racc_track %>% dplyr::select(height)
Error in UseMethod("group_labels", object = x) : 
  no applicable method for 'group_labels' applied to an object of class "NULL"
> racc_track %>% mapview::mapview()
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘mapView’ for signature ‘"sftrack"’

basille commented 2 years ago

Thanks @rgzn for your report, and for your interest in sftrack. This is unfortunately a known (and long-standing) issue. Let me clarify briefly a few points:

Now the rationale behind having our own grouping class is that tracking data involve a very specific grouping: at least the ID of the individual, and possibly other groups (period of time, behavior types, devices, etc.) according to the user's needs, plus an active grouping. The grouping structure of dplyr objects would not allow this fine-grained grouping, so we decided not to use it (which leads to the issue you raised). sftrack groupings are explained here.

Altogether, we'd be happy to consider PRs about compatibility with dplyr.

Bevann commented 2 years ago

Seconding the usefulness of being able to convert an sftrack/sftraj to an sf object. While the details of the sftraj/sftrack are great for analysis, being able to extract the linestrings or points as an sf object would increase the options for displaying the data in things like shiny apps. All of my attempts to convert it to a simple sf object have so far been unsuccessful.

henningte commented 2 years ago

@Bevann , you could use the sftime package to convert sftrack and sftraj objects to sftime objects (using st_as_sftime()) which do not break any sf functionality.

sftime objects are just sf objects where one column is defined as storing time information. sftime is not meant to provide the dedicated trajectory analysis functions from sftrack, but just a simple class to store irregular spatiotemporal data.

@basille , perhaps it could be useful to have a conversion method sftime -> sftrack/sftraj?

Here are the examples from @rgzn :

library(sftrack)
library(sftime)
#> Loading required package: sf
#> Linking to GEOS 3.9.1, GDAL 3.3.2, PROJ 7.2.1; sf_use_s2() is TRUE
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(mapview)

# get an sftrack object
data("raccoon", package = "sftrack")

raccoon$timestamp <- as.POSIXct(raccoon$timestamp, "EST")

burstz <- 
  list(id = raccoon$animal_id, month = as.POSIXlt(raccoon$timestamp)$mon)

raccoon_sftraj <- 
  as_sftraj(raccoon,
            time = "timestamp",
            error = NA, coords = c("longitude", "latitude"),
            group = burstz
  )

# convert to sftime
raccoon_sftime <- sftime::st_as_sftime(raccoon_sftraj)

# dplyr::select(). This drops the time column and returns an sf object since the time column is not sticky. To keep it, 
# explicitly do `raccoon_sftime %>% dplyr::select(height, timestamp)`
raccoon_sftime %>% dplyr::select(height)
#> Simple feature collection with 445 features and 1 field (with 168 geometries empty)
#> Geometry type: GEOMETRY
#> Dimension:     XY
#> Bounding box:  xmin: -80.28149 ymin: 26.06761 xmax: -80.27046 ymax: 26.07706
#> CRS:           NA
#> First 10 features:
#>    height                       geometry
#> 1      NA                    POINT EMPTY
#> 2       7     POINT (-80.27906 26.06945)
#> 3      NA                    POINT EMPTY
#> 4      NA                    POINT EMPTY
#> 5     858 LINESTRING (-80.27431 26.06...
#> 6     350 LINESTRING (-80.2793 26.068...
#> 7      11 LINESTRING (-80.27908 26.06...
#> 8       9     POINT (-80.27902 26.06963)
#> 9      NA                    POINT EMPTY
#> 10     NA LINESTRING (-80.279 26.0698...

# mapview::mapview(): still requires a workaround since there is no 
# dedicated mapview method for sftime objects. But it works.
raccoon_sf <- raccoon_sftime
class(raccoon_sf) <- setdiff(class(raccoon_sf), "sftime")
raccoon_sf %>% mapview::mapview()
#> Warning message:
#> In clean_columns(as.data.frame(obj), factorsAsCharacter) :
#>  Dropping column(s) sft_group of class(es) c_grouping

Created on 2022-05-12 by the reprex package (v2.0.1)

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.0 (2022-04-22 ucrt)
#>  os       Windows 10 x64 (build 22000)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  German_Germany.utf8
#>  ctype    German_Germany.utf8
#>  tz       Europe/Berlin
#>  date     2022-05-12
#>  pandoc   2.17.1.1 @ C:/Program Files/RStudio/bin/quarto/bin/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date (UTC) lib source
#>  base64enc     0.1-3      2015-07-28 [1] CRAN (R 4.2.0)
#>  class         7.3-20     2022-01-16 [2] CRAN (R 4.2.0)
#>  classInt      0.4-3      2020-04-07 [1] CRAN (R 4.2.0)
#>  cli           3.3.0      2022-04-25 [1] CRAN (R 4.2.0)
#>  codetools     0.2-18     2020-11-04 [2] CRAN (R 4.2.0)
#>  colorspace    2.0-3      2022-02-21 [1] CRAN (R 4.2.0)
#>  crayon        1.5.1      2022-03-26 [1] CRAN (R 4.2.0)
#>  crosstalk     1.2.0      2021-11-04 [1] CRAN (R 4.2.0)
#>  DBI           1.1.2      2021-12-20 [1] CRAN (R 4.2.0)
#>  digest        0.6.29     2021-12-01 [1] CRAN (R 4.2.0)
#>  dplyr       * 1.0.9      2022-04-28 [1] CRAN (R 4.2.0)
#>  e1071         1.7-9      2021-09-16 [1] CRAN (R 4.2.0)
#>  ellipsis      0.3.2      2021-04-29 [1] CRAN (R 4.2.0)
#>  evaluate      0.15       2022-02-18 [1] CRAN (R 4.2.0)
#>  fansi         1.0.3      2022-03-24 [1] CRAN (R 4.2.0)
#>  fastmap       1.1.0      2021-01-25 [1] CRAN (R 4.2.0)
#>  fs            1.5.2      2021-12-08 [1] CRAN (R 4.2.0)
#>  generics      0.1.2      2022-01-31 [1] CRAN (R 4.2.0)
#>  glue          1.6.2      2022-02-24 [1] CRAN (R 4.2.0)
#>  highr         0.9        2021-04-16 [1] CRAN (R 4.2.0)
#>  htmltools     0.5.2      2021-08-25 [1] CRAN (R 4.2.0)
#>  htmlwidgets   1.5.4      2021-09-08 [1] CRAN (R 4.2.0)
#>  KernSmooth    2.23-20    2021-05-03 [2] CRAN (R 4.2.0)
#>  knitr         1.39       2022-04-26 [1] CRAN (R 4.2.0)
#>  lattice       0.20-45    2021-09-22 [2] CRAN (R 4.2.0)
#>  leafem        0.2.0      2022-04-16 [1] CRAN (R 4.2.0)
#>  leaflet       2.1.1      2022-03-23 [1] CRAN (R 4.2.0)
#>  lifecycle     1.0.1      2021-09-24 [1] CRAN (R 4.2.0)
#>  magrittr      2.0.3      2022-03-30 [1] CRAN (R 4.2.0)
#>  mapview     * 2.11.0     2022-04-16 [1] CRAN (R 4.2.0)
#>  munsell       0.5.0      2018-06-12 [1] CRAN (R 4.2.0)
#>  pillar        1.7.0      2022-02-01 [1] CRAN (R 4.2.0)
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.2.0)
#>  png           0.1-7      2013-12-03 [1] CRAN (R 4.2.0)
#>  proxy         0.4-26     2021-06-07 [1] CRAN (R 4.2.0)
#>  purrr         0.3.4      2020-04-17 [1] CRAN (R 4.2.0)
#>  R.cache       0.15.0     2021-04-30 [1] CRAN (R 4.2.0)
#>  R.methodsS3   1.8.1      2020-08-26 [1] CRAN (R 4.2.0)
#>  R.oo          1.24.0     2020-08-26 [1] CRAN (R 4.2.0)
#>  R.utils       2.11.0     2021-09-26 [1] CRAN (R 4.2.0)
#>  R6            2.5.1      2021-08-19 [1] CRAN (R 4.2.0)
#>  raster        3.5-15     2022-01-22 [1] CRAN (R 4.2.0)
#>  Rcpp          1.0.8.3    2022-03-17 [1] CRAN (R 4.2.0)
#>  reprex        2.0.1      2021-08-05 [1] CRAN (R 4.2.0)
#>  rlang         1.0.2      2022-03-04 [1] CRAN (R 4.2.0)
#>  rmarkdown     2.14       2022-04-25 [1] CRAN (R 4.2.0)
#>  rstudioapi    0.13       2020-11-12 [1] CRAN (R 4.2.0)
#>  satellite     1.0.4      2021-10-12 [1] CRAN (R 4.2.0)
#>  scales        1.2.0      2022-04-13 [1] CRAN (R 4.2.0)
#>  sessioninfo   1.2.2      2021-12-06 [1] CRAN (R 4.2.0)
#>  sf          * 1.0-7      2022-03-07 [1] CRAN (R 4.2.0)
#>  sftime      * 0.2.0.9000 2022-05-09 [1] local
#>  sftrack     * 0.5.3      2022-05-12 [1] Github (mablab/sftrack@b36031f)
#>  sp            1.4-7      2022-04-20 [1] CRAN (R 4.2.0)
#>  stringi       1.7.6      2021-11-29 [1] CRAN (R 4.2.0)
#>  stringr       1.4.0      2019-02-10 [1] CRAN (R 4.2.0)
#>  styler        1.7.0      2022-03-13 [1] CRAN (R 4.2.0)
#>  terra         1.5-21     2022-02-17 [1] CRAN (R 4.2.0)
#>  tibble        3.1.7      2022-05-03 [1] CRAN (R 4.2.0)
#>  tidyselect    1.1.2      2022-02-21 [1] CRAN (R 4.2.0)
#>  units         0.8-0      2022-02-05 [1] CRAN (R 4.2.0)
#>  utf8          1.2.2      2021-07-24 [1] CRAN (R 4.2.0)
#>  vctrs         0.4.1      2022-04-13 [1] CRAN (R 4.2.0)
#>  webshot       0.5.3      2022-04-14 [1] CRAN (R 4.2.0)
#>  withr         2.5.0      2022-03-03 [1] CRAN (R 4.2.0)
#>  xfun          0.30       2022-03-02 [1] CRAN (R 4.2.0)
#>  yaml          2.3.5      2022-02-21 [1] CRAN (R 4.2.0)
#> 
#>  [1] C:/Users/henni/AppData/Local/R/win-library/4.2
#>  [2] C:/Program Files/R/R-4.2.0/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────
```

basille commented 2 years ago

Thank you both @Bevann and @henningte for the feedback. I'd like to clarify one thing, because I feel there are really two separate issues, related to sf and dplyr respectively. This issue started with the first one, i.e. the functionality of the parent class sf or sfc. As a matter of fact, sftrack objects are also sf objects, similarly to sftime. So while the grouping of sftrack objects breaks compatibility with dplyr, I'm not sure there's an issue with sf per se. Could you provide more details (with examples) on how sftrack objects don't behave as sf/sfc objects (if necessary, by calling the geometry column of an sftrack directly, or by dropping the grouping column)?

Now I totally hear the interest in making sftrack compatible with dplyr. This is at the top of the priority list for sftrack, but unfortunately there are no resources available at the moment to work on it. PRs are absolutely welcome.

Bevann commented 2 years ago

In my case I have a shiny app that shows the locations of some collared animals in an interactive mapview/leaflet map. I would like to include lines connecting the points, along with the step length and movement rate. I was leaning towards using as_sftraj for this, since it is faster and more concise than transforming the points to linestrings and then splitting them for step length etc. But when I try to pass the result back to my renderMapview call, they do not play nicely together. I have tried dropping the grouping list column, but whatever method I use it remains, and the class of the object is not sf, so mapview doesn't like it. A simple way to convert sftrack/sftraj objects back to a bare-bones sf object would be useful for displaying the data in a wider range of formats.

rgzn commented 2 years ago

@Bevann I think uses like yours are very common.

I wonder if a package that deals with sf or sftime objects and computes track stats and such on top of those objects would be good? Basically something like {amt} but updated to use sftime as the native underlying data?

henningte commented 2 years ago

@basille , you're right that there are two separate problems.

My "solution" was meant as a quick workaround for the current situation, where these methods are missing. The mapview example even showed that 'sftime' also "breaks" sf functionality, since there is no dedicated mapview method for sftime objects yet ('sftime' is quite recent).

I think having a conversion method from sftime to sftrack/sftraj could be useful to exploit the complementary aspects of both packages. But I agree that this is a separate issue --- apart from the quick workaround 'sftime' would offer here since 'tidyverse' methods are already implemented for sftime objects and sftrack/sftraj -> sftime conversion methods already exist.

@rgzn , I agree in principle. One problem is that 'sftime' is younger than 'sftrack' (otherwise, sftrack and sftraj would perhaps have been subclasses of sftime). The other problem is that subclassing sftime objects would not solve the issues because one would nevertheless need dedicated sftrack and sftraj methods for 'tidyverse' functions and 'mapview'.

For 'tidyverse' functions, the problem is (as can be seen from the error message in your dplyr::select() example) that you need a wrapper for each 'tidyverse' function to ensure that after the operation the object still is a valid sftrack/sftraj object (and I think for the group column you additionally need methods for some 'vctrs' generics).

In your dplyr::select() example, the wrapper would either need to ensure that the result is simplified to an sf object if the group column is dropped or it would need to ensure a sticky group column. But this would also be the case if sftrack/sftraj were subclasses of sftime --- you nevertheless need the dedicated wrappers for sftrack and sftraj.
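
To make the two options concrete, here is a minimal base-R sketch. It does not use dplyr or sftrack: `toy_track`, `new_toy_track()` and `select_cols()` are hypothetical names invented for illustration, and `select_cols()` only stands in for a column-selecting verb such as dplyr::select(). The real sftrack/sftraj wrappers would also have to handle the time and geometry columns and the group attributes.

```r
# Hypothetical toy class: a data.frame that must always carry a "group" column.
new_toy_track <- function(df) {
  stopifnot(is.data.frame(df), "group" %in% names(df))
  structure(df, class = c("toy_track", class(df)))
}

# Hypothetical stand-in for a column-selecting verb such as dplyr::select().
select_cols <- function(x, cols, ...) UseMethod("select_cols")

select_cols.data.frame <- function(x, cols, ...) {
  x[, cols, drop = FALSE]
}

# The wrapper for the subclass decides what happens to the group column.
select_cols.toy_track <- function(x, cols, sticky = TRUE, ...) {
  # Option "sticky": always carry the group column along.
  if (sticky) cols <- union(cols, "group")

  # Run the parent-class method on a copy without the subclass.
  parent <- x
  class(parent) <- setdiff(class(parent), "toy_track")
  out <- select_cols(parent, cols)

  # Option "simplify": only restore the subclass if the result is still valid.
  if ("group" %in% names(out)) {
    class(out) <- c("toy_track", class(out))
  }
  out
}

tr <- new_toy_track(data.frame(group = c("a", "b"), height = c(7, 9)))
kept <- select_cols(tr, "height")                     # group column kept, still a toy_track
dropped <- select_cols(tr, "height", sticky = FALSE)  # plain data.frame without group
```

Either way, the point is that the subclass needs its own method for every verb; the parent-class method alone cannot know that the group column is special.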

For mapview::mapview(), I think the problem is that these are S4 methods and therefore typical S3 method dispatch does not work.
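
A small sketch of that dispatch difference, using only the methods package. The generic `show_map()` and the classes `toy_parent` and `toy_child` are hypothetical stand-ins (they are not mapview's or sftrack's actual code): S4 dispatch only sees the first element of an S3 class vector unless the inheritance chain has been registered with setOldClass().

```r
library(methods)

# Hypothetical S4 generic standing in for an S4 generic such as mapview's mapView().
setGeneric("show_map", function(x, ...) standardGeneric("show_map"))

# Register the parent S3 class so that it can be used in an S4 signature.
setOldClass("toy_parent")
setMethod("show_map", "toy_parent", function(x, ...) "handled by the toy_parent method")

# An S3 object whose class vector lists toy_parent as its parent class.
obj <- structure(list(), class = c("toy_child", "toy_parent"))

# S3 dispatch would fall through to toy_parent, but S4 does not know that
# toy_child extends toy_parent, so this lookup is expected to fail.
before <- tryCatch(show_map(obj), error = function(e) "no method found")

# Registering the full S3 inheritance chain tells S4 about the relationship.
setOldClass(c("toy_child", "toy_parent"))
after <- show_map(obj)
```

Whether registering the sftrack/sftraj class chain in this way would be enough for mapview has not been tested here; it is only meant to show why an S3 subclass of sf is not automatically picked up by an S4 method written for sf.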

basille commented 2 years ago

Thanks again @Bevann, @rgzn and @henningte for the feedback. My apologies for the long delay, I just couldn't spend any time on sftrack since your last messages.

As we agree that compatibility with dplyr is a different issue (which involves a lot of work on our end to ensure that many dplyr functions work with sftrack/sftraj objects), let's focus on the sf issue that was raised here. I played a bit with sftrack/sftraj objects, and I can confirm that they are indeed sf objects (which they are by design and construction). The fact that mapview does not handle sftrack/sftraj objects is the issue here, but there are easy workarounds, as was alluded to before.

With examples, let's first build sftrack/sftraj objects:

library("sftrack")
data(raccoon)
raccoon$timestamp <- as.POSIXct(as.POSIXlt(raccoon$timestamp, tz = "EST5EDT"))
my_sftrack <- as_sftrack(
  data = raccoon,
  coords = c("longitude","latitude"),
  time = "timestamp",
  group = "animal_id",
  crs = "+init=epsg:4326")
my_sftraj <- as_sftraj(my_sftrack)

And try to see how it goes with mapview, starting with sftrack objects:

mapview::mapview(my_sftrack)
## Error in (function (classes, fdef, mtable)  :
## unable to find an inherited method for function ‘mapView’ for signature ‘"sftrack"’
mapview::mapview(my_sftrack$geometry)
## Works!

and sftraj objects:

mapview::mapview(my_sftraj)
## Error in (function (classes, fdef, mtable)  :
##   unable to find an inherited method for function ‘mapView’ for signature ‘"sftraj"’
mapview::mapview(my_sftraj$geometry)
## Works!

Alternatively, it's also possible to drop the sftrack/sftraj class, and that works too:

class(my_sftraj) <- setdiff(class(my_sftraj), "sftraj")
mapview::mapview(my_sftraj)
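
For repeated use (e.g. in a shiny app), the class-dropping workaround can be wrapped in a small helper that returns a modified copy instead of changing the original object. This is only a sketch: `drop_track_class()` is a hypothetical name, not a function from sftrack, and the demonstration uses a stand-in data.frame that merely carries the same class vector, so that it runs without sf or sftrack installed. It does not remove the grouping column or any sftrack-specific attributes.

```r
# Hypothetical helper: strip the sftrack/sftraj classes, keep everything else.
drop_track_class <- function(x, classes = c("sftrack", "sftraj")) {
  class(x) <- setdiff(class(x), classes)
  x
}

# Stand-in object with the class vector of an sftraj (no real geometry here).
toy <- structure(data.frame(height = c(7, 9)),
                 class = c("sftraj", "sf", "data.frame"))

toy_sf <- drop_track_class(toy)
class(toy_sf)  # the remaining classes, without "sftraj"
class(toy)     # the original object is left unchanged
```

With a real object, the call would then look like `mapview::mapview(drop_track_class(my_sftraj))`.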

All this said, direct compatibility with mapview might be easier to achieve than dplyr, but it would be better handled from mapview itself. Finally, the development version of sftime now includes coercion from sftrack/sftraj objects! Please check it out:

library("sftime")
my_sftime <- st_as_sftime(my_sftraj)
plot(my_sftime)

I'm closing this issue for now, but feel free to discuss further or reopen if you have additional issues.