ropensci / spatsoc

:package: spatsoc is an R package for detecting spatial and temporal groups in GPS relocations.
https://docs.ropensci.org/spatsoc
GNU General Public License v3.0

spatial grouping question #28

Closed echinorhinus closed 4 years ago

echinorhinus commented 4 years ago

I have a couple of questions regarding spatsoc. I’m trying to use it to look at whether or not I can run a social network analysis from shark acoustic detection data.

The first question is: When running my own data, I get a warning/error message when running the spatial grouping step (group_pts). The message is as follows:

Warning message: In group_pts(DT1, threshold = 50, id = "ID", coords = c("X", "Y"), : found duplicate id in a timegroup and/or splitBy - does your group_times threshold match the fix rate?

I noticed that if I change the time threshold for the group_times step to 24 hours, I generate the same warning when using the Newfoundland Bog Cow data.

How do I choose a time threshold that works for my data? Even if I set the time threshold to NULL, I still get the same warning message. So far, the only workaround I have found is to filter my data to just one detection at any receiver/station per animal, per day, and then set my time threshold to “1 day”. However, this reduces a 10 year dataset from 872644 data points to just 5824, which I assume is going to significantly diminish the resolution of the data, and the ability of the model to identify social relationships.

Any help/pointers would be greatly appreciated.

robitalec commented 4 years ago

Hello @echinorhinus! Thanks for reaching out with your question.

The selection of thresholds is discussed more extensively in section 7 of the manuscript "Conducting social network analysis with animal telemetry data: Applications and methods using spatsoc".

Grouping points with spatsoc is intended to be 1) temporal grouping followed by 2) spatial grouping within temporal groups.
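As a rough sketch of that two-step order, using the example data shipped with spatsoc (the 5 minute and 50 unit thresholds are only illustrative, not recommendations for any particular dataset):

```r
library(data.table)
library(spatsoc)

# Read the example data included with spatsoc
DT <- fread(system.file("extdata", "DT.csv", package = "spatsoc"))
DT[, datetime := as.POSIXct(datetime, tz = "UTC")]

# 1) Temporal grouping: adds a 'timegroup' column to DT
group_times(DT, datetime = "datetime", threshold = "5 minutes")

# 2) Spatial grouping within each timegroup: adds a 'group' column to DT
group_pts(DT, threshold = 50, id = "ID",
          coords = c("X", "Y"), timegroup = "timegroup")
```

The spatial threshold is in the units of the coordinates, so it only makes sense for projected coordinates.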

Temporal grouping was originally designed to handle the imprecision of capture at regular intervals for GPS collars. For example, fixes on a 2 hour fix rate might be off by 1 or 2 minutes.

The warning message ("found duplicate id in a timegroup and/or splitBy - does your group_times threshold match the fix rate?") indicates that your temporal grouping includes the same individual multiple times within a timegroup. This is likely not your intention: do you want to spatially group the same individual with itself?

One way to check is:

# Load packages
library(data.table)
library(spatsoc)

# Read example data
DT <- fread(system.file("extdata", "DT.csv", package = "spatsoc"))

# Cast the character column to POSIXct
DT[, datetime := as.POSIXct(datetime, tz = 'UTC')]

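# Group relocations into timegroups with a short temporal threshold
group_times(DT, datetime = 'datetime', threshold = '5 minutes')

# Check number of observations of each individual in each timegroup
DT[, .N, by = .(ID, timegroup)][N > 1]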
Empty data.table (0 rows and 3 cols): ID,timegroup,N

Alternatively, with a much larger temporal threshold (greater than the fix rate), we can see multiple observations of the same individual in the same timegroup.

group_times(DT, datetime = 'datetime', threshold = '6 hours')

# Check number of observations of each individual in each timegroup
DT[, .N, by = .(ID, timegroup)][N >1]
          ID timegroup     N
      <char>     <int> <int>
   1:      A         1     2
   2:      A         2     3
   3:      A         3     3
   4:      A         4     3
   5:      A         5     3
  ---                       
4788:      J       476     3
4789:      J       477     3
4790:      J       478     3
4791:      J       479     3
4792:      J       480     3

In your case, maybe you need to check if there are duplicate rows per individual and datetime? What is the temporal resolution/frequency of observation of your data?

DT[, .N, by = .(ID, datetime)][N > 1]

echinorhinus commented 4 years ago

Thanks Alec, I'll work through your responses and see what develops. I am trying to examine the role of social structure in co-occurrence at aggregation and residency hotspots. My data are passive acoustic telemetry detections across a large (spatially heterogeneous) array, over a number of years, from highly mobile/migratory animals that also display seasonal residency/philopatry. Thus "locations" are temporal instances of presence at a given station. So, temporal resolution and frequency can be highly variable, and depend upon an animal being within detection range of an acoustic receiver. For example, individuals might be detected multiple times an hour or day, across a period of days to months, at a given station, or alternatively may only be detected once or twice. There should not be duplicate rows per individual for the exact same datetime, but there could be multiple rows for an individual within a given time period (5 mins, 1 hour etc).

robitalec commented 4 years ago

Neat, ok. I think you should definitely be able to use spatsoc; it just means you need to find a temporal threshold that is biologically relevant and also fits within the minimum resolution of your data. Or you could reduce the really high resolution data to be more similar to the moderate resolution (e.g. reducing 2 locations for an individual within 5 minutes to 1 location).
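A minimal sketch of both ideas with data.table, assuming made-up detections (the IDs, station and timestamps below are hypothetical) and an arbitrary 5 minute bin. It first looks at the gaps between consecutive detections of each individual, then keeps the earliest detection per individual per bin:

```r
library(data.table)

# Hypothetical detections: "A" is detected three times within a few minutes, "B" once
det <- data.table(
  ID = c("A", "A", "A", "B"),
  station = c("S1", "S1", "S1", "S1"),
  datetime = as.POSIXct(
    c("2020-01-01 00:01:00", "2020-01-01 00:03:30",
      "2020-01-01 00:07:00", "2020-01-01 00:02:00"),
    tz = "UTC"
  )
)

# Minutes between consecutive detections of the same individual
setorder(det, ID, datetime)
det[, gap_mins := as.numeric(difftime(datetime, shift(datetime), units = "mins")),
    by = ID]
det[!is.na(gap_mins),
    .(min_gap = min(gap_mins), median_gap = median(gap_mins)),
    by = ID]

# Assign each detection to a fixed 5 minute bin (an assumption, adjust as needed)
bin_seconds <- 5 * 60
det[, timebin := as.numeric(datetime) %/% bin_seconds]

# Keep the earliest detection per individual per bin (rows are already time-ordered)
thinned <- unique(det, by = c("ID", "timebin"))
```

Fixed bins like these are not necessarily aligned with the timegroups that group_times builds, so it is worth re-running the duplicate check on ID and timegroup after thinning.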

echinorhinus commented 4 years ago

Thanks. Will keep at it.