jedalong / wildlifeTG

R Package for Time Geographic Analysis of Wildlife Tracking Data

form of decay function #2

Open mpadge opened 7 years ago

mpadge commented 7 years ago

Jed, can you please describe again the issue you were having in working out the form of your decay function (or however you conceived of it)? By that I mean your uncertainty about whether it should be exponential or take some other form. I'm keen to have a bash at that, because it is directly related to my current project.

jedalong commented 7 years ago

Hi Mark, here is my work-in-progress explanation. Note that this is less mathematically elegant than RSP's; however, it explicitly considers temporal constraints, which in my view are important and are overlooked by all the elegant mathematical solutions that I am aware of.

\subsection{Field-based time geography} The construction of field-based time geography follows classic time geography by considering the intersection of space-time cones. Consider any intermediate time point $t$ between two anchors $a$ and $b$, where $t_a < t < t_b$. For a location $i$ (typically a pixel) we define two accumulated cost surfaces: $T_{ai}$, the cost (in units of time) of travelling from location $a$ to location $i$ over the network $N$, and $T_{ib}$, the corresponding cost from location $i$ to location $b$. If location $i$ is accessible at time $t$ (i.e., $T_{ai} \le t - t_a$ and $T_{ib} \le t_b - t$), then location $i$ is within the potential path space (PPS) at time $t$ ($i \in PPS_t$). Any locations outside of the PPS are excluded from further calculations.
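The accessibility test above can be sketched numerically. This is a minimal illustration in Python with NumPy, not code from the wildlifeTG package. The function name `pps_mask` and the toy corridor are hypothetical, and the cost surfaces $T_{ai}$ and $T_{ib}$ are assumed to have been computed beforehand.

```python
import numpy as np

def pps_mask(T_ai, T_ib, t, t_a, t_b):
    """Boolean mask of the cells inside the potential path space at time t.

    T_ai: travel time from anchor a to each cell i (assumed precomputed).
    T_ib: travel time from each cell i to anchor b (assumed precomputed).
    """
    T_ai = np.asarray(T_ai, dtype=float)
    T_ib = np.asarray(T_ib, dtype=float)
    # A cell is accessible if it can be reached from a in the time elapsed
    # so far, and b can still be reached from it in the time remaining.
    return (T_ai <= t - t_a) & (T_ib <= t_b - t)

# Hypothetical one-dimensional corridor of five cells: a sits in cell 0,
# b sits in cell 4, and crossing one cell takes one time unit.
T_ai = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
T_ib = np.array([4.0, 3.0, 2.0, 1.0, 0.0])

mask = pps_mask(T_ai, T_ib, t=3.0, t_a=0.0, t_b=6.0)
print(mask)
```

Halfway through this toy prism neither anchor cell is still reachable within the constraints, so only the three interior cells remain in $PPS_t$.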

To model the probability of travelling from location $[a, t_a]$ through location $[i, t]$ to location $[b, t_b]$, a model of what is \textit{expected} is useful. In field-based time geography, the expectation is that the object will follow the trajectory associated with the \textit{shortest-time path} between $a$ and $b$, whose duration $T^*_{ab}$ we can compute. Movement probabilities are then estimated from the deviations, measured as \textit{time}, from the shortest-time path. For any location $i$ and time $t$, the deviation from the shortest-time path, $\Delta T_{i,t}$, is defined as:

\begin{equation}\label{eq:1}
\Delta T_{i,t} = \sqrt{\left(T_{ai} - \delta_t T^*_{ab} \right)^2} + \sqrt{\left(T_{ib} - (1 - \delta_t) T^*_{ab} \right)^2}
\end{equation}

\noindent where $T^*_{ab}$ is the time duration associated with the shortest-time path from $a$ to $b$ and $\delta_t = \frac{t - t_a}{t_b - t_a}$. Such a formulation assumes the object will move proportionally along the shortest-time path (see the example in Box 1). Such a model draws on the theoretical idea that movement will typically follow the path of least resistance \citep{Haggett1977}, which is based on the \textit{principle of least effort} \citep{Zipf1949}. The location $i$ associated with the trajectory of the shortest-time path at time $t$ will have $\Delta T_{i,t} = 0$. As a location deviates further from the trajectory associated with the shortest-time path, its $\Delta T_{i,t}$ value increases. In field-based time geography, the quantity of interest is an estimate of the \textit{probability} that an object was at a location at a given time, $P_{i,t}$. Thus, we must define a function to transform the time deviations ($\Delta T_{i,t}$) from equation (\ref{eq:1}) into probability values.
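As a rough numerical check of equation (\ref{eq:1}), the deviation can be computed directly from the two cost surfaces. The sketch below is illustrative only: the function name `time_deviation` and the toy corridor are hypothetical, and it relies on the identity $\sqrt{x^2} = |x|$ to write each term as an absolute value.

```python
import numpy as np

def time_deviation(T_ai, T_ib, T_ab, t, t_a, t_b):
    """Deviation (in time units) of each cell from the shortest-time path at time t."""
    T_ai = np.asarray(T_ai, dtype=float)
    T_ib = np.asarray(T_ib, dtype=float)
    # Fraction of the time budget that has elapsed at time t.
    delta_t = (t - t_a) / (t_b - t_a)
    # sqrt(x**2) equals |x|, so each term of the equation is an absolute value.
    return (np.abs(T_ai - delta_t * T_ab)
            + np.abs(T_ib - (1.0 - delta_t) * T_ab))

# Hypothetical corridor of five cells; the shortest-time path from a (cell 0)
# to b (cell 4) takes T_ab = 4 time units.
T_ai = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
T_ib = np.array([4.0, 3.0, 2.0, 1.0, 0.0])

dT = time_deviation(T_ai, T_ib, T_ab=4.0, t=4.0, t_a=0.0, t_b=8.0)
print(dT)
```

At the midpoint of the time budget ($\delta_t = 0.5$) the expected position is the middle cell, which therefore has zero deviation, and the deviation grows symmetrically towards either anchor.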

There are, however, many potential mathematical functions that we could use to define $P_{i,t}$ \citep[see Table~\ref{tbl:1}, which is developed after][]{Taylor1971,Haggett1977}. The most straightforward way to model movement probabilities in the field-based space-time prism is to estimate the probability that the individual visited location $i$ at time $t$ as proportional to the inverse of the time deviation. However, there is a growing trend to use inverse-squared functions, typically in spatial interaction models \citep{Haynes2003}. Alternatively, negative exponential functions have the firmest theoretical foundation for modelling the decrease in activity as a function of distance, cost, or time \citep{Haynes2003,Handy1997,Wilson1967}.

\begin{table}[h]
\caption{Potential functions used to derive probabilities from $\Delta T_{i,t}$ in field-based time geography.}
\label{tbl:1}
\begin{tabular}{lc}
\hline
Function & Formula* \\
\hline
Inverse & $\frac{c_1}{c_2+s}$ \\
Inverse-squared & $\frac{c_1}{c_2+s^2}$ \\
Exponential & $c_1 e^{-c_2 s}$ \\
Normal & $c_1 e^{-c_2 s^2}$ \\
Root exponential & $c_1 e^{-c_2 s^{\frac{1}{2}}}$ \\
Log-Normal & $c_1 e^{-c_2 (\log s)^2}$ \\
\hline
\end{tabular}
\end{table}

\noindent The scaling parameter $c_1$ is used to standardize the $P_{i,t}$ so that $\sum_{i} P_{i,t} = 1$ at any time $t$. Scaling the $P_{i,t}$ in such a way via $c_1$ accounts for variations in the size and structure of the $PPS_t$ \citep[see][]{Winter2011,Song2014}.

\begin{equation} \label{eq:6}
c_1 = \frac{1}{\sum_{\forall j} P_{j,t}}, \quad j \in PPS_t
\end{equation}
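The transformation from time deviations to probabilities might be sketched as follows. This is a hedged illustration rather than the package's implementation. It assumes that $s$ in the table stands for the time deviation $\Delta T_{i,t}$, that the sum in the scaling step runs over the unnormalised values within $PPS_t$, and that the names `DECAY` and `probabilities` are hypothetical. Only four of the tabulated forms are included.

```python
import numpy as np

# Unnormalised versions of four tabulated forms; the scaling constant c_1
# is applied afterwards by dividing by the sum over the PPS.
DECAY = {
    "inverse": lambda s, c2: 1.0 / (c2 + s),
    "inverse_squared": lambda s, c2: 1.0 / (c2 + s**2),
    "exponential": lambda s, c2: np.exp(-c2 * s),
    "normal": lambda s, c2: np.exp(-c2 * s**2),
}

def probabilities(dT, in_pps, c2, form="exponential"):
    """Turn time deviations into probabilities that sum to 1 over the PPS."""
    dT = np.asarray(dT, dtype=float)
    in_pps = np.asarray(in_pps, dtype=bool)
    weights = np.zeros_like(dT)
    # Cells outside the PPS keep a weight (and hence a probability) of zero.
    weights[in_pps] = DECAY[form](dT[in_pps], c2)
    total = weights.sum()
    if not np.isfinite(total) or total <= 0:
        raise ValueError("weights cannot be normalised; check c2 and the PPS")
    return weights / total  # division by the sum plays the role of c_1

# Hypothetical deviations and PPS mask for a five-cell corridor.
dT = np.array([4.0, 2.0, 0.0, 2.0, 4.0])
in_pps = np.array([False, True, True, True, False])

P = {form: probabilities(dT, in_pps, c2=0.5, form=form) for form in DECAY}
for form, p in P.items():
    print(form, np.round(p, 3))
```

Every form puts the highest probability on the cell with zero deviation. The inverse forms are undefined at $s = 0$ when $c_2 = 0$, which is why the sketch raises an error if the weights cannot be normalised.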

The tuning parameter $c_2 \ge 0$ has a significant influence on the modelled probabilities in the field-based time geography model. It is a decay parameter that controls the strength of the decay function chosen from Table~\ref{tbl:1}. Lower values ($c_2 \approx 0$) model weaker decay and thus assign higher probabilities to locations deviating from the shortest-time path. Higher values ($c_2 \gg 0$) model stronger decay and thus assign much lower probabilities to locations deviating further from the shortest-time path. In nearly all applied scenarios $c_2$ will be unknown, but it can be estimated empirically from the data (e.g., GPS tracking data) using a leave-one-out numerical estimation procedure \citep[similar to that proposed by][for Brownian bridges]{Horne2007}.
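The effect of $c_2$ can be seen in a small numerical example. The sketch below uses the exponential form only and assumes that every cell shown lies inside $PPS_t$. The deviations and the name `exp_probabilities` are hypothetical, and the leave-one-out estimation itself is not implemented here.

```python
import numpy as np

def exp_probabilities(dT, c2):
    """Exponential decay of the time deviation, normalised to sum to 1."""
    weights = np.exp(-c2 * np.asarray(dT, dtype=float))
    return weights / weights.sum()

# Hypothetical deviations for five cells, all assumed to be inside the PPS.
dT = np.array([4.0, 2.0, 0.0, 2.0, 4.0])

for c2 in (0.0, 0.5, 5.0):
    print(c2, np.round(exp_probabilities(dT, c2), 3))
```

With $c_2 = 0$ every cell in the PPS is equally likely, which corresponds to a uniform prism. As $c_2$ grows, nearly all of the probability collapses onto the shortest-time path.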

The $P_{i,t}$ can be used to study the internal movement probabilities within field-based space-time prisms. Several types of further analysis allow the $P_{i,t}$ to be put to practical use. First, a map of the $P_{i,t}$ for any given $t$ can be used to quantify movement potential at a specific time. Both \citet{Winter2010} and \citet{Song2014} use incremental maps to demonstrate how the $P_{i,t}$ change through time within a space-time prism. Such a mapping is useful for visualizing and analyzing the potential movement probabilities at a particular time.

Second, the cumulative visit probability ($P_i$) for any location $i$ over the entire time interval between $t_a$ and $t_b$ can be calculated by integrating the $P_{i,t}$ over time.

\begin{equation} \label{eq:7}
P_i = \int_{t_a}^{t_b} P_{i,t} \, dt
\end{equation}

\noindent In practice, the integral in equation (\ref{eq:7}) is not easy to calculate, but it can be approximated numerically by taking a set of equally spaced times between $t_a$ and $t_b$ (i.e., $t_a < t_k < t_b$) and performing numerical integration using the trapezoid rule. For any space-time prism the sum of the $P_i$ is equal to the time budget of the prism, that is, $\sum P_i = t_b - t_a$. This definition of $P_i$ is powerful because it facilitates easy interpretation of modelled probabilities relative to the overall time budget: the $P_i$ can be interpreted as the expected time spent at each location or as relative visit probabilities. The map of the $P_i$ for the entire space-time prism represents the probabilistic version of the potential path area, that is, the projection of the space-time prism onto the spatial plane.
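The trapezoid approximation and the time-budget property can be checked on a toy example. This sketch is illustrative only: the probability rows are invented, the name `cumulative_visit` is hypothetical, and it assumes that the two end times are included, with all probability at the anchors $a$ and $b$, so that the integral spans the whole interval.

```python
import numpy as np

def cumulative_visit(P_it, times):
    """Trapezoid-rule approximation of P_i, the integral of P_{i,t} over time.

    P_it:  array of shape (n_times, n_cells); each row sums to 1.
    times: increasing sample times, including t_a and t_b (an assumption).
    """
    P_it = np.asarray(P_it, dtype=float)
    times = np.asarray(times, dtype=float)
    dt = np.diff(times)                  # width of each time step
    mid = 0.5 * (P_it[1:] + P_it[:-1])   # average of neighbouring rows
    return (mid * dt[:, None]).sum(axis=0)

# Hypothetical probabilities for five cells at five equally spaced times.
times = np.linspace(0.0, 8.0, 5)
P_it = np.array([
    [1.0, 0.0, 0.0, 0.0, 0.0],   # t = t_a: the object is at anchor a
    [0.3, 0.5, 0.2, 0.0, 0.0],
    [0.0, 0.2, 0.6, 0.2, 0.0],
    [0.0, 0.0, 0.2, 0.5, 0.3],
    [0.0, 0.0, 0.0, 0.0, 1.0],   # t = t_b: the object is at anchor b
])

P_i = cumulative_visit(P_it, times)
print(P_i, P_i.sum())
```

Because every row sums to 1, the trapezoid rule integrates a constant over the interval, so the $P_i$ sum to $t_b - t_a$ regardless of how the probability is distributed among cells.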

jedalong commented 7 years ago

Related to the above, I think I will find (after I finish marking a pile of papers) that the shape of the function matters less than the value of c_2. This is similar to many other types of analysis, such as kernel density estimation, where the choice of bandwidth typically matters more than the choice of kernel.