DistanceDevelopment / distance-bugs

A place to keep bugs in Distance
http://distancesampling.org/Distance

Estimation of encounter rate variance (Bugzilla #92) #117

Closed dill closed 9 years ago

dill commented 9 years ago

status RESOLVED
severity critical
in component 06) CDS - Conventional distance sampling for ---
Reported in version 6.0 Release 2 on platform All
Assigned to: Laura Marshall

On 2011-04-07 21:39:18 +0100, Tim Ritter wrote:

I found a bug in the estimation of encounter rate variance. It may turn out to be a serious problem for many DISTANCE users, because it occurs with the default settings and affects the estimation of precision. I will illustrate the problem using the “Point Transect Example” dataset included in the sample projects collection provided with the software. For the example I used the CDS engine; however, the problem also occurred with the MCDS engine.

For the analytic estimation of encounter rate variance under the assumption of random sampler placement, two estimators are available: first, the design-derived estimator P2 (Fewster et al. 2009, Web Appendix B, eq. 24); second, the model-derived estimator P3 (Fewster et al. 2009, Web Appendix B, eq. 25), which is equivalent to eq. 3.79 of Buckland et al. (2001:79). From DISTANCE 6.0 on, P2 is the default setting, whereas older versions use P3 as the default. Both estimators reduce to estimator P1 (Fewster et al. 2009, Web Appendix B, eq. 23) if sampling effort is the same for all points (t_i = t for all i). This is the case for the “Point Transect Example” dataset. (Here sampling effort is 1 for all points (t_i = t = 1 for all i), so P1 could be simplified further by dropping t^2 in the denominator, but that is not important for the problem.) The crux of the matter is that both estimators (P2 and P3) should give exactly the same results for this dataset, but they do not.
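As a rough illustration of the claim that P2 and P3 coincide when effort is equal, here is a minimal Python sketch. The two formulas are written by analogy with the corresponding line-transect estimators and are an assumption that should be checked against Fewster et al. (2009), Web Appendix B, eqs. 24 and 25. The function names and the counts are hypothetical; they are not the “Point Transect Example” data and not the code used inside DISTANCE.

```python
from math import isclose


def er_var_p2(counts, effort):
    """Assumed design-derived form: k / (T^2 (k-1)) * sum t_i^2 (n_i/t_i - n/T)^2."""
    k = len(counts)
    total_effort = sum(effort)
    rate = sum(counts) / total_effort
    ss = sum(t ** 2 * (n / t - rate) ** 2 for n, t in zip(counts, effort))
    return k * ss / (total_effort ** 2 * (k - 1))


def er_var_p3(counts, effort):
    """Assumed model-derived form: 1 / (T (k-1)) * sum t_i (n_i/t_i - n/T)^2."""
    k = len(counts)
    total_effort = sum(effort)
    rate = sum(counts) / total_effort
    ss = sum(t * (n / t - rate) ** 2 for n, t in zip(counts, effort))
    return ss / (total_effort * (k - 1))


# Made-up detections per point (illustration only).
counts = [3, 7, 5, 2, 6, 4, 8, 5]
equal_effort = [1] * len(counts)          # t_i = t = 1 for all i
unequal_effort = [1, 2, 1, 1, 2, 1, 3, 1]  # visits differ between points

# With equal effort the two forms are algebraically identical.
same = isclose(er_var_p2(counts, equal_effort), er_var_p3(counts, equal_effort))
# With unequal effort they generally differ.
differ = not isclose(er_var_p2(counts, unequal_effort), er_var_p3(counts, unequal_effort))
print(same, differ)
```

Under these assumed forms, setting t_i = t makes the weights cancel in both expressions, which is why any difference between P2 and P3 on an equal-effort dataset points to an implementation problem rather than a property of the estimators.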

When fitting a half normal detection function without adjustment terms and without truncation, I obtained the following values:

Estimator | n/k | %CV
P2 | 4.80 | 3.75
P3 | 4.80 | 8.84
Bootstrap (999 resamples) | 4.79 | 8.44

The variance of the object density, var(D), is estimated by the delta method (Seber 1982: 7-9). One component of this estimator is the variance of n, which can be estimated by eq. 3.79 of Buckland et al. (2001:79). Because this equation is equivalent to the model-derived estimator P3, and therefore also to P2 for this dataset, the estimate of var(D) is affected as well, as can be seen below:

Estimator | D | %CV
P2 | 79.63 | 9.68
P3 | 79.63 | 12.56
Bootstrap (999 resamples) | 77.86 | 14.60
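The two tables can be cross-checked with a short Python sketch. It assumes the usual first-order delta-method approximation in which the squared CVs of independent components add, and that the encounter rate is the only component that differs between the P2 and P3 runs. The variable names are hypothetical. If the assumption holds, subtracting the encounter-rate part from the density CV should leave the same remainder for both estimators.

```python
from math import sqrt

# %CV of the encounter rate n/k and of density D, as reported above.
cv_encounter_rate = {"P2": 3.75, "P3": 8.84}
cv_density = {"P2": 9.68, "P3": 12.56}

# Assumed decomposition: CV(D)^2 = CV(n/k)^2 + (all other components)^2.
remaining = {
    name: sqrt(cv_density[name] ** 2 - cv_encounter_rate[name] ** 2)
    for name in cv_encounter_rate
}

for name, value in remaining.items():
    print(f"{name}: remaining %CV = {value:.2f}")
# P2: remaining %CV = 8.92
# P3: remaining %CV = 8.92
```

The matching remainder suggests that the discrepancy in the CV of D comes entirely from the encounter rate component.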

At first glance, the results of P3 seem more convincing, because they match the results obtained by bootstrapping. To be sure about that, I wrote a SAS macro to calculate P2 and P3 outside of DISTANCE; for both estimators I obtained the same results as with P3 inside DISTANCE.

I repeated the analysis described above with three different datasets and the results were the same in each case: my SAS macro and the DISTANCE version of P3 led to the same results, which matched the results obtained by bootstrapping. The DISTANCE version of P2 led to a much lower variance in all cases.

Kind regards, Tim

Tim Ritter, M.Sc.

Georg-August-University Department Ecoinformatics, Biometrics and Forest Growth

Büsgenweg 4 37077 Göttingen GERMANY

Fon: +49 (0)551 39-3462 Fax: +49 (0)551 39-3465

On 2011-04-08 16:16:32 +0100, Laura Marshall wrote:

We have confirmed that there is a problem with the P2 estimator in the CDS analysis engine within Distance. We aim to resolve this issue and release a new version of Distance in the near future.

On 2011-04-13 04:49:53 +0100, Len Thomas wrote:

The problem was that for estimator P2, a routine was returning the variance of the estimated encounter rate, when it should have been the standard error -- i.e., a square root was not being taken.

Typically the standard error of the encounter rate is less than 1, so the variance (its square) is smaller still. Using the variance in place of the standard error has therefore caused Distance to report a standard error and CV that are lower than they should be.
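This explanation can be checked against the figures in the original report with a few lines of Python. The sketch takes the P3 output (n/k = 4.80, %CV = 8.84) as the correct result and computes the CV that would be reported if the variance were used where the standard error belongs. The variable names are hypothetical.

```python
n_per_k = 4.80   # encounter rate reported for both P2 and P3
cv_p3 = 0.0884   # CV reported for P3, as a proportion

se = cv_p3 * n_per_k    # standard error implied by the P3 output
var = se ** 2           # variance of the estimated encounter rate
cv_bug = var / n_per_k  # CV obtained if the variance is used in place of the SE

print(f"SE = {se:.4f}, variance = {var:.4f}, buggy CV = {100 * cv_bug:.2f}%")
# SE = 0.4243, variance = 0.1800, buggy CV = 3.75%
```

The result reproduces the 3.75 %CV that Distance reported for P2, which is consistent with a missing square root.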

We will issue a new release of Distance shortly. Since P2 is the default estimator of encounter rate variance, we recommend that CDS and MCDS point transect analyses that were run in Distance 6.0 Releases 1 and 2 be re-run in the new release to check the analytic variance estimator.

We apologize for any problems caused by this bug.

On 2014-04-22 21:52:10 +0100, Len Thomas wrote:

Fixed in Distance 6.2