Closed erex closed 4 years ago
## ---- readin, message=FALSE------------------------------------------------------------------------
library(Distance)
data("DuikerCameraTraps")
## ----fit-------------------------------------------------------------------------------------------
conversion <- convert_units("meter", NULL, "square kilometer")
trunc.list <- list(left=2, right=15)
peak.hn <- ds(DuikerCameraTraps, transect = "point", key="hn", adjustment="herm",
cutpoints = c(seq(2,8,1), 10, 12, 15), truncation = trunc.list)
peak.unicos <- ds(DuikerCameraTraps, transect = "point", key="unif", adjustment = "cos",
cutpoints = c(seq(2,8,1), 10, 12, 15), truncation = trunc.list)
peak.hr <- ds(DuikerCameraTraps, transect = "point", key="hr", adjustment = "cos",
cutpoints = c(seq(2,8,1), 10, 12, 15), truncation = trunc.list)
## ---- sumtab---------------------------------------------------------------------------------------
knitr::kable(summarize_ds_models(peak.hn, peak.unicos, peak.hr), digits = 3,
caption="Model selection for three key functions fitted to duiker peak activity data")
## --------------------------------------------------------------------------------------------------
p_a <- peak.hr$ddf$fitted[1]
w <- 15
rho <- sqrt(p_a * w^2)
## --------------------------------------------------------------------------------------------------
par(mfrow=c(1,2))
plot(peak.hr, main="Peak activity", xlab="Distance (m)",
showpoints=FALSE, lwd=3, xlim=c(0, 15))
plot(peak.hr, main="Peak activity", xlab="Distance (m)", pdf=TRUE,
showpoints=FALSE, lwd=3, xlim=c(0, 15))
## ---- echo=FALSE-----------------------------------------------------------------------------------
par(mfrow=c(1,1))
## ---- sampfrac-------------------------------------------------------------------------------------
viewangle <- 42 # degrees
samfrac <- viewangle / 360
conversion <- convert_units("meter", NULL, "square kilometer")
peak.hr.dens <- dht2(peak.hr, flatfile=DuikerCameraTraps, strat_formula = ~1,
sample_fraction = samfrac, er_est = "P2", convert_units = conversion)
print(peak.hr.dens, report="density")
## ---- bootstrap, results='hide'--------------------------------------------------------------------
viewangle <- 42 # degrees
samfrac <- viewangle / 360
mysummary <- function(ests, fit){
return(data.frame(Dhat = ests$individuals$D$Estimate))
}
duiker.boot.hr <- bootdht(model=peak.hr, flatfile=DuikerCameraTraps, resample_transects = TRUE,
nboot=250, summary_fun=mysummary, sample_fraction = samfrac,
convert.units = conversion)
## ----bootresult------------------------------------------------------------------------------------
print(summary(duiker.boot.hr))
## ---- fig.width=8, fig.cap="Distribution of density estimates from bootstrap replicates."----------
hist(duiker.boot.hr$Dhat[duiker.boot.hr$Dhat<100], breaks = 20,
xlab="Estimated density", main="D-hat estimates bootstraps")
abline(v=quantile(duiker.boot.hr$Dhat, probs = c(0.025,0.975)), lwd=2, lty=3)
Minor comments about speed and output (not addressing the main point, which I'll leave for @dill!): Yup, it's a big dataset! Could speed up a bit by specifying no adjustments, since none selected in original analysis. Longer term, it would be good to have an option to parallelize the bootstrap - add as an enhancement, @dill? If this is going into a vignette, @erex, you could consider specifying the P3 er.var in the first runs to suppress the warning message (and, @dill, you could potentially remove that warning in future releases since it specifies in the help that P3 is the default for point transects).
To answer @erex's question first: bootdht
only calls dht
not dht2
.
You are right, the same bug is in dht
as in dht2
see here, there is another mention of the same code snippet here but I believe that this is correct as it's finding the covered area, so needs to have left truncation included.
This is a bit of a horrendous dataset to do any testing on. For future reference, in case it's useful to do simple checks an easy option is to use debug::mtrace
to get inside bootdht
then mtrace
the bootit
function within, you can then replace the data with the original data to see if the results match dht2
(which they didn't in this case).
Regarding the warning message, it originates in dht
, and is there in case one specifies a non-default option (like say O2) which you can't use with points. I'll look into a better way of presenting this in future and move this to a separate issue for next release.
@erex I'm wondering if it is your intention to re-run the above simulation to check the above. If so, perhaps I can write a rough version of the parallelisation code (not to be included in the 1.0.1 release) to speed things up and then run this on bluewhale on a lot of cores. That way we can get this wrapped-up and the releases of mrds
and Distance
off to CRAN today.
@dill Happy to re-run the duiker example just now. I was thinking you might be changing the version number on mrds before I did so allowing me to reference that versioning number in a report back on bootstrap.
Rather than hassle you with a quick-and-dirty bootstrap, If I start now, maybe 100 reps could get done by mid-afternoon. OK?
Sorry, forgot to bump the version. Now done.
That sounds fine as a plan.
ok: installing 2.2.2.9009 just now, will hit the run button
@dill @lenthomas Here's a positive development: the bootstrap with 100 reps only took a bit over an hour (turning off adjustments must have had a large effect). Based upon the 100 reps, the mean and median of the sampling distribution for D-hat is much closer to the point estimate produced by the single call to dht2
see figure below
recall dht2 reports D-hat = 14.51, with CI= (7.8, 26.9)
That's great, thanks Eric. I'll close this in that case.
I'll also add a note to the bootdht
manual page to let people know that adjustments may add a significant amount time to their bootstraps.
@dill
I ask because this commit fixed the left truncation problem for the duikers, such that
dht2
produces the correct point estimate of density.But is this correct version of
dht2
actually used bybootdht
? My reading of bootit withinbootdht
does not invokedht2
I raise this because running a sufficient number (250) replicates of the duiker analysis produces a mean of the sampling distribution that is 50% larger than the point estimate produced by
dht2
I'm not sure if this difference between point estimate and mean of sampling distribution is just chance, but the sampling distribution mean is similar to the point estimate that was calculated prior to the correction applied to
dht2
in the 13July20 commit. Is it coincidence? I don't know.