DistanceDevelopment / distance-bugs

A place to keep bugs in Distance
http://distancesampling.org/Distance

summary.dsm.var.movblk not picking up proper CV bits? #151

Open erex opened 9 years ago

erex commented 9 years ago

Running the moving block bootstrap via D7B1 generates this R code:

This is mgcv 1.8-7. For overview type 'help("mgcv-package")'.
Loading required package: mrds
This is mrds 2.1.14
Built: R 3.2.1; ; 2015-07-30 07:49:18 UTC; windows
This is dsm 2.2.9
Built: R 3.2.1; ; 2015-07-30 07:47:50 UTC; windows
...
> dsm.var.8<-dsm.var.movblk(dsm.object=dsm.tmp, pred.data = grid.6, n.boot=99, block.size=3, off.set=271.24, samp.unit.name='Bootstrap.Sample.Label',

produces this result (for spotted dolphins)

Boxplot coeff     : 1.5 
Replicates        : 99 
Outliers          : 8 
Infinites         : 0 
NAs               : 0 
NaNs              : 0 
Usable replicates : 91 (91.91919%)
Approximate asymptotic bootstrap confidence interval:
      5%     Mean      95% 
10069.06 26790.74 71282.06 
(Using delta method)

Point estimate                 : 26790.74 
Standard error                 : 14254.63 
CV of detection function       : 0.3762299 
CV from bootstrap              : 0.3762 
Total coefficient of variation : 0.5321 

I am suspicious that CV of detection function = CV from bootstrap to 4 decimal places.
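As a sanity check on the summary above, the printed figures can be compared against a simple delta-method combination in which the squared CVs add. The sketch below assumes that this is how the total CV is formed and that the standard error is the total CV times the point estimate; neither is confirmed in this thread, and the variable names are illustrative rather than taken from dsm.

```r
# Values copied from the summary output above
cv_detfn <- 0.3762299   # CV of detection function
cv_boot  <- 0.3762      # CV from bootstrap (printed to 4 decimal places only)
n_hat    <- 26790.74    # point estimate
se_rep   <- 14254.63    # reported standard error

# Assumed delta-method combination: squared CVs add
cv_total <- sqrt(cv_detfn^2 + cv_boot^2)
se_total <- cv_total * n_hat

# Compare with the reported 0.5321 and 14254.63; small differences are
# expected because cv_boot is rounded in the printout
print(cv_total, digits = 7)
print(se_total, digits = 7)

# Under the same assumptions, back out the bootstrap CV to more digits
# than the summary prints, to see how closely it matches cv_detfn
cv_boot_implied <- sqrt((se_rep / n_hat)^2 - cv_detfn^2)
print(cv_boot_implied, digits = 7)
print(cv_boot_implied - cv_detfn, digits = 3)
```

If the two components really were the same quantity picked up twice, the implied bootstrap CV would match the detection function CV to all printed digits, not only to four.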

dill commented 9 years ago

Can you confirm which version of dsm is being used in D7B1?


erex commented 9 years ago

Look at the message at the top of my description: 2.2.9 built 30 July 2015

dill commented 9 years ago

Whoops, sorry missed that.

I can't seem to reproduce this on my machine and think this was fixed in 2642497aa5b7a1c05ff3b50fa718a31aaee2b976.

Can you e-mail me the RData files and/or code necessary to reproduce this in R?

erex commented 9 years ago

A lovely 12 MB .RData file (duplicating the spotted dolphins analysis for the Advanced Distance Workshop) can be found at https://www.dropbox.com/s/40uem93s0sj2gmv/.RData?dl=0

and code from D7B1 log file is

file.create('C:\Users\eric\AppData\Local\Temp\dst29146\res.r')
[1] TRUE
file.create('C:\Users\eric\AppData\Local\Temp\dst29146\stat.r')
[1] TRUE

sink(file='C:\Users\eric\AppData\Local\Temp\dst29146\res.r',append=T)
win.metafile(filename='visgam8.%02d.wmf',width=7,height=5,pointsize=12)
latname<-all.vars(dsm.5$formula)[grep('lat',all.vars(dsm.5$formula))]
lonname<-all.vars(dsm.5$formula)[grep('lon',all.vars(dsm.5$formula))]
vis.gam(dsm.5, main='dsm.5', too.far=0.05,plot.type='contour', type='response', view=c(lonname,latname),lwd=2,lty=1,pch=1,cex=1)
cat('\tResponse Surface/Plot: Visgam-Plot\t\n')
dev.off()

grid.8<-read.table(file='C:\Users\eric\AppData\Local\Temp\dst29146\pred.cov.dat.r', header=TRUE, sep='\t', comment.char='')
dsm.predict.8<- predict(dsm.5,newdata=grid.8, off.set=271.24)

sink(file='C:\Users\eric\AppData\Local\Temp\dst29146\res.r',append=T)
cat('\tResponse Surface/Prediction\t\n')

predict.lyr('C:\Users\eric\AppData\Local\Temp\dst29146\pred.lyr.dat.r',dsm.predict.8)
sink()
tmpobject<- predict(dsm.5,newdata=grid.8, off.set=1)

write.table(tmpobject,file='C:\Users\eric\AppData\Local\Temp\dst29146\pred.res.r', quote=FALSE, col.names=FALSE, sep='\t')

sink(file='C:\Users\eric\AppData\Local\Temp\dst29146\stat.r',append=T)
cat('4030', sum(dsm.predict.8,na.rm=TRUE), '\n')
sink()

var.dat<-read.table(file='C:\Users\eric\AppData\Local\Temp\dst29146\var.dat.r', header=TRUE, sep='\t', comment.char='')

sink(file='C:\Users\eric\AppData\Local\Temp\dst29146\res.r',append=T)
cat('\tResponse Surface/Variance: Bootstrapped measure of precision\t\n')
win.metafile(filename='bootst8.%02d.wmf',width=7,height=5,pointsize=12)
dsm.tmp<-dsm.5
dsm.tmp$data<-merge(dsm.tmp$data,var.dat)
dsm.var.8<-dsm.var.movblk(dsm.object=dsm.tmp, pred.data = grid.6, n.boot=99, block.size=3, off.set=271.24, samp.unit.name='Bootstrap.Sample.Label', progress.file='C:\Users\eric\AppData\Local\Temp\dst29146\bootprog.txt', bar = FALSE)



dill commented 9 years ago

Running this code with dsm 2.2.9 from CRAN does not reproduce the result that you have. I don't understand why this is the case.

Have you tried installing the R packages again from scratch or starting with a fresh Distance install?
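When results differ between two machines, it can help to confirm that both R sessions actually load the same package versions before and after a reinstall. A minimal sketch in R follows; `pkg_versions` is a hypothetical helper written for this example, not something provided by dsm or Distance.

```r
# Report the installed version of each package, or NA if it is not installed.
# `pkg_versions` is a hypothetical helper name used only in this sketch.
pkg_versions <- function(pkgs) {
  vapply(pkgs, function(p) {
    if (requireNamespace(p, quietly = TRUE)) {
      as.character(utils::packageVersion(p))
    } else {
      NA_character_
    }
  }, character(1))
}

# Packages named in the startup messages at the top of this issue
print(pkg_versions(c("dsm", "mrds", "mgcv")))

# The R version itself can differ between machines too
print(R.version.string)
```

Running this in the session that D7B1 launches and in a standalone R session would show whether the two are picking up different libraries.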

erex commented 9 years ago

Same Distance project, same data, same analysis, simply re-run (0820 6 August):

Summary of bootstrap uncertainty in a density surface model
Detection function uncertainty incorporated using the delta method.

Boxplot coeff     : 1.5 
Replicates        : 99 
Outliers          : 5 
Infinites         : 0 
NAs               : 0 
NaNs              : 0 
Usable replicates : 94 (94.94949%)
Approximate asymptotic bootstrap confidence interval:
       5%      Mean       95% 
 9640.183 26790.739 74453.325 
(Using delta method)

Point estimate                 : 26790.74 
Standard error                 : 14976.98 
CV of detection function       : 0.3762299 
CV from bootstrap              : 0.4135 
Total coefficient of variation : 0.559 

Could it really have been chance that produced CV(gam)===CV(detfn) to 4 decimal places?? What are the chances of that?

dill commented 9 years ago

I find CV(gam)===CV(detfn) highly unlikely, but I don't see how we can debug something that can't be reproduced. This is definitely unsatisfying.
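Part of the difficulty is that a bootstrap is random, so two runs on the same data will generally give different outlier counts and CVs, as the two summaries in this thread do. Fixing the random seed before the run is the usual way to make a bootstrap result repeatable. The sketch below is a toy in base R, not dsm.var.movblk: it bootstraps a mean on made-up data and drops boxplot outliers with a coefficient of 1.5, on the assumption that the "Boxplot coeff" and "Outliers" lines in the summary refer to that kind of trimming. Whether D7B1 exposes a way to set the seed is not known from this thread.

```r
# Toy illustration only: a plain bootstrap of a mean on simulated data,
# showing run-to-run variation and the effect of fixing the seed.
boot_cv <- function(x, n_boot = 99, coef = 1.5) {
  reps <- replicate(n_boot, mean(sample(x, replace = TRUE)))
  # Drop replicates flagged as boxplot outliers
  out <- boxplot.stats(reps, coef = coef)$out
  usable <- reps[!reps %in% out]
  c(cv = sd(usable) / mean(usable), usable = length(usable))
}

set.seed(1)
x <- rlnorm(50)   # simulated, skewed data (made up for this sketch)

# Two calls without resetting the seed draw different resamples,
# so the CV and the usable-replicate count can both change
print(boot_cv(x))
print(boot_cv(x))

# Resetting the seed before each call reproduces the run exactly
set.seed(42); a <- boot_cv(x)
set.seed(42); b <- boot_cv(x)
print(identical(a, b))
```

Recording the seed alongside the saved output would let a suspicious result, such as the first summary above, be regenerated and inspected rather than lost on re-run.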
