DistanceDevelopment / mrds

R package for mark-recapture-distance-sampling analysis
GNU General Public License v3.0

Mcds dot exe #69

Closed dill closed 1 year ago

dill commented 1 year ago

Initial working MCDS.exe interface. Only tested in an Ubuntu Docker container so far; still needs testing on Windows.

codecov-commenter commented 1 year ago

Codecov Report

Merging #69 (806172e) into master (ac0dacd) will decrease coverage by 1.36%. The diff coverage is 1.77%.

:exclamation: Current head 806172e differs from pull request most recent head 357143f. Consider uploading reports for the commit 357143f to get more accurate results

@@            Coverage Diff             @@
##           master      #69      +/-   ##
==========================================
- Coverage   30.07%   28.72%   -1.36%     
==========================================
  Files         158      159       +1     
  Lines        5911     6187     +276     
==========================================
- Hits         1778     1777       -1     
- Misses       4133     4410     +277     

Files Changed     Coverage Δ
R/ddf.R           94.28% <ø> (ø)
R/mcds_tools.R    0.00% <0.00%> (ø)
R/ddf.ds.R        76.92% <38.46%> (-4.44%) :arrow_down:

... and 3 files with indirect coverage changes

LHMarshall commented 1 year ago

Checking whether the proposed workflow will be in line with CRAN policies

Relevant points

Other people doing similar things

https://www.reddit.com/r/rstats/comments/aqjtnx/distributing_a_compiled_executable_with_an_r/

https://stackoverflow.com/questions/8977346/distributing-a-compiled-executable-with-an-r-package

LHMarshall commented 1 year ago

Testing to do

image image

Also tried running from mrds folder when based in that folder

image

Oh well, that is interesting! The file downloaded via the R function is not the same size as the one in the Distance installation... yup, when I copy the MCDS.exe from the Distance installation into mrds I can run MCDS from the mrds folder.
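A size mismatch like this can be checked directly. The sketch below compares two copies of a file by size and MD5 checksum using only base R and the tools package; `same_binary` is a hypothetical helper, and the two temporary files merely stand in for the downloaded and installed copies of MCDS.exe. One possible cause worth ruling out, not confirmed in this thread, is a download made in text mode (`download.file()` defaults to `mode = "w"`; `mode = "wb"` requests a binary transfer).

```r
# Hypothetical helper: compare two copies of a file by size and MD5 checksum.
same_binary <- function(path_a, path_b) {
  stopifnot(file.exists(path_a), file.exists(path_b))
  sizes <- file.size(c(path_a, path_b))
  sums <- unname(tools::md5sum(c(path_a, path_b)))
  list(
    sizes = sizes,
    same_size = sizes[1] == sizes[2],
    same_md5 = sums[1] == sums[2]
  )
}

# Two throwaway files standing in for the two MCDS.exe copies.
a <- tempfile(fileext = ".bin")
b <- tempfile(fileext = ".bin")
writeBin(as.raw(c(1, 2, 13, 10, 3)), a)  # five bytes, including a CR LF pair
writeBin(as.raw(c(1, 2, 10, 3)), b)      # four bytes, CR missing

res <- same_binary(a, b)
res_same <- same_binary(a, a)
```

Running the same check on the two real executables would show whether they differ only in size or in content as well.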

image

WHOOPPEEEE!!!!

image

erex commented 1 year ago

@LHMarshall I see you are leaving comments in the pull (#69), while I'm leaving comments in the issue Distance #153. Where would you prefer information be stored?

LHMarshall commented 1 year ago

@erex this branch should work on windows now

LHMarshall commented 1 year ago

To do list:

Startup message when MCDS.exe not present

image

Startup message when MCDS.exe is present

image

Code to download added to mcds-dot-exe help which is linked from ddf help (feel like there should be a link from elsewhere but not sure where)

image image

Output detailing optimisation

image

erex commented 1 year ago

Tried to run amakihi in the three variants of fitting (default, R only, FORTRAN only). Could not run covariate models in FORTRAN; the errors are in the Quarto output below.

mcds-exe-amakihi.pdf

Error appears to trace to here. Apparently covariate names in the data frame are converted to lower case, which does not match the covariate names in the amakihi data set.

Correction: it is not because of lower case, but rather because there are fields in the data named "OBs" as well as "observer", and the grep command returns both indexes when only one is expected.
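The double match can be reproduced in a few lines. This is a minimal sketch, assuming the lookup is effectively case-insensitive (for example because names are lowered first), which the thread does not confirm; only "OBs" and "observer" come from the report, and the remaining field name is made up.

```r
# Hypothetical field names; only "OBs" and "observer" are from the report above.
field_names <- c("distance", "OBs", "observer")
lower_names <- tolower(field_names)

# Pattern matching finds every name that contains "obs".
partial_hits <- grep("obs", lower_names)

# Exact comparison, or an anchored pattern, returns a single index.
exact_hit <- which(lower_names == "obs")
anchored_hit <- grep("^obs$", lower_names)
```

Here `partial_hits` holds two indexes while `exact_hit` and `anchored_hit` each hold one, so either of the latter forms would avoid the ambiguity.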

I've not traced the reasons for the other warning messages.

LHMarshall commented 1 year ago

@erex @lenthomas

Current state of play is that Eric's .Rmd file is now almost running. A remaining issue is that Distance tries to get MCDS.exe to fit 5 adjustment terms when it can only fit 4, so I have set max_adjustments to 4 for the moment. I'm still getting some warnings that need investigating, however. I'm also confused as to what status 1 is. Is that OK? It's generating a warning.

> amak.hn.R <- ds(amakihi, transect="point", key="hn", convert_units = conv, 
+               truncation=82.5, skip_mcds=TRUE, skip_R=FALSE, max_adjustments = 4)
Starting AIC adjustment term selection.
Fitting half-normal key function
AIC= 10833.841
Fitting half-normal key function with cosine(2) adjustments
Warning: 'length(x) = 1764 > 1' in coercion to 'logical(1)'
AIC= 10820.154
Fitting half-normal key function with cosine(2,3) adjustments
Warning: 'length(x) = 1849 > 1' in coercion to 'logical(1)'
AIC= 10809.39
Fitting half-normal key function with cosine(2,3,4) adjustments
Warning: 'length(x) = 1936 > 1' in coercion to 'logical(1)'
AIC= 10799.122
Fitting half-normal key function with cosine(2,3,4,5) adjustments
Warning: 'length(x) = 2025 > 1' in coercion to 'logical(1)'
Warning: Detection function is not strictly monotonic!
AIC= 10799.105
Warning: Detection function is not strictly monotonic!
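The "coercion to 'logical(1)'" warnings are what R 4.2 emits when `&&` or `||` receives an operand with more than one element (from R 4.3 this is an error). Which call produces them here is not identified in the thread, so the sketch below only illustrates the usual remedy with a made-up matrix: reduce the object to a single TRUE/FALSE before it reaches the scalar operator.

```r
# Hypothetical stand-in for whatever vector or matrix reaches the scalar test.
m <- matrix(c(0.5, NA, 1.2, 3.1), nrow = 2)

# Risky: `is.na(m) && TRUE` would hand a length-4 logical to `&&`.
# Safer: collapse to one TRUE/FALSE first.
has_missing <- anyNA(m)
all_positive <- isTRUE(all(m > 0, na.rm = TRUE))

if (has_missing && all_positive) {
  status <- "positive values, some missing"
} else {
  status <- "other"
}
```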

> amak.hr.obs.F <- ds(amakihi, transect="point", key="hr", formula=~OBs, convert_units = conv, 
+                   truncation=82.5, skip_mcds=FALSE, skip_R=TRUE)
Model contains covariate term(s): no adjustment terms will be included.
Fitting hazard-rate key function
Warning: running command 'C:/Users/lhm/AppData/Local/R/win-library/4.2/mrds/MCDS.exe 0, C:\Users\lhm\AppData\Local\Temp\Rtmp2nCjvl\cmdtmp42745c7f51d4.txt' had status 1
AIC= 10932.267
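The "had status 1" warning only reports that the command returned a non-zero exit status; what each value means is defined by MCDS.exe and is not settled in this thread. A minimal sketch of capturing the status so it can be interpreted explicitly instead of surfacing as a warning is given below. `run_quietly` is a hypothetical helper, and Rscript is used as a stand-in command so the example runs without MCDS.exe.

```r
# Hypothetical helper: run a command, discard its output, return its exit status.
run_quietly <- function(command, args = character()) {
  suppressWarnings(system2(command, args, stdout = FALSE, stderr = FALSE))
}

# Stand-in for MCDS.exe: an R process that exits with a chosen status.
rscript <- file.path(R.home("bin"), "Rscript")
exit_status <- run_quietly(
  rscript,
  c("--vanilla", "-e", shQuote("quit(status=3)"))
)
```

The calling code can then decide which status values count as success once they have been checked against the MCDS engine documentation.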

Also concerning is the fact that MCDS.exe seems to do better on the hn model but is not being selected.

image
> amak.hn <- ds(amakihi, transect="point", key="hn", convert_units = conv, truncation=82.5, max_adjustments = 4)
Starting AIC adjustment term selection.
Fitting half-normal key function
AIC= 10833.841
Fitting half-normal key function with cosine(2) adjustments
AIC= 10820.154
Fitting half-normal key function with cosine(2,3) adjustments
AIC= 10809.39
Fitting half-normal key function with cosine(2,3,4) adjustments
AIC= 10799.122
Fitting half-normal key function with cosine(2,3,4,5) adjustments
Warning in check.mono(result, n.pts = control$mono.points) :
  Detection function is not strictly monotonic!
Warning in check.mono(result, n.pts = control$mono.points) :
  Detection function is not strictly monotonic!
AIC= 10799.105
Warning in mrds::check.mono(model, n.pts = 20) :
  Detection function is not strictly monotonic!
> summary(amak.hn)

Summary for distance analysis 
Number of observations :  1243 
Distance range         :  0  -  82.5 

Model : Half-normal key function with cosine adjustment terms of order 2,3,4,5 

Strict monotonicity constraints were enforced.
AIC         :  10799.11 
Optimisation:  mrds (nlminb) 
> summary(amak.hn.R)

Summary for distance analysis 
Number of observations :  1243 
Distance range         :  0  -  82.5 

Model : Half-normal key function with cosine adjustment terms of order 2,3,4,5 

Strict monotonicity constraints were enforced.
AIC         :  10799.11 
Optimisation:  mrds (nlminb) 

Detection function parameters
Scale coefficient(s):  
            estimate         se
(Intercept) 3.566966 0.02196632

Adjustment term coefficient(s):  
                estimate         se
cos, order 2  0.22327721 0.05009273
cos, order 3 -0.15614305 0.04239069
cos, order 4  0.13641079 0.04225620
cos, order 5 -0.05127991 0.04008150

> summary(amak.hn.F)

Summary for distance analysis 
Number of observations :  1243 
Distance range         :  0  -  82.5 

Model : Half-normal key function with cosine adjustment terms of order 2,3,4,5 

Strict monotonicity constraints were enforced.
AIC         :  10799.02 
Optimisation:  MCDS.exe 

Detection function parameters
Scale coefficient(s):  
            estimate         se
(Intercept) 3.568482 0.02202903

Adjustment term coefficient(s):  
               estimate         se
cos, order 2  0.2276751 0.05006698
cos, order 3 -0.1638496 0.04219298
cos, order 4  0.1433463 0.04207448
cos, order 5 -0.0575266 0.03994616

It looks like ddf.ds always ends up at line 228. The loop instigated inside run_MCDS calls ddf again with skipMCDS switched on (to avoid an infinite loop), but that sends execution to line 228 and means that MCDS can never be selected.

image

LHMarshall commented 1 year ago

@erex @lenthomas the lnl values being compared had opposite signs (one was +ve and the other -ve), so MCDS.exe was never being chosen... turns out the complicated reiteration stuff seems to be working. We now have a model where the MCDS.exe results were selected
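The effect of a sign mismatch can be seen with the AIC values printed in the summaries above. The sketch below recovers log-likelihoods assuming AIC = -2 * lnl + 2 * k with k = 5 (one scale parameter plus four adjustment coefficients). Which of the two fits was stored with the flipped sign is not stated in the thread, so the `nll_r` line is purely illustrative.

```r
# Log-likelihoods recovered from the reported AIC values, assuming k = 5.
k <- 5
lnl_r <- -(10799.11 - 2 * k) / 2     # mrds (nlminb) fit
lnl_mcds <- -(10799.02 - 2 * k) / 2  # MCDS.exe fit

# Hypothetical mix-up: the R fit is held as a negative log-likelihood,
# so a positive number is compared with a negative one.
nll_r <- -lnl_r
pick_mixed <- if (lnl_mcds > nll_r) "MCDS" else "R"

# Same convention on both sides: the larger log-likelihood wins.
pick_fixed <- if (lnl_mcds > lnl_r) "MCDS" else "R"
```

With mixed signs the comparison can never favour MCDS.exe, whatever the quality of its fit; with a shared convention the slightly better MCDS.exe fit is picked.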

image

Although I'm not sure the comment about strict monotonicity constraints is accurate?

image

wrt the number of adjustment terms, I'm going to put in a restriction whereby a user fitting with MCDS will be limited to a maximum of 4 adjustments.

lenthomas commented 1 year ago

wrt adjustments, the maximum is not 4; rather, the maximum number of terms in the detection function (excluding covariate terms) is 5. So when you use a half-normal, which has one key parameter, that's a max of 4 adjustments. Here's the relevant page from the MCDS Engine help file: Screenshot 2023-03-21 060311

I guess we have two alternatives for default behaviour:

My feeling is that 5 adjustment terms is far too many, and the only reason they are there is a misreading of what mcds.exe does; it probably should have been a max of 5 parameters in the first place. @erex do you have an opinion?

When the user explicitly requests lots of parameters we should allow it however.

lenthomas commented 1 year ago

One thought as an update: since we are not doing any model selection in mcds.exe, putting MAXTERMS = 7 into the mcds.exe command language should fix the problem, as ds will never request more than 7 parameters (hazard rate plus 5 adjustments). (I do have my doubts that would actually converge, however, but at least it's a temporary fix to the problem.)
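The arithmetic behind the two limits can be written out as a short sketch. The key-function parameter counts are the standard ones (uniform 0, half-normal 1, hazard-rate 2); the function names and the `maxterms` argument are hypothetical and not part of mrds or Distance.

```r
# Number of key-function parameters, excluding covariate terms.
key_params <- c(unif = 0, hn = 1, hr = 2)

# Hypothetical helper: largest number of adjustments that fits under a term cap.
max_adjustments <- function(key, maxterms = 5) {
  max(maxterms - key_params[[key]], 0)
}

# Hypothetical helper: total terms requested for a key plus nadj adjustments.
terms_needed <- function(key, nadj) {
  key_params[[key]] + nadj
}

hn_cap <- max_adjustments("hn")   # half-normal under a cap of 5 terms
hr_cap <- max_adjustments("hr")   # hazard-rate under a cap of 5 terms
hr_worst <- terms_needed("hr", 5) # hazard-rate with 5 adjustments
```

Under a cap of 5 terms this gives 4 adjustments for the half-normal and 3 for the hazard-rate, while a hazard-rate with 5 adjustments needs 7 terms, which is where MAXTERMS = 7 comes from.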

erex commented 1 year ago

No opinion about maximum number of adjustments. Allowing lots of adjustments is asking for

lenthomas commented 1 year ago

Yes, I agree. Perhaps in the short term, set MAXTERMS = 7 in the mcds.exe command file, but longer term we should look at adjusting the behaviour of ds. I can add the latter as an issue if you like @LHMarshall ?

LHMarshall commented 1 year ago

testthat checks rather noisy

mrds

image

Distance (after merging uniform branch)

image

Looks like there has been a change in dplyr and I need to update summarise to reframe (but it doesn't do quite the same thing...)
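For reference, a minimal sketch of the difference, assuming dplyr 1.1.0 or later (where `reframe()` was introduced); the data frame is made up. `summarise()` expects one row per group and by default drops only the last grouping level, whereas `reframe()` accepts any number of rows per group and always returns an ungrouped result.

```r
library(dplyr)

# Made-up data for illustration only.
df <- data.frame(
  g = c("a", "a", "b"),
  h = c("x", "y", "x"),
  v = c(1, 2, 3)
)

# One row per group: summarise() fits, and the result stays grouped by g.
s <- df %>%
  group_by(g, h) %>%
  summarise(total = sum(v), .groups = "drop_last")

# Two rows per group (min and max): reframe() handles this, result is ungrouped.
r <- df %>%
  group_by(g) %>%
  reframe(limit = range(v))
```

Code that relied on the grouping left behind by `summarise()` therefore needs an explicit `group_by()` after switching to `reframe()`.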

erex commented 1 year ago

Trying to run my "test cases (amakihi, ETP dolphins and duikers)" with the most recent (as of 03Apr23) branches of Distance and mrds: amakihi runs to completion, but ETP dolphins and duikers still fall over when mcds is explicitly called.

MRE below: it appears that fitting with cue type fails for MCDS, so the R result is returned

library(Distance)
data(ETP_Dolphin)
bino <- subset(ETP_Dolphin, Search.method<3)
etp.hr.cue <- ds(bino, key="hr", formula=~factor(Cue.type))
Model contains covariate term(s): no adjustment terms will be included.
Fitting hazard-rate key function
Error in setinitial.ds(ddfobj, width = meta.data$width, initial, point,  : 
  Length of initial values for scale incorrect
In addition: Warning message:
In system(paste0(path.to.MCDS.dot.exe, " 0, ", test.file$command.file.name),  :
  running command 'C:/Users/erexs/Documents/R/win-library/4.1/mrds/MCDS.exe 0, C:\Users\erexs\AppData\Local\Temp\RtmpIZrKXo\cmdtmp27f07427695b.txt' had status 2
AIC= 2778.668

etp.hr.cue <- ds(bino, key="hr", formula=~factor(Cue.type), optimizer="MCDS")
Model contains covariate term(s): no adjustment terms will be included.
Fitting hazard-rate key function
Error in setinitial.ds(ddfobj, width = meta.data$width, initial, point,  : 
  Length of initial values for scale incorrect
In addition: Warning message:
In system(paste0(path.to.MCDS.dot.exe, " 0, ", test.file$command.file.name),  :
  running command 'C:/Users/erexs/Documents/R/win-library/4.1/mrds/MCDS.exe 0, C:\Users\erexs\AppData\Local\Temp\RtmpIZrKXo\cmdtmp27f0178177c3.txt' had status 2
Error in if (ddfobj$type == "hr" && lt$par[1] < sqrt(.Machine$double.eps)) { : 
  missing value where TRUE/FALSE needed
All models failed to fit!

Error in ds(bino, key = "hr", formula = ~factor(Cue.type), optimizer = "MCDS") : 
  No models could be fitted.

Same behaviour with ETP_Dolphin when using factor(Month) as a covariate. However, using factor(Beauf.class) causes no problems.


duikers

Duiker data (provided with the Distance package) fits successfully when a half-normal with 2 adjustment terms is fitted using the default optimizer, but fails when MCDS is called explicitly as the optimizer.

data("DuikerCameraTraps")
conversion <- convert_units("meter", NULL, "square kilometer")
trunc.list <- list(left=2, right=15)
mybreaks <- c(seq(2,8,1), 10, 12, 15)
hn2 <- ds(DuikerCameraTraps, transect = "point", key="hn", adjustment = "herm",
          nadj=2,
          cutpoints = mybreaks, truncation = trunc.list, convert_units = conversion)
Fitting half-normal key function with Hermite(4,6) adjustments
AIC= 24950.977   ## no problem
hn2.fort <- ds(DuikerCameraTraps, transect = "point", key="hn", adjustment = "herm",
          nadj=2, optimizer = "MCDS",
          cutpoints = mybreaks, truncation = trunc.list, convert_units = conversion)
Fitting half-normal key function with Hermite(4,6) adjustments
AIC= 25014.184
Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x),  : 
  'data' must be of a vector type, was 'NULL'
Error in t(partial) %*% vcov : 
  requires numeric/complex matrix/vector arguments

Note, however, that the failure presumably comes after model fitting is complete (because the AIC score is printed).