DistanceDevelopment / MCDS_mrds_compare


The most well behaved of our data, `ducknest`, misbehaves with the most simple model #3

Open erex opened 1 year ago

erex commented 1 year ago

I noticed a discrepancy in log-likelihoods (mrds vs MCDS) for a half-normal key with no adjustments fitted to the ducknest data, with MCDS producing an unreasonable result. I ran the model in question on this well-behaved data set with the debug argument turned on, with these results:

> duckhn0mcds <- ds(ducknest, key="hn", adjustment = "cos", nadj = 0, optimizer = "MCDS", debug_level = 3)
Fitting half-normal key function
DEBUG: initial values = -0.1866108 
Running MCDS.exe...
Command file written to C:\Users\erexs\AppData\Local\Temp\RtmpCuKGTy\cmdtmp1f9448d5673b.txt
Stats file written to C:\Users\erexs\AppData\Local\Temp\RtmpCuKGTy\stat1f944220499a.txt
DEBUG: initial values = 4.78516 
par =  4.78516 
nll =  467.4962 
par =  4.78516 
nll =  467.4962 
par =  4.78516 
nll =  467.4962 
par =  5.263676 
nll =  467.4987 
par =  4.306644 
nll =  467.4897 
par =  5.024418 
nll =  467.4978 
par =  4.545902 
nll =  467.4938 
par =  4.904789 
nll =  467.4971 
par =  4.665531 
nll =  467.4951 
par =  4.844974 
nll =  467.4967 
par =  4.725345 
nll =  467.4957 

DEBUG: Convergence! 
       Iteration  0.0 
       Converge   = 0 
       nll        = 467.4962 
       parameters = 4.7851599 
MCDS.exe log likehood: -467.4962
MCDS.exe pars: 119.7205
mrds refitted log likehood: -467.4962451
mrds refitted pars: 4.7851599

Convergence was presumably achieved, but I don't understand the change in the initial value of sigma from -0.187 to 4.785. From that (erroneous) initial value, the MCDS optimiser gets stuck around a local minimum and produces an absurd estimate of sigma, 4.785 on the log scale, resulting in P_a = 0.9999999.
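To make the scale of the problem concrete, here is a minimal base R sketch of what a log-scale sigma of 4.785 implies for a half-normal key with no adjustments and truncation at 2.4. The helper `hn_average_p` is a hypothetical name introduced for illustration, not an mrds or Distance function, and the closed form assumes the standard half-normal detection function exp(-y^2 / (2 sigma^2)).

```r
# Average detection probability for a half-normal key (no adjustments),
# truncated at `width`:
#   P_a = (1 / width) * integral from 0 to width of exp(-y^2 / (2 sigma^2)) dy
# hn_average_p is a hypothetical helper, not part of mrds or Distance.
hn_average_p <- function(log_sigma, width) {
  sigma <- exp(log_sigma)
  # Closed form of the integral via the normal CDF
  sigma * sqrt(2 * pi) * (pnorm(width / sigma) - 0.5) / width
}

# Values taken from the debug output above
sigma_mcds <- exp(4.78516)               # back-transform to the natural scale
p_mcds     <- hn_average_p(4.78516, 2.4)

print(sigma_mcds)  # compare with "MCDS.exe pars" in the log
print(p_mcds)      # very close to 1: an essentially flat detection function
```

A sigma roughly fifty times the truncation width means the fitted curve barely declines over the observed distance range, which is why P_a ends up so close to 1.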

Why should MCDS go so badly wrong on a simple model with lovely data?

As a comparison, I checked what happens when a single adjustment term for the same key is fitted to the same data

> duckhn1mcds <- ds(ducknest, key="hn", adjustment = "cos", nadj = 1, optimizer = "MCDS", debug_level = 3)
Fitting half-normal key function with cosine(2) adjustments
DEBUG: initial values = -0.1866108 0 
Running MCDS.exe...
Command file written to C:\Users\erexs\AppData\Local\Temp\RtmpCuKGTy\cmdtmp1f94c0c32e2.txt
Stats file written to C:\Users\erexs\AppData\Local\Temp\RtmpCuKGTy\stat1f9421d679e6.txt
DEBUG: initial values = 0.9377312 -0.0219512

For this model the initial values are not changed nearly as drastically, and the resulting parameter estimates are reasonable.

lenthomas commented 1 year ago

I think that initial values are not passed to 'mcds.exe' at present, so far as I can tell from the command file produced when I run the above code (note that command files are not currently being deleted after runs):

C:\Users\lt5\AppData\Local\Temp\Rtmp4QzAGk\out61f81d82450d.txt 
C:\Users\lt5\AppData\Local\Temp\Rtmp4QzAGk\log61f869bc2d47.txt 
C:\Users\lt5\AppData\Local\Temp\Rtmp4QzAGk\stat61f81172b82.txt 
C:\Users\lt5\AppData\Local\Temp\Rtmp4QzAGk\plot61f84e7e2bc9.txt 
None 
None 
OPTIONS; 
DISTANCE=PERP /UNITS='Meters' /WIDTH=2.4; 
TYPE=LINE; 
OBJECT=SINGLE; 
PRINT=ALL; 
SELECTION=SPECIFY; 
END; 
DATA /STRUCTURE=FLAT; 
FIELDS=SMP_LABEL,SMP_EFFORT,DISTANCE,STR_LABEL; 
INFILE=C:\Users\lt5\AppData\Local\Temp\Rtmp4QzAGk\data61f81d0e6ea5.txt /NOECHO; 
END; 
ESTIMATE; 
DETECTION ALL; 
ESTIMATOR /KEY=HNORMAL; 
MONOTONE=NONE; 
DISTANCE /WIDTH=2.4 /LEFT=0; 
END;

Also, when I look at the mcds_tools.R code in the mcds-dot-exe branch of mrds, there are lots of commented-out lines around the initial values part of the code, so I suspect it's temporarily disabled.

I don't know why mcds.exe goes so badly wrong when you don't pass in start values in this case however.

Anyway, I'll leave @LHMarshall to confirm/deny!

erex commented 1 year ago

Running the ducknest R dataset (with exact distances, data set that generated this issue) through DistWin, requesting HN key with 0 cos adjustments:

 Distance;                                                                     
 Density=All;                                                                  
 Encounter=All;                                                                
 Detection=All;                                                                
 Size=All;                                                                     
 Estimator /Key=HN /Adjust=CO /NAP=0 /Chat=1;                                  
 Monotone=Strict;  

produces expected behaviour, with a reasonable estimate of sigma

 Effort        :    2575.000    
 # samples     :    20
 Width         :    2.400000    
 # observations:   534

 Model
    Half-normal key, k(y) = Exp(-y**2/(2*A(1)**2))
       A( 1) bounds = (0.24000E-01 , 0.10000E+07 )
       Results:
       Convergence was achieved with   28 function evaluations.
       Final Ln(likelihood) value =  -463.06838    
       Akaike information criterion =   928.13678    
       Bayesian information criterion =   932.41718    
       AICc =   928.14429    
       QAIC =   930.13678     using c-hat   1.0000000    
       Final parameter values:   2.5418691    

This is a dramatically different result from requesting the same model fit to the same data, but using R to call the (same) MCDS.exe optimiser.
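As a rough way to quantify the difference, the two log-likelihoods quoted in this thread can be put on the AIC scale. This is a minimal base R sketch assuming AIC = 2k - 2 ln(L) with k = 1 (the single scale parameter); the variable names are illustrative only.

```r
# Log-likelihoods quoted in this thread (one estimated parameter each)
loglik_r_mcds  <- -467.4962    # MCDS.exe called from R, as in the debug output
loglik_distwin <- -463.06838   # Distance for Windows, same model and data
k <- 1

# AIC = 2k - 2 * log-likelihood
aic_r_mcds  <- 2 * k - 2 * loglik_r_mcds
aic_distwin <- 2 * k - 2 * loglik_distwin
delta_aic   <- aic_r_mcds - aic_distwin

print(c(aic_r_mcds = aic_r_mcds, aic_distwin = aic_distwin, delta = delta_aic))
```

The Distance for Windows value reproduces the AIC printed in its output to within rounding of the log-likelihood, so the two runs really are being compared on the same scale.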

DistWin project below

duckexact.zip

LHMarshall commented 1 year ago

@erex @lenthomas The discrepancy is down to monotonicity constraints. In the Distance for Windows analysis, monotonicity defaults to strictly decreasing; if you change it to none, you get the odd result shown above.


So I tried setting monotonicity to strict in the Distance (R package) analysis, but Distance decides that, because this is a key-function-only model, it is fine to override that and set monotonicity back to none. Hence the following doesn't work either:

duckhn0mcds <- ds(ducknest, key="hn", 
                  adjustment = "cos", 
                  nadj = 0, 
                  optimizer = "MCDS", 
                  monotonicity = "strict",
                  debug_level = 3)

However, it is Distance that makes that decision, not mrds, so if you bypass Distance and use mrds directly you can replicate the results from Distance for Windows:

duckhn0MCDS <- ddf(dsmodel = ~cds(key = "hn", formula=~1),
                   data = ducknest,
                   meta.data = list(width = 2.4, mono.strict = TRUE),
                   control = list(optimiser = "MCDS"))

summary(duckhn0MCDS)

Summary for ds object
Number of observations :  534 
Distance range         :  0  -  2.4 
AIC                    :  928.1338 
Optimisation           :  mrds (nlminb) 

Detection function:
 Half-normal key function 

Detection function parameters 
Scale coefficient(s): 
             estimate        se
(Intercept) 0.9328967 0.1703933

                       Estimate          SE         CV
Average p             0.8693482  0.03902051 0.04488479
N in covered region 614.2533225 29.19681554 0.04753221
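The mrds summary reports the scale coefficient on the log scale, while Distance for Windows reports A(1) on the natural scale, so the match is easier to see after back-transforming. The following base R sketch uses only numbers quoted in this thread; the variable names are illustrative, and the average p is recomputed by numerical integration assuming the standard half-normal form exp(-y^2 / (2 sigma^2)).

```r
# Values quoted in this thread
intercept_mrds <- 0.9328967   # mrds scale coefficient (log scale)
a1_distwin     <- 2.5418691   # Distance for Windows "Final parameter values"
width          <- 2.4         # truncation distance

# Back-transform the mrds estimate to the natural scale
sigma_mrds <- exp(intercept_mrds)

# Average detection probability by numerical integration of the half-normal key
hn_key <- function(y) exp(-y^2 / (2 * sigma_mrds^2))
p_avg  <- integrate(hn_key, lower = 0, upper = width)$value / width

print(c(sigma_mrds = sigma_mrds, a1_distwin = a1_distwin, p_avg = p_avg))
```

The back-transformed sigma agrees with A(1), and the recomputed average p agrees with the "Average p" row of the summary, to roughly three decimal places.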

Thoughts on what should be done about this?

lenthomas commented 1 year ago

This is indeed strange. I don't think it needs to be addressed in the upcoming release, because the Distance fit will get selected. But it is strange that mcds.exe fits such a bad function here with monotonicity off.

LHMarshall commented 1 year ago

@erex Looking at the project you just sent me jogged my memory: this anomaly had already been investigated, as detailed above, and the decision was not to do anything for this upcoming release.

erex commented 1 year ago

OK Laura. Thanks for sending me to this issue. Should this issue be kept for future reference?

LHMarshall commented 1 year ago

I have left it open here. However, I am unsure where the issue is to be resolved, so I'm not sure whether we should open another bug report somewhere else.