Closed (dill closed 8 years ago)
I'll take the blame/flame for the first item in this issue: it was a simulated dataset, and I cared very little about unit consistency when building it.
The Montrave line transect data (Saturday adventure) was collected by Prof Buckland and follows a much more sane units convention: study area size in ha, transect lengths in km and perpendicular distances in m. There is the wee curve ball of two transits of the transects (so there's a multiplier).
858cbac includes support for unit conversion and repeat visits (though see also #18).
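To make the units-plus-multiplier arithmetic concrete, here is a minimal sketch, not the code in 858cbac. The function name `line_transect_density` and all of its arguments are hypothetical, the numbers in the usage line are made up (they are not the Montrave data), and it assumes repeat visits are handled by multiplying the line length by the number of visits.

```r
# Hypothetical helper (not part of readdst or mrds): density per hectare
# from a line transect survey, doing the arithmetic in metres.
line_transect_density <- function(n, p_a, width_m, length_km, visits = 1) {
  # total effort in metres: each transect counts once per visit
  effort_m <- length_km * 1000 * visits
  # covered area in m^2: a strip of half-width width_m on both sides of the line
  covered_m2 <- 2 * width_m * effort_m
  # animals per m^2, corrected for average detection probability p_a
  d_per_m2 <- n / (p_a * covered_m2)
  # convert to animals per hectare (1 ha = 10,000 m^2)
  d_per_m2 * 10000
}

# Made-up example: 60 detections, P_a = 0.5, 100 m truncation,
# 3 km of transect walked twice
line_transect_density(n = 60, p_a = 0.5, width_m = 100, length_km = 3, visits = 2)
```

Leaving `visits` at 1 for the same inputs doubles the estimated density, which is the kind of discrepancy the multiplier is there to remove.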
The results are still not perfect, but they are significantly closer than they were. There may be issues getting uncertainty estimates to agree, as I think the methods are different, but this may also be caused by the degrees of freedom (I think these will currently be incorrect when there are repeats in the data).
For example for the whale simulations:
cc <- convert_project("inst/CovarWhaleSim-solutions/CovarWhaleSim-solutions")
test_stats(cc[[2]])
Statistic Distance_value mrds_value Rel_diff Pass
1 n 60.0000000 60.0000000 0.000000000 ✓
2 parameters 1.0000000 1.0000000 0.000000000 ✓
3 AIC 123.2824020 123.2821247 0.000000000 ✓
4 Chi^2 p 0.7460822 0.8151965 0.092636280
5 P_a 0.4956547 0.4956543 0.000000000 ✓
6 CV(P_a) 0.0938000 0.0937935 0.000000000 ✓
7 log-likelihood -60.6412010 -60.6410623 0.000000000 ✓
8 K-S p 0.7095534 0.7095513 0.000000000 ✓
9 C-vM p 0.8000000 0.7631826 0.046021780
10 density 0.0346334 0.0346334 0.000000000 ✓
11 CV(density) 0.1400000 0.1399674 0.000232937
12 density lcl 0.0259316 0.0259316 0.000000000 ✓
13 density ucl 0.0462554 0.0462554 0.000000000 ✓
14 density df 21.3759995 21.3758021 0.000000000 ✓
15 individuals 346.0000000 346.3343596 0.000000000 ✓
16 CV(individuals) 0.1400000 0.1399674 0.000232937
17 individuals lcl 259.0000000 259.3158988 0.000000000 ✓
18 individuals ucl 463.0000000 462.5535465 0.000000000 ✓
19 individuals df 21.3759995 21.3758021 0.000000000 ✓
(though there are still issues when covariates are included).
This is by no means done, but estimates are much much closer now.
Looking good. The CvM P-value will always disagree because DisWin simply does a table lookup, so P-values are only recorded to the nearest 0.1. I can't sort out how tolerance is calculated
tolerance = as.numeric(x[5]))
but I think it could be relaxed to the point where a P-value of 0.13996 is considered equivalent to 0.14000.
Tolerances are stored in stats_table, which I have changed to be 1e-1, and I have added an "Additional notes" section to the documentation for this function to hold such facts (see b4e8dba).
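For anyone following along, this is a rough sketch of what a tolerance check of this kind could look like in base R. It is an assumption that the comparison behaves like `all.equal()`; the helpers `within_tolerance` and `rel_diff` are hypothetical names and are not taken from the readdst source.

```r
# Hypothetical helpers, not the readdst implementation.

# TRUE when the two values agree to within `tolerance` in the sense of
# base R's all.equal() (relative difference for values of this size)
within_tolerance <- function(distance_value, mrds_value, tolerance) {
  isTRUE(all.equal(distance_value, mrds_value, tolerance = tolerance))
}

# relative difference, scaled by the value reported by Distance
rel_diff <- function(distance_value, mrds_value) {
  abs(distance_value - mrds_value) / abs(distance_value)
}

# CV reported as 0.14 by Distance vs 0.1399674 from mrds
within_tolerance(0.14, 0.1399674, tolerance = 1e-1)  # TRUE
within_tolerance(0.14, 0.1399674, tolerance = 1e-6)  # FALSE

# C-vM p: Distance only records the nearest 0.1, so the two values
# differ by roughly 0.046 in relative terms
rel_diff(0.8, 0.7631826)
# rounding the mrds value to one decimal place recovers the lookup value
isTRUE(all.equal(round(0.7631826, 1), 0.8))
```

With a tolerance of 1e-1, both the rounded CVs and the table-lookup C-vM P-value above would count as matching.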
Do you know of any more such "facts" that might be useful?
Quantities stored in Distance can be in different (non-SI) units. For example for one project:
Indicating lines were measured in miles, perpendicular distances in kilometres and the region in square miles (:fire::fire::computer::fire::fire:).
readdst should deal with this and be able to calculate abundances and densities appropriately. The ProjectSettingsNumber table in the DistIni.mdb file has conversions. For example for linear units:
and the following from the developer manual seems useful:
So during the convert_project stage, the units should be switched to SI.
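A minimal sketch of what switching to SI could look like, assuming a simple lookup of conversion factors. The tables `to_metres` and `to_sq_metres` and the function `convert_to_si` are hypothetical; the factors are the standard definitions of each unit and are not read from DistIni.mdb.

```r
# Hypothetical lookup tables (not taken from DistIni.mdb): multiply a value
# in the named unit by the factor to get metres or square metres.
to_metres <- c(metre = 1, kilometre = 1000, mile = 1609.344,
               nautical_mile = 1852)
to_sq_metres <- c(square_metre = 1, hectare = 1e4, square_kilometre = 1e6,
                  square_mile = 1609.344^2)

# Hypothetical helper: convert x from `unit` to SI using one of the tables
convert_to_si <- function(x, unit, table) {
  if (!unit %in% names(table)) {
    stop("unknown unit: ", unit)
  }
  x * table[[unit]]
}

# e.g. line lengths in miles, distances in kilometres, region in square miles
convert_to_si(2, "mile", to_metres)
convert_to_si(0.5, "kilometre", to_metres)
convert_to_si(10, "square_mile", to_sq_metres)
```

Doing the conversion once, up front, would mean the later abundance and density calculations never need to know what units the project was recorded in.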