BiologicalRecordsCentre / BRCindicators

An R package for creating indicators from trends data
4 stars 11 forks source link

MSI subscript out of bounds error #27

Closed larspett closed 5 years ago

larspett commented 5 years ago

When trying out BRCindicators today, I ran into trouble with the multispecies indicators. Using the vignett information and generating the MSI data tutorial file, I recieve the error message:

Error in [<-(*tmp*, o, s, value = rnorm(1, LNindex[o, 1], LNse[o, : subscript out of bounds In addition: Warning message: In rnorm(1, LNindex[o, 1], LNse[o, 1]) : NAs produced

The same response is received when using real indicator data (the Swedish Grassland Butterfly Indicator 2010-2017).

Sys.info() sysname release version "Windows" "7 x64" "build 7601, Service Pack 1" nodename machine login "SPEEDBURK" "x86-64" "Lars" user effective_user "Lars" "Lars"

session_info()

[1] C:/Users/Lars/Documents/R/win-library/3.5 [2] C:/Program Files/R/R-3.5.1/library

larspett commented 5 years ago

Tried it on my mac now with identical result /Lars

msi_out <- msi(data, plot =FALSE) Error in [<-(*tmp*, o, s, value = rnorm(1, LNindex[o, 1], LNse[o, : subscript out of bounds In addition: Warning message: In rnorm(1, LNindex[o, 1], LNse[o, 1]) : NAs produced

─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────────── setting value
version R version 3.5.1 (2018-07-02) os macOS 10.14.3
system x86_64, darwin15.6.0
ui RStudio
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz Europe/Stockholm
date 2019-03-03

─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────────── package version date lib source
abind 1.4-5 2016-07-21 [1] CRAN (R 3.5.0)
assertthat 0.2.0 2017-04-11 [1] CRAN (R 3.5.0)
backports 1.1.3 2018-12-14 [1] CRAN (R 3.5.0)
boot 1.3-20 2017-08-06 [1] CRAN (R 3.5.1)
BRCindicators
1.2 2019-03-03 [1] Github (biologicalrecordscentre/BRCindicators@2aae6cc) callr 3.1.1 2018-12-21 [1] CRAN (R 3.5.0)
car 3.0-2 2018-08-23 [1] CRAN (R 3.5.0)
carData 3.0-2 2018-09-30 [1] CRAN (R 3.5.0)
cellranger 1.1.0 2016-07-27 [1] CRAN (R 3.5.0)
cli 1.0.1 2018-09-25 [1] CRAN (R 3.5.0)
coda 0.19-2 2018-10-08 [1] CRAN (R 3.5.0)
colorspace 1.4-0 2019-01-13 [1] CRAN (R 3.5.2)
crayon 1.3.4 2017-09-16 [1] CRAN (R 3.5.0)
curl 3.3 2019-01-10 [1] CRAN (R 3.5.2)
data.table 1.12.0 2019-01-13 [1] CRAN (R 3.5.2)
desc 1.2.0 2018-05-01 [1] CRAN (R 3.5.0)
devtools 2.0.1 2018-10-26 [1] CRAN (R 3.5.1)
digest 0.6.18 2018-10-10 [1] CRAN (R 3.5.0)
dplyr 0.8.0.1 2019-02-15 [1] CRAN (R 3.5.2)
forcats 0.4.0 2019-02-17 [1] CRAN (R 3.5.2)
foreign 0.8-71 2018-07-20 [1] CRAN (R 3.5.0)
fs 1.2.6 2018-08-23 [1] CRAN (R 3.5.0)
ggplot2 3.1.0 2018-10-25 [1] CRAN (R 3.5.0)
glue 1.3.0 2018-07-17 [1] CRAN (R 3.5.0)
gtable 0.2.0 2016-02-26 [1] CRAN (R 3.5.0)
haven 2.1.0 2019-02-19 [1] CRAN (R 3.5.2)
hms 0.4.2 2018-03-10 [1] CRAN (R 3.5.0)
jagsUI 1.5.0 2018-09-12 [1] CRAN (R 3.5.0)
lattice 0.20-38 2018-11-04 [1] CRAN (R 3.5.0)
lazyeval 0.2.1 2017-10-29 [1] CRAN (R 3.5.0)
magrittr 1.5 2014-11-22 [1] CRAN (R 3.5.0)
Matrix 1.2-15 2018-11-01 [1] CRAN (R 3.5.0)
memoise 1.1.0 2017-04-21 [1] CRAN (R 3.5.0)
mgcv 1.8-27 2019-02-06 [1] CRAN (R 3.5.2)
munsell 0.5.0 2018-06-12 [1] CRAN (R 3.5.0)
nlme 3.1-137 2018-04-07 [1] CRAN (R 3.5.1)
openxlsx 4.1.0 2018-05-26 [1] CRAN (R 3.5.0)
pillar 1.3.1 2018-12-15 [1] CRAN (R 3.5.0)
pkgbuild 1.0.2 2018-10-16 [1] CRAN (R 3.5.0)
pkgconfig 2.0.2 2018-08-16 [1] CRAN (R 3.5.0)
pkgload 1.0.2 2018-10-29 [1] CRAN (R 3.5.0)
plyr 1.8.4 2016-06-08 [1] CRAN (R 3.5.0)
prettyunits 1.0.2 2015-07-13 [1] CRAN (R 3.5.0)
processx 3.2.1 2018-12-05 [1] CRAN (R 3.5.0)
ps 1.3.0 2018-12-21 [1] CRAN (R 3.5.0)
purrr 0.3.1 2019-03-03 [1] CRAN (R 3.5.1)
R6 2.4.0 2019-02-14 [1] CRAN (R 3.5.2)
RColorBrewer 1.1-2 2014-12-07 [1] CRAN (R 3.5.0)
Rcpp 1.0.0 2018-11-07 [1] CRAN (R 3.5.0)
readxl 1.3.0 2019-02-15 [1] CRAN (R 3.5.2)
remotes 2.0.2 2018-10-30 [1] CRAN (R 3.5.0)
reshape2 1.4.3 2017-12-11 [1] CRAN (R 3.5.0)
rio 0.5.16 2018-11-26 [1] CRAN (R 3.5.0)
rjags 4-8 2018-10-19 [1] CRAN (R 3.5.0)
rlang 0.3.1 2019-01-08 [1] CRAN (R 3.5.2)
rprojroot 1.3-2 2018-01-03 [1] CRAN (R 3.5.0)
rstudioapi 0.9.0 2019-01-09 [1] CRAN (R 3.5.2)
scales 1.0.0 2018-08-09 [1] CRAN (R 3.5.0)
sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.5.0)
stringi 1.3.1 2019-02-13 [1] CRAN (R 3.5.2)
stringr 1.4.0 2019-02-10 [1] CRAN (R 3.5.2)
testthat 2.0.1 2018-10-13 [1] CRAN (R 3.5.0)
tibble 2.0.1 2019-01-12 [1] CRAN (R 3.5.2)
tidyselect 0.2.5 2018-10-11 [1] CRAN (R 3.5.0)
usethis
1.4.0 2018-08-14 [1] CRAN (R 3.5.0)
withr 2.1.2 2018-03-15 [1] CRAN (R 3.5.0)
yaml 2.2.0 2018-07-25 [1] CRAN (R 3.5.0)
zip 2.0.0 2019-02-25 [1] CRAN (R 3.5.2)

[1] /Library/Frameworks/R.framework/Versions/3.5/Resources/library

MartinStjernman commented 5 years ago

While I haven't scrutinized the code to fully understand where the problem appears (and why), the error appears when you have indices as proportional change instead of percentage change or more specifically when the base year have index 1 instead of index 100. In the documentation for the original msi_tool (https://www.cbs.nl/en-gb/society/nature-and-environment/indices-and-trends--trim--/msi-tool) this is specified under point A "Create the input file". It is as far as I can see not mentioned that the index for the base year should be 100 in the BRCindicators vignette (paragraph Multi-species indicator) and since the example data generation uses runif(n, min=1, max=10) to create indices there will be no year with index 100.

It works fine for me if I f ex run (NOTE you need to multiply both the indices and their standard error with 100): data <- data.frame(species = rep(letters, each = 50), year = rep(1:50, length(letters)), index = runif(n = 50 length(letters), min = 0.1, max = 10)100, se = runif(n = 50 length(letters), min = 0.01, max = .8)100) msi_out <- msi(data)

AugustT commented 5 years ago

@larspett can you share the code you ran.

I run the example within the help file without error

library(BRCindicators)

data <- data.frame(species = rep(letters, each = 50),
                   year = rep(1:50, length(letters)),
                   index = runif(n = 50 * length(letters), min = 50, max = 150),
                   se = runif(n = 50 * length(letters), min = 0.1, max = 8))

# Run the MSI function
msi_out <- msi(data, plot = FALSE)

# Plot the resulting indicator
plot(msi_out)

I do get an error with the vignette code (which was the bug I alluded to in our email exchange) but it is not the same as yours.

data <- data.frame(species = rep(letters, each = 50),
                   year = rep(1:50, length(letters)),
                   index = runif(n = 50 * length(letters), min = 1, max = 10),
                   se = runif(n = 50 * length(letters), min = 0.01, max = .8))

# Alternativly I could read in data from a csv
# read.csv('path/to/my/data.csv')

# Run the MSI function
msi_out <- msi(data)
Error in plot.window(...) : need finite 'ylim' values
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf

I can confirm @MartinStjernman 's *100 fix work s for me too. I note that originally I have a section of code which rescaled all species to 100 in the first year, but this was removed, I'm trying to work out why.

AugustT commented 5 years ago

This re-scaling might have been removed because of the flexibility of the tool that does not require the 100 value to be in the first year.

The base year (with index 100) may differ between species and has standard error zero. If the indices are expressed with index 1 in the base year, all indices and standard errors should be multiplied by 100.

AugustT commented 5 years ago

Okay, here is my suggested fix:

MartinStjernman commented 5 years ago

I do get an error with the vignette code (which was the bug I alluded to in our email exchange) but it is not the same as yours.

data <- data.frame(species = rep(letters, each = 50),
                   year = rep(1:50, length(letters)),
                   index = runif(n = 50 * length(letters), min = 1, max = 10),
                   se = runif(n = 50 * length(letters), min = 0.01, max = .8))

# Alternativly I could read in data from a csv
# read.csv('path/to/my/data.csv')

# Run the MSI function
msi_out <- msi(data)
Error in plot.window(...) : need finite 'ylim' values
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf

I think you will get the same error as Lars if you run (like Lars did): msi_out <- msi(data, plot=FALSE)

At least I did. You get the plot error since apparently what is reported is the error from producing the plot in msi_tool and not the error occurring elsewhere in msi_tool and the plot error appears since there is nothing useful to be plotted.

MartinStjernman commented 5 years ago

Okay, here is my suggested fix:

  • [x] Add link to the MSI manual from the vignette
  • [x] Specify in the vignette that the index value in the base year should be 100 and the se should be 0 in this year
  • [x] Update the help documentation in msi and msi_tool to make the index 100 and se 0 conditions clear.
  • [x] Add a warning message to catch species without an index value of 100
  • [x] Add a warning message to catch species without an se value of 0
  • [ ] update the example in the vignette to conform to these needs

Seems OK. From the MSI manual: " If the indices are expressed with index 1 in the base year, all indices and standard errors should be multiplied by 100." If species indices are calculated using TRIM or rtrim (which I think is the starting point for Soldaat et al) then SE in the base year is "automatically" zero and the zero will be preserved by the "multiply by 100"-instruction from the MSI manual. But I guess that in cases where species indices have been calculated using other programs (in other ways) it might be necessary to set SE to zero in the base year (although I'm not sure WHY this is a necessity, perhaps it is explained in Soldaat et al or in the rtrim documentation).

AugustT commented 5 years ago

Yeah, I agree with all that. While I'm at it I'm going to change the code in msi_tool so that the CV values with species names are returned to R, as it stands the plot is pretty useless because the species names are coded to numbers.

AugustT commented 5 years ago

I believe this issue is now resolved

larspett commented 5 years ago

Thanks a lot, was off the grid today with aftershocks of a pretty bad flu. Will try this now