Error in if (nearest[2] == nearest[1])

bhive01 commented 4 years ago

I'm trying to compute metrics for a very simple case and failing. I have 8 small groups of only 6 seeds each. I want to compute germination metrics for each of these groups and then compute measurements of variation on them.

library(germinationmetrics)
# this works
dat2 <- data.frame(a = 1, b = 2, c = 3, d = 4, total = 6)
germination.indices(data = dat2, total.seeds.col = "total", counts.intervals.cols = c("a", "b", "c", "d"), intervals = 1:4, partial = FALSE) 

# this also works
dat3 <- data.frame(a = 0, b = 1, c = 1, d = 3, total = 6)
germination.indices(data = dat3, total.seeds.col = "total", counts.intervals.cols = c("a", "b", "c", "d"), intervals = 1:4, partial = TRUE) 

# this fails
dat4 <- data.frame(a = 4, b = 4, c = 4, d = 4, total = 6)
germination.indices(data = dat4, total.seeds.col = "total", counts.intervals.cols = c("a", "b", "c", "d"), intervals = 1:4, partial = FALSE) 

Error in if (nearest[2] == nearest[1]) { : 
  missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In germination.indices(data = dat4, total.seeds.col = "total", counts.intervals.cols = c("a",  :
  For the following rows in "data", the total number of seeds tested ("total.seeds.col") is less than the total number of germinated seeds: 1
2: In max(csx[csx <= xhalf]) :
  no non-missing arguments to max; returning -Inf

# If I turn off some of the outputs it does give me an answer
germination.indices(data = dat4, total.seeds.col = "total", counts.intervals.cols = c("a", "b", "c", "d"), intervals = 1:4, partial = FALSE, t50 = FALSE, GermRateRecip = FALSE, TimsonsIndex = FALSE, GermRateGeorge = FALSE)

The problem seems to occur when any of these four variables are set to TRUE:

t50
TimsonsIndex
GermRateGeorge
GermRateRecip

I realize the structure of my data is a bit weird, but it seemed like the best way to get statistical power with limited seeds.

bhive01 commented 4 years ago

Some further comments:

# succeeds
dat5 <- data.frame(a = 1, b = 4, c = 4, d = 4, total = 6)
germination.indices(data = dat5, total.seeds.col = "total", counts.intervals.cols = c("a", "b", "c", "d"), intervals = 1:4, partial = FALSE)
# succeeds
dat6 <- data.frame(a = 2, b = 4, c = 4, d = 4, total = 6)
germination.indices(data = dat6, total.seeds.col = "total", counts.intervals.cols = c("a", "b", "c", "d"), intervals = 1:4, partial = FALSE)
# fails
dat7 <- data.frame(a = 3, b = 4, c = 4, d = 4, total = 6)
germination.indices(data = dat7, total.seeds.col = "total", counts.intervals.cols = c("a", "b", "c", "d"), intervals = 1:4, partial = FALSE)

We've added a day before our initial measurements with 0 scores and everything seems to process. I'm guessing this is required for the t50 calculations. Perhaps the error message could be made more clear?
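
For anyone hitting the same error with data frames, here is a rough sketch of the padding described above, using only base R. The column name z is made up for illustration. Numbering the intervals 1 to 5 shifts the time axis by one step, so time-based results would be offset accordingly. The final call is left commented out and only reuses the arguments already shown in this thread.

```r
# Cumulative counts from the failing example in this thread.
dat4 <- data.frame(a = 4, b = 4, c = 4, d = 4, total = 6)

# Prepend a zero-count column; "z" is a hypothetical name for the extra
# observation taken before the first real measurement.
dat4_padded <- cbind(z = 0, dat4)

# Count columns now start with the zero-count interval.
count_cols <- c("z", "a", "b", "c", "d")

# Assumption: intervals are simply renumbered 1..5, which shifts the
# original time axis by one interval.
intervals <- seq_along(count_cols)

# The padded data would then be passed on as before, for example:
# germination.indices(data = dat4_padded, total.seeds.col = "total",
#                     counts.intervals.cols = count_cols,
#                     intervals = intervals, partial = FALSE)
```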

aravind-j commented 4 years ago

The issue seems to be with t50 and GermRateRecip. GermRateRecip is just the reciprocal of t50.

dat4 <- data.frame(a = 4, b = 4, c = 4, d = 4, total = 6)
germination.indices(data = dat4, total.seeds.col = "total",
                    counts.intervals.cols = c("a", "b", "c", "d"),
                    intervals = 1:4, partial = FALSE)
#> Warning in germination.indices(data = dat4, total.seeds.col = "total",
#> counts.intervals.cols = c("a", : For the following rows in "data", the total
#> number of seeds tested ("total.seeds.col") is less than the total number of
#> germinated seeds: 1
#> Warning in max(csx[csx <= xhalf]): no non-missing arguments to max; returning -
#> Inf
#> Error in if (nearest[2] == nearest[1]) {: missing value where TRUE/FALSE needed

germination.indices(data = dat4, total.seeds.col = "total",
                    counts.intervals.cols = c("a", "b", "c", "d"),
                    intervals = 1:4, partial = FALSE, t50 = FALSE,
                    GermRateRecip = FALSE)
#> Warning in germination.indices(data = dat4, total.seeds.col = "total",
#> counts.intervals.cols = c("a", : For the following rows in "data", the total
#> number of seeds tested ("total.seeds.col") is less than the total number of
#> germinated seeds: 1
#>   a b c d total GermPercent FirstGermTime LastGermTime PeakGermTime
#> 1 4 0 0 0     6    66.66667             1            1            1
#>   TimeSpreadGerm MeanGermTime VarGermTime SEGermTime CVGermTime MeanGermRate
#> 1              0            1           0          0          0            1
#>   VarGermRate SEGermRate CVG GermSpeed_Count GermSpeed_Percent
#> 1           0          0 100               4          66.66667
#>   GermSpeedAccumulated_Count GermSpeedAccumulated_Percent
#> 1                   8.333333                     138.8889
#>   GermSpeedCorrected_Normal GermSpeedCorrected_Accumulated WeightGermPercent
#> 1                      0.06                          0.125          66.66667
#>   MeanGermPercent MeanGermNumber TimsonsIndex TimsonsIndex_Labouriau
#> 1        16.66667              1     266.6667                      4
#>   TimsonsIndex_KhanUngar GermRateGeorge PeakValue GermValue_Czabator
#> 1               66.66667             16  66.66667           1111.111
#>   GermValue_DP GermValue_Czabator_mod GermValue_DP_mod CUGerm GermSynchrony
#> 1     444.4444               1111.111         444.4444    Inf             1
#>   GermUncertainty
#> 1               0

The same error can be reproduced by calling the underlying functions directly.

x <- c(4,4,4,4)
int <- 1:length(x)
tot <- 6

t50(germ.counts = x, intervals = int, partial = FALSE, method = "coolbear")
#> Warning in max(csx[csx <= xhalf]): no non-missing arguments to max; returning -
#> Inf
#> Error in if (nearest[2] == nearest[1]) {: missing value where TRUE/FALSE needed
t50(germ.counts = x, intervals = int, partial = FALSE, method = "farooq")
#> Warning in max(csx[csx <= xhalf]): no non-missing arguments to max; returning -
#> Inf
#> Error in if (nearest[2] == nearest[1]) {: missing value where TRUE/FALSE needed
GermRateRecip(germ.counts = x, intervals = int,
              method = "coolbear", partial = FALSE)
#> Warning in max(csx[csx <= xhalf]): no non-missing arguments to max; returning -
#> Inf
#> Error in if (nearest[2] == nearest[1]) {: missing value where TRUE/FALSE needed
GermRateRecip(germ.counts = x, intervals = int,
              method = "farooq", partial = FALSE)
#> Warning in max(csx[csx <= xhalf]): no non-missing arguments to max; returning -
#> Inf
#> Error in if (nearest[2] == nearest[1]) {: missing value where TRUE/FALSE needed

The error disappears when an interval with a count of 0 is added before the initial measurement.


x <- c(0,4,4,4,4)
int <- 1:length(x)
tot <- 6

t50(germ.counts = x, intervals = int, partial = FALSE, method = "coolbear")
#> [1] 1.625
t50(germ.counts = x, intervals = int, partial = FALSE, method = "farooq")
#> [1] 1.5
GermRateRecip(germ.counts = x, intervals = int,
              method = "coolbear", partial = FALSE)
#> [1] 0.6153846
GermRateRecip(germ.counts = x, intervals = int,
              method = "farooq", partial = FALSE)
#> [1] 0.6666667
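
As a quick sanity check of the reciprocal relationship, the GermRateRecip values above can be recovered from the t50 values with plain arithmetic. This is only base R applied to the numbers printed in this comment; the variable names are made up.

```r
# t50 values reported above for the padded counts c(0, 4, 4, 4, 4).
t50_coolbear <- 1.625
t50_farooq <- 1.5

# GermRateRecip is the reciprocal of t50.
rate_coolbear <- 1 / t50_coolbear
rate_farooq <- 1 / t50_farooq

print(round(rate_coolbear, 7))  # 0.6153846
print(round(rate_farooq, 7))    # 0.6666667
```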

aravind-j commented 4 years ago

The warning 'For the following rows in "data", the total number of seeds tested ("total.seeds.col") is less than the total number of germinated seeds' is due to an incorrect check, which is wrongly triggered when the data is cumulative.

t50, and in turn GermRateRecip, is failing because germination is so rapid that more than 50% of the seeds germinate in the first interval itself. I will update it with an appropriate warning.
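
Until then, a pre-check along these lines may help to spot affected rows. This is only a sketch in base R, not the package's internal code: t50_is_bracketed is a hypothetical helper, and it assumes the half-way mark is half of the final cumulative count (the exact threshold may differ between the "coolbear" and "farooq" methods). Under that assumption it reproduces the pattern reported above, where first counts of 1 and 2 succeed while 3 and 4 fail.

```r
# Hypothetical helper, not part of germinationmetrics.
# cum_counts: cumulative germination counts, one value per interval.
t50_is_bracketed <- function(cum_counts) {
  # Assumption: half-way mark is half of the final cumulative count.
  xhalf <- max(cum_counts) / 2
  # Cumulative counts at or below the half-way mark. This is empty when
  # the very first interval already exceeds it.
  below <- cum_counts[cum_counts <= xhalf]
  length(below) > 0
}

# Cumulative counts from the examples in this thread.
checks <- c(
  dat5   = t50_is_bracketed(c(1, 4, 4, 4)),    # reported as succeeding
  dat6   = t50_is_bracketed(c(2, 4, 4, 4)),    # reported as succeeding
  dat7   = t50_is_bracketed(c(3, 4, 4, 4)),    # reported as failing
  dat4   = t50_is_bracketed(c(4, 4, 4, 4)),    # reported as failing
  padded = t50_is_bracketed(c(0, 4, 4, 4, 4))  # zero interval prepended
)
print(checks)

# max() over an empty selection is what produces the -Inf in the warning.
empty_max <- suppressWarnings(max(numeric(0)))
print(empty_max)  # -Inf
```

Rows for which the helper returns FALSE are the ones where an extra zero-count interval before the first measurement would be needed.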

aravind-j / germinationmetrics

Error in if (nearest[2] == nearest[1]) #4