grinnellm / SpawnIndex

:fish: :egg: Calculate the Pacific Herring spawn index
MIT License

Document and test utility functions (e.g., `mean_na`) #53

Open grinnellm opened 2 years ago

grinnellm commented 2 years ago

Write documentation and proper tests for the utility functions. For each function, the tests should be complete and should check that inputs and outputs have the proper class. Also, use `...` to pass arguments (say, `omit_na`) through to `mean()` in `mean_na(x, ...)`, as described here. Note that this will require changes wherever these functions are used, because the default for `na.rm` in `mean()` is `FALSE` but the default for `omit_na` in `mean_na()` is `TRUE`.
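For illustration, here is a minimal sketch of what a `mean_na()` that forwards `...` might look like. This is not the current SpawnIndex code: the signature, the input check, and the choice to return `NA` for an all-`NA` input are assumptions made for this example.

```r
# Hypothetical sketch only; the signature and behaviour are assumptions,
# not the SpawnIndex implementation.
mean_na <- function(x, omit_na = TRUE, ...) {
  # Check the class of the input (part of the "proper class" testing above)
  stopifnot(is.numeric(x) || is.logical(x))
  # Assumed behaviour: return NA (not NaN) when there is nothing to average
  if (all(is.na(x))) {
    return(NA_real_)
  }
  # Map omit_na onto na.rm; anything in `...` (e.g., trim) goes to mean()
  mean(x, na.rm = omit_na, ...)
}

mean_na(c(1, 2, NA))                   # NAs dropped by default
mean_na(c(1, 2, NA), omit_na = FALSE)  # same as mean(): NA
mean_na(c(1, 2, 3, 100), trim = 0.25)  # `trim` is forwarded to mean()
```

Keeping `omit_na` as a named argument and mapping it onto `na.rm` preserves the `TRUE` default, while `...` carries any remaining `mean()` arguments.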

These functions were copied from the "pbs-assess/gfiscamutils" repo because they are on its "herring" branch but not its main branch, which was causing an issue with installing.

grinnellm commented 2 years ago

Include some motivation for these functions in the documentation. They're the result of some unexpected (to me) behaviour using sum() in the "dplyr" summarise functions:

library(dplyr)

dat <- tibble(
  Location = rep(c("A", "B", "C"), each = 3),
  Length = c(1:5, rep(NA, times = 4))
)

# max_na(), sum_na(), and mean_na() are the utility functions from this issue
res <- dat %>%
  group_by(Location) %>%
  summarise(
    Max = max(Length, na.rm = TRUE),
    MaxNA = max_na(Length),
    Sum = sum(Length, na.rm = TRUE),
    SumNA = sum_na(Length),
    Mean = mean(Length, na.rm = TRUE),
    MeanNA = mean_na(Length)
  ) %>%
  ungroup()

The mean is as I expected, but the sum caught me by surprise. I expected the Sum for Location "C" to be NA instead of 0. It turns out this is the correct behaviour for the sum:

sum(NA, na.rm = TRUE)

should give 0, not NA (or NaN), because once the NA is removed nothing is left to add, and the sum of an empty set is zero.
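To make the motivation concrete, here is a small base-R sketch of the behaviour above, together with one way a `sum_na()` could return `NA` when every value is missing. The function body is an assumption for illustration and may differ from the version copied from "gfiscamutils".

```r
# Base R: the sum of "nothing" is zero, but the mean of "nothing" is NaN
sum(NA, na.rm = TRUE)        # 0
sum(numeric(0))              # 0
mean(NA_real_, na.rm = TRUE) # NaN

# Hypothetical sketch only; not necessarily the SpawnIndex implementation.
sum_na <- function(x, omit_na = TRUE) {
  # Assumed behaviour: an all-NA input has no data, so report NA, not 0
  if (all(is.na(x))) {
    return(NA_real_)
  }
  sum(x, na.rm = omit_na)
}

# The three groups from the example: A = 1:3, B = c(4, 5, NA), C = all NA
sum_na(c(1, 2, 3))
sum_na(c(4, 5, NA))
sum_na(c(NA, NA, NA))
```

With this version, Location "C" would be reported as `NA`, which distinguishes "no data" from a true total of zero.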