Closed tomaszaba closed 2 months ago
Hi @ernestguevarra,
I have addressed this using the base::norm()
:
I modified the helper function that was meant to normalise zscore into the following:
normalize_prevalence <- function(x, .status = c("gam", "sam")) {
.status <- match.arg(.status)
mean_x <- mean(remove_flags(x, "zscore"), na.rm = TRUE)
## Return GAM and SAM prevalence with a SD = 1
switch(
.status,
"gam" = {pnorm(q = -2, mean = mean_x, sd = 1, lower.tail = TRUE, log.p = FALSE)},
"sam" = {pnorm(q = -3, mean = mean_x, sd = 1, lower.tail = TRUE, log.p = FALSE)}
)
}
Then I use in the main function, the in the code block for when SD is problematic
compute_wfhz_prevalence <- function(df,
.wt = NULL,
.edema = NULL,
.summary_by) {
## Get and classify standard deviation ----
x <- df[["wfhz"]]
std <- classify_sd(sd(remove_flags(x, "zscore"), na.rm = TRUE))
if (std != "Problematic") {
### Compute observed prevalence ----
p <- df |>
get_wfhz_prevalence_estimates(
.wt = {{ .wt }},
.edema = {{ .edema }},
.summary_by = {{ .summary_by }}
)
} else {
### Compute prevalence with a standard deviation of 1 ----
p <- dplyr::tibble(
gam = normalize_prevalence(x, .status = "gam"),
sam = normalize_prevalence(x, .status = "sam"),
mam = gam - sam
)
}
p
}
This leads to the same results as in ENA when such conditions are met.
The only pending issue on this and all other prevalence functions is about the issue I opened few days ago where I think I need to write a for loop to iterate over an intermediate data frame of multiple survey areas and allow the prevalence to be computed on a case-by-case scenario.
what do you think of the above approach?
Hi Ernest,
I have a question about how to approach the calculation of prevalence using the Probit approach. In SMART, when a standard deviation is out of range 0.8-1.2 (problematic), the observed prevalence is ignored and a new one calculated assuming a standard deviation of 1.0. I initially thought this could be achieve by applying this formula: zscore_value - the mean of zscore variable / standard deviation of the zscore variable which will result in a normalised zscore with mean = 0 and standard deviation of 1.0. So I wrote this function:
which I then use as a helper function in the this other bigger/main function:
but the result I get is different to what ENA for SMART retrieves for when standard deviation is problematic. The rest of the code blocks works well.
So then I realised that they use the Probit approach. I wonder how should I approach this in
ipccheckr
?I also wish to add Green's Index statistics (GI and p) to the summarized table return by the above function. As I find it difficult to work with nipnTK::greensindex() I wrote to Mark today, as you had advised once in April.
Thanks.