bluefoxr / COINr

COINr
https://bluefoxr.github.io/COINr/

Normalise function not accepting "use_iMeta" argument when specifying normalization function for purse #58

Closed gdickens closed 1 month ago

gdickens commented 1 month ago

I've been trying to normalize a purse using the 'n_goalposts' method; however, the function doesn't appear to allow the use of goalposts from the iMeta data, i.e. specifying 'use_iMeta' in the f_n_para argument.

What I've been trying to do is apply a set of goalposts defined in the iMeta information of a purse, but no matter how I specify the argument, R returns an error, such as 'argument "gposts" is missing, with no default' or 'f_n_para must be a list'.

The 'f_n_para must be a list' error seems to be on my side, caused by incorrectly formatting the function argument, but the gposts error suggests that the "use_iMeta" argument isn't working for the goalposts function.

Please see below for a reproducible example:

#load libraries ---- 
library(COINr)
library(dplyr)
library(tidyr)

#Create example table for custom iMeta arguments 
#calculate example goalposts info for function
sum_goalposts<-ASEM_iData_p |>
  pivot_longer(cols=-c(1:7), names_to="iCode") |>
  group_by(iCode) |>
  summarize(goalpost_lower=min(value, na.rm=TRUE),
            goalpost_upper=max(value, na.rm=TRUE) ,
            goalpost_scale=1,
            goalpost_trunc2posts=TRUE) |>
  mutate(goalpost_lower=goalpost_lower*0.95,
         goalpost_upper=goalpost_upper*1.05 ,
         minmax_lower=goalpost_lower,
         minmax_upper=goalpost_upper)

#join custom arguments to ASEM iMeta data
ASEM_iMeta <- left_join(ASEM_iMeta, sum_goalposts, by = "iCode")

#create purse with ASEM panel data 
purse <- new_coin(iData = ASEM_iData_p,
                  iMeta = ASEM_iMeta,
                  split_to = "all",
                  quietly = TRUE)

#Demonstrate focus issue --- 
#Attempt to normalize purse using n_goalposts function, R returns: 'f_n_para must be a list' error
purse <- Normalise(purse, dset = "Raw",
                   global_specs = list(f_n = "n_goalposts", f_n_para = "use_iMeta"),
                   global = TRUE)

#Attempt to normalize purse using n_goalposts function, R returns error: "gposts" is missing, with no default
purse <- Normalise(purse, dset = "Raw",
                   global_specs = list(f_n = "n_goalposts"), list(f_n_para = "use_iMeta"),
                   global = TRUE)

#Attempt to normalize purse using default options (minmax), R returns no errors 
purse <- Normalise(purse, dset = "Raw",
                    global = TRUE,
                   write_to= "Normalised: minmax default")

#Attempt to normalize purse specifying n_minmax in the function arguments and 'use_iMeta', R returns no error
purse <- Normalise(purse, dset = "Raw",
                   global_specs = list(f_n = "n_minmax"), list(f_n_para = "use_iMeta"),
                   global = TRUE,
                   write_to= "Normalised: minmax goalposts")

#examine normalized data
tmp_normalised_default<-get_data(purse, dset="Normalised: minmax default")
tmp_normalised_goalposts<-get_data(purse, dset="Normalised: minmax goalposts")

#show summary statistics
summary(tmp_normalised_goalposts$NEET)
summary(tmp_normalised_default$NEET)

gdickens commented 1 month ago

In case it helps: what I've been trying to do is use COINr to normalize a set of indicators based on their distance to a pre-defined performance frontier. The formula I previously used is similar to the min-max approach, except that the bounds aren't derived from the indicator being normalized but are set externally, based on global data and/or research.

This means I expect each indicator to be normalized by its own goalposts (as in the example). The normalized values of an indicator would also not necessarily span the full range from 0 to 100 (or from 1 to 9 in my particular case).

The formula used by n_goalposts looked to be the appropriate way of achieving this in COINr, but I have had trouble using it on a panel dataset stored as a purse.
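As a rough illustration of the idea described above, here is a minimal base R sketch of rescaling against externally fixed goalposts. The helper goalpost_rescale() and its arguments are hypothetical names made up for this example; it is not COINr's n_goalposts() and may differ from it in detail.

```r
# Hypothetical helper (not part of COINr): rescale x onto [out_min, out_max]
# using externally fixed goalposts instead of the sample min and max.
goalpost_rescale <- function(x, lower, upper,
                             out_min = 0, out_max = 1, truncate = TRUE) {
  stopifnot(upper > lower, out_max > out_min)
  # Position of each value between the two goalposts (0 = lower, 1 = upper)
  z <- (x - lower) / (upper - lower)
  # Optionally clamp values that fall outside the goalposts
  if (truncate) z <- pmin(pmax(z, 0), 1)
  out_min + z * (out_max - out_min)
}

# Made-up indicator values with assumed goalposts of 0 and 100,
# rescaled onto a 1 to 9 scale
x <- c(20, 35, 50, 80)
scores <- goalpost_rescale(x, lower = 0, upper = 100, out_min = 1, out_max = 9)
print(scores)
```

Because the goalposts are fixed in advance, the scores here run from 2.6 to 7.4 rather than filling the whole 1 to 9 range.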

I've spent quite a lot of time testing different approaches with the help of ChatGPT / COINr's documentation, but apologies if I'm missing something obvious here.

bluefoxr commented 1 month ago

Hi, this might be because your COINr version is not new enough for this feature. The COINr version on CRAN is a bit out of date and I will submit the latest version to CRAN soon, but in the meantime try installing the dev version:

# Install development version from GitHub
devtools::install_github("bluefoxr/COINr")

Let me know if this works.

gdickens commented 1 month ago

Hey bluefoxr,

I did try this, albeit after an embarrassing amount of time using the CRAN version, but it didn't fix the problem.

To double check this I just removed the COINr package and reinstalled it via the command above and I'm receiving the same error(s) when running the code above.

bluefoxr commented 1 month ago

Hi @gdickens I had a closer look at this. The problem is that when you set global = TRUE, the purse method first extracts the data from all coins in the purse and glues it together, then passes the data frame to the data frame method of Normalise(). But the code that deals with iMeta parameters is contained within the coin method, which is bypassed in this case.

Ideally I would add the iMeta feature to the purse method as well but I'm not sure when I would be able to get around to that. So some workarounds that come to mind are:

Does that help a bit?

gdickens commented 1 month ago

Hey @bluefoxr

I gave this a try today and setting global=FALSE worked. Apologies for not trying this earlier, my reading of the documentation led me to believe this wasn't likely to be the cause.
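A small base R illustration of why normalising each year separately should not change the numbers when the goalposts are fixed in advance. This uses made-up data and plain functions, and is only a sketch of the arithmetic, not of COINr's internals.

```r
# Toy panel: one indicator observed in two years (made-up values)
panel <- data.frame(
  Time  = c(2020, 2020, 2021, 2021),
  value = c(10, 40, 30, 90)
)

# Externally fixed goalposts (assumed values for illustration)
lower <- 0
upper <- 100

rescale_fixed  <- function(x) (x - lower) / (upper - lower)
rescale_minmax <- function(x) (x - min(x)) / (max(x) - min(x))

# Fixed goalposts: pooled and per-year results are identical
pooled_fixed  <- rescale_fixed(panel$value)
by_year_fixed <- ave(panel$value, panel$Time, FUN = rescale_fixed)
print(isTRUE(all.equal(pooled_fixed, by_year_fixed)))

# Sample min-max: per-year results differ from the pooled ones,
# because each year gets its own min and max
pooled_mm  <- rescale_minmax(panel$value)
by_year_mm <- ave(panel$value, panel$Time, FUN = rescale_minmax)
print(isTRUE(all.equal(pooled_mm, by_year_mm)))
```

With data-derived bounds the pooled and per-year results diverge; with fixed goalposts they match, so scores remain comparable across years either way.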

I didn't try the custom operation option, but that looks really useful, so thanks for flagging it, as I'm sure to need it in the future. I did, however, have to write my own aggregation function so I could pass it to Aggregate() via the f_ag argument:

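# Custom aggregation function: an inverted geometric mean that tolerates
# zeros, assuming scores on a 0-10 scale.
# w is accepted but not used in this version.
a_gmean_custom <- function(x, w)
{
  # Flip and rescale x from [0, 10] to [1, 10], take the geometric mean,
  # then map the result back to [0, 10]
  gm <- (10 - exp(mean(log((10 - x) / 10 * 9 + 1), na.rm = TRUE))) / 9 * 10

  gm
}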

Worked like a charm, so thanks for making the package so flexible!

Thanks again for all your work on developing and maintaining the package. It's been super useful for my work.

(The geometric mean formula is based on EUROPA's approach for calculating the geometric mean when indicators might have a value of 0, from their INFORM Risk Index.)
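A quick base R check of how this function behaves, assuming scores on a 0 to 10 scale. The function from the comment above is repeated so the block runs on its own; the input values are made up.

```r
# Same function as posted above; w is accepted but unused
a_gmean_custom <- function(x, w) {
  gm <- (10 - exp(mean(log((10 - x) / 10 * 9 + 1), na.rm = TRUE))) / 9 * 10
  gm
}

# The endpoints of the scale map to themselves (up to floating-point error)
print(a_gmean_custom(c(10, 10)))  # 10
print(a_gmean_custom(c(0, 0)))    # 0

# A zero no longer forces the result to zero, as it would
# with a plain geometric mean
mixed <- a_gmean_custom(c(0, 10))
print(round(mixed, 2))            # 7.6

# Missing values are dropped before averaging
print(a_gmean_custom(c(5, NA)))   # 5
```

Note that for c(0, 10) the result (about 7.6) sits above the arithmetic mean of 5, so on this scale the formula pulls the aggregate towards the higher scores.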

bluefoxr commented 1 month ago

Ok glad that works. I'll close the issue now.