haroine / icarus

A package with useful functions for calibration and reweighting in survey sampling

Add an argument to allow consistent comparison with results from SAS (Calmar) #13

Open khaledlarbi opened 5 months ago

khaledlarbi commented 5 months ago

Hi @haroine ,

I hope you're well.

I propose a small pull request to fix two small issues:

Here's a small example:

library(icarus)

N <- 230 ## Population size
## Compute the Horvitz-Thompson estimator (returns 1.666667)
weightedMean(data_employees$movies, data_employees$weight, N)

## Add calibration margins
mar <- c("salary", 0, 0)
margins <- rbind(mar)
## Compute calibration weights
wCal <- calibration(data=data_employees, marginMatrix=margins, colWeights="weight",
                    method="raking", description=FALSE)

I propose adding a tolDefinition argument to icarus::calibration (and also to icarus::calib and icarus::calibAlgorithm) that selects the stopping criterion. There are currently two possible values: "sas", which matches the SAS CALMAR 2 definition, and "default", which keeps the criterion icarus has used so far and remains the default.
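To illustrate why the choice of stopping criterion matters, here is a minimal, self-contained sketch showing two rules that disagree for the same tolerance. It is not the icarus or CALMAR code: the helper name toleranceReached and both formulas are assumptions made for illustration. The "sas" branch is assumed to compare the ratios of current to initial weights between two successive iterations, and the "default" branch is only a placeholder rule (largest absolute change in the weights).

```r
## Hypothetical helper, NOT part of icarus: tests whether a stopping
## criterion is met between two successive iterations.
toleranceReached <- function(wPrev, wNew, d, tol,
                             tolDefinition = c("default", "sas")) {
  tolDefinition <- match.arg(tolDefinition)
  crit <- switch(tolDefinition,
    ## Assumed CALMAR-style rule: largest change in the ratio w / d
    sas = max(abs(wNew / d - wPrev / d)),
    ## Placeholder rule (assumption): largest absolute change in weights
    default = max(abs(wNew - wPrev))
  )
  crit < tol
}

d     <- c(10, 20, 40)           ## initial weights (made-up values)
wPrev <- c(11, 19, 42)           ## weights at the previous iteration
wNew  <- c(11.02, 19.03, 42.04)  ## weights at the current iteration

resSas     <- toleranceReached(wPrev, wNew, d, tol = 0.01, tolDefinition = "sas")
resDefault <- toleranceReached(wPrev, wNew, d, tol = 0.01, tolDefinition = "default")
print(c(sas = resSas, default = resDefault))  ## sas: TRUE, default: FALSE
```

With the same tolerance, one rule stops here while the other keeps iterating, so the final weights can differ between two programs even when both converge.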

I added unit tests (see 4d43d98) based on a simulated dataset, with calibrated weights computed both in SAS CALMAR 2 and in icarus: for some tolerance thresholds, the two sets of weights differ unless tolDefinition = "sas" is used.

I have a small issue with the package's data folder: since I added several datasets for the unit tests, the folder has grown to around 6 MB, and devtools::check now reports a warning and a note. I tried several compression options, but the issue remains. One possible fix is to set LazyData to false in DESCRIPTION.
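For reference, a minimal sketch of how the compression of saved datasets can be inspected and changed with the base tools package. It runs in a temporary directory with a made-up dataset; the directory name data_demo and the object demo are placeholders, not files from this pull request, and recompressing may not be enough to remove the check messages.

```r
## Temporary directory standing in for the package's data/ folder
dataDir <- file.path(tempdir(), "data_demo")
dir.create(dataDir, showWarnings = FALSE)

## Made-up dataset, saved with the default gzip compression
demo <- data.frame(x = rnorm(1e4), y = rnorm(1e4))
save(demo, file = file.path(dataDir, "demo.rda"), compress = "gzip")

## Report size and compression type of every .rda file in the folder
before <- tools::checkRdaFiles(dataDir)

## Re-save all files in the folder with xz compression, then check again
tools::resaveRdaFiles(dataDir, compress = "xz")
after <- tools::checkRdaFiles(dataDir)

print(before[, c("size", "compress")])
print(after[, c("size", "compress")])
```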