haroine / icarus

A package with useful functions for calibration and reweighting in survey sampling

Calibration with raking method works with CALMAR2 SAS but not with Icarus in R #14

Open EmanueleCeglia opened 2 months ago

EmanueleCeglia commented 2 months ago

Hi, I need to calibrate a dataset of observations on margins. This is currently done with the CALMAR2 macro in SAS. I tried the Icarus package instead, but the algorithm doesn't converge. I checked all the input parameters and they are the same as the ones I use in SAS. With the 'linear' method the algorithm converges, but with raking it does not.

khaledlarbi commented 2 months ago

Hi Emanuele,

Thank you very much for your message.

It may be difficult to address your issue without having more details (for instance, the number of calibration variables used, are they only numeric?).

With the linear method, the algorithm converges with both icarus and SAS (that's good news!). Are the weights from icarus and calmar the same?

Have a nice day.

Khaled

EmanueleCeglia commented 2 months ago

Hi @khaledlarbi, thanks for your reply. Let me explain better what I need to do. I have a dataset of firms (interviewed in a survey), and for each company I use some categorical observations to perform the weight calibration:

Combining them together I create two new columns of observations: ctrysize, ctrysect.

e.g. ctrysize: AT1, AT2, AT3, AT4, BE1, BE2, ... , SK3, SK4

e.g. ctrysect: ITA, ITB, ITC, ITD, GRA, GRB, ... , SKC, SKD

In total there are 48 unique values for ctrysize and 48 for ctrysect.

My dataset has 11,699 interviewed companies, so these combinations are repeated, and there are no missing combinations.

Then I have the margins: for each unique combination I have the total number of firms (an integer) in the Euro Area. So I have two margin columns, one with the totals for each of the 48 ctrysize combinations and one with the totals for each of the 48 ctrysect combinations.

The margin columns are in alphabetical order. I create a dataset with three columns (ctrysize, ctrysect, weight) and 11,699 rows: ctrysize and ctrysect hold categorical data, and weight is initialized to 1. I reorder the columns in alphabetical order and then run the calibration on this dataset with Icarus, using the two margin columns.

calibration(data=dataset, marginMatrix=margins, colWeights="weight" , method="raking", scale=T, description=T, maxIter = 2500, calibTolerance = 0.0005)

But the algorithm doesn't converge with the raking method (I also tried changing the input parameters). It converges with the linear method, but I need raking because I have to migrate from SAS, where we use the Calmar2 macro's raking method on the same dataset and it works. I cannot understand why the algorithm doesn't converge in R.
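As an independent cross-check, raking on two categorical margins can be sketched as iterative proportional fitting in base R. This is a minimal sketch, not how Icarus or CALMAR2 implement raking: the toy data, the totals, and the names `rake_two_margins`, `size_totals`, and `sect_totals` are all hypothetical, and it assumes the first two characters of each category are the country code (as in AT1 or ITA). Because both variables embed the country, the ctrysize totals and the ctrysect totals of each country must sum to the same number for both sets of margins to be satisfiable at once, so the sketch checks that first.

```r
# Toy data standing in for the survey file; all values are made up.
set.seed(1)
n <- 200
country <- sample(c("AT", "BE"), n, replace = TRUE)
dataset <- data.frame(
  ctrysize = paste0(country, sample(1:2, n, replace = TRUE)),
  ctrysect = paste0(country, sample(c("A", "B"), n, replace = TRUE)),
  weight   = 1,
  stringsAsFactors = FALSE
)

# Hypothetical population totals, named by category.
size_totals <- c(AT1 = 300, AT2 = 100, BE1 = 250, BE2 = 150)
sect_totals <- c(ATA = 220, ATB = 180, BEA = 100, BEB = 300)

# Consistency check: per-country totals must agree across the two margins.
# Assumes the first two characters of each name are the country code.
by_country <- function(totals) tapply(totals, substr(names(totals), 1, 2), sum)
stopifnot(isTRUE(all.equal(by_country(size_totals), by_country(sect_totals))))

# Iterative proportional fitting: rescale weights to one margin, then the
# other, until the first margin is still matched after the second step.
rake_two_margins <- function(data, size_totals, sect_totals,
                             max_iter = 1000, tol = 1e-8) {
  g_size <- as.character(data$ctrysize)
  g_sect <- as.character(data$ctrysect)
  w <- data$weight
  for (i in seq_len(max_iter)) {
    cur <- tapply(w, g_size, sum)
    w <- w * as.numeric(size_totals[g_size]) / as.numeric(cur[g_size])
    cur <- tapply(w, g_sect, sum)
    w <- w * as.numeric(sect_totals[g_sect]) / as.numeric(cur[g_sect])
    # The ctrysect margins are exact now; measure the gap on ctrysize.
    cur <- tapply(w, g_size, sum)
    gap <- max(abs(as.numeric(cur[names(size_totals)]) /
                     as.numeric(size_totals) - 1))
    if (gap < tol) {
      return(list(weights = w, iterations = i, converged = TRUE))
    }
  }
  list(weights = w, iterations = max_iter, converged = FALSE)
}

res <- rake_two_margins(dataset, size_totals, sect_totals)
res$converged
res$iterations
```

If the consistency check fails on the real margins, or if this loop does not settle on the real data, that would point at the inputs rather than at a specific package.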

Thanks for the support!!, Emanuele

EmanueleCeglia commented 2 months ago

The margins are properly created; here is a picture of what the margin dataset looks like: (image)

khaledlarbi commented 2 months ago

So, if I understood correctly: the algorithm converged in both calmar and icarus with the linear method. Did you compare the weights obtained from both methods? Are they the same?

Can you check that the weights you got from the linear method with icarus fulfill the margin constraints? For instance, if we consider the combination AT1, you need to ensure that the sum of the calibrated weights for units in country AT and size 1 is equal to the margin you provided in the margins matrix. I suggest checking this for all combinations.
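The check described above could look roughly like this in base R. It is only a sketch: the data frame `calibrated`, its column `w_cal`, the vector `size_totals`, and the helper `check_margins` are hypothetical stand-ins for the calibrated output and the margins you supplied, and the numbers are made up.

```r
# Toy stand-ins for the calibrated data and the target margins.
calibrated <- data.frame(
  ctrysize = c("AT1", "AT1", "AT2", "BE1"),
  w_cal    = c(60, 40, 50, 80),
  stringsAsFactors = FALSE
)
size_totals <- c(AT1 = 100, AT2 = 50, BE1 = 80)

# Sum the calibrated weights per category and compare with the targets.
check_margins <- function(groups, weights, totals, tol = 1e-6) {
  achieved <- tapply(weights, as.character(groups), sum)
  achieved <- as.numeric(achieved[names(totals)])
  target <- as.numeric(totals)
  data.frame(
    category = names(totals),
    target   = target,
    achieved = achieved,
    # A category absent from the data yields NA and is flagged as not ok.
    ok       = !is.na(achieved) & abs(achieved / target - 1) < tol
  )
}

report <- check_margins(calibrated$ctrysize, calibrated$w_cal, size_totals)
print(report[!report$ok, ])  # rows that miss their target, if any
```

The same call can be repeated with the ctrysect column and its totals.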

Does "I have no missing combinations" mean that each combination has at least one unit?
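One way to answer this question is to compare the categories present in the sample with those present in the margins. The vectors below are hypothetical toy values standing in for the ctrysize column and for the category names of the margins.

```r
# Toy stand-ins: categories observed in the sample and categories in the margins.
sample_values <- c("AT1", "AT1", "AT2", "BE1")
margin_names  <- c("AT1", "AT2", "BE1", "BE2")

# Categories with a margin but no sampled unit (these cannot be calibrated).
missing_in_sample <- setdiff(margin_names, unique(sample_values))

# Categories seen in the sample that have no margin.
missing_in_margins <- setdiff(unique(sample_values), margin_names)

# Number of sampled units per margin category, including empty ones.
counts <- table(factor(sample_values, levels = margin_names))

missing_in_sample
missing_in_margins
counts
```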

Can you provide the code you used to run the calibration with calmar?