superjeje closed this issue 6 years ago
@superjeje the variable(s) you are trying to use as strataVars (SDC_stV) are set to NULL, so you are not using any strata variables. You can easily check with get.sdcMicroObj(SDC, "strataVar"), which in your example also returns NULL.
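The check described above can be sketched as follows. This is a minimal illustration, not the original script: it uses the `testdata` set shipped with sdcMicro as a stand-in for DF_SDC, and the chosen key and numeric variables are assumptions made only for the example.

```r
library(sdcMicro)

# Stand-in data: 'testdata' ships with sdcMicro (it is not the DF_SDC from this thread).
data(testdata)

# Create the object without passing strataVar, mirroring SDC_stV = NULL in the script.
sdc <- createSdcObj(testdata,
                    keyVars = c("urbrur", "roof", "walls"),
                    numVars = c("expend", "income"))

# Inspect the slot that microaggregation() consults for sdcMicroObj input.
strata_slot <- get.sdcMicroObj(sdc, "strataVar")
is.null(strata_slot)
```

If `is.null()` reports that the slot is empty, no stratification is configured on the object.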
Thank you @bernhard-da
Yes, SDC_stV is set to NULL (SDC_stV = NULL;), but the parameter actually used, SDC_magg_stV, is not. It is set to kV_1 and kV_2 (SDC_magg_stV = c("kV_1","kV_2"), on the line before "###---------------------------------CREATE THE SDC OBJECT FROM A DATAFRAME"):
SDC_LS_MAGG <- microaggregation(SDC_LS,variables = SDC_nV,strata_variables = SDC_magg_stV , method='mdav',measure='median', aggr=7);
Thank you for your reply.
Hi @superjeje, thanks for spotting this. Indeed, the strata_variables argument is ignored when you pass an sdcMicroObj as input to microaggregation(). I just pushed a change that explicitly tells you that this argument is ignored. If you want to use stratification, just use strataVar(object) <- some_vars before the call and set it to NULL afterwards.
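A minimal sketch of that workaround, under stated assumptions: it uses sdcMicro's `testdata` instead of DF_SDC, picks `urbrur` as a hypothetical strata variable, and assumes the `strataVar<-` setter accepts a column name as in the `some_vars` placeholder above. Which values the setter accepts may depend on the installed sdcMicro version.

```r
library(sdcMicro)

# Stand-in data and variables (assumptions for illustration only).
data(testdata)
sdc <- createSdcObj(testdata,
                    keyVars = c("urbrur", "roof", "walls"),
                    numVars = c("expend", "income"))

# 1. Set the strata variable on the object itself, because strata_variables
#    is ignored for sdcMicroObj input.
strataVar(sdc) <- "urbrur"

# 2. Microaggregate; stratification is taken from the strataVar slot.
sdc_magg <- microaggregation(sdc,
                             variables = c("expend", "income"),
                             method = "mdav", aggr = 7)

# 3. Reset the slot so later steps are not stratified unintentionally.
strataVar(sdc_magg) <- NULL
```

In the script from this thread, the analogous calls would act on SDC_LS with SDC_magg_stV in place of "urbrur".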
Thank you
Here I come again. I am not sure I understand clearly, since strata_variables is given as a parameter in a microaggregation example in the SDC book by Matthias Templ (p. 124).
Thank you for your help.
Best regards
Yeah, this argument was just (silently) ignored if the input was an sdcMicroObj. If you feed a data.frame to microaggregation(), this argument is of course used. As for the example in the book: if the sdc object has the slot strataVar set, those variable(s) are used; if not, no stratification is applied, even if strata_variables is specified. In the next version of sdcMicro, we will explicitly give the user a message about this behaviour.
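To illustrate the data.frame route mentioned above, here is a hedged sketch that runs microaggregation() twice on a plain data.frame, once without and once with strata_variables, and compares the results. The `testdata` columns are stand-ins, and reading the aggregated values from the `mx` element of the returned object is an assumption about the return structure for data.frame input.

```r
library(sdcMicro)

# Stand-in data.frame (not DF_SDC): one strata column and two numeric columns.
data(testdata)
df <- testdata[, c("urbrur", "expend", "income")]
num_vars <- c("expend", "income")

# Without stratification.
res_plain <- microaggregation(df, variables = num_vars,
                              method = "mdav", aggr = 7)

# With stratification: for data.frame input the argument is used.
res_strata <- microaggregation(df, variables = num_vars,
                               strata_variables = "urbrur",
                               method = "mdav", aggr = 7)

# Assumption: aggregated values are stored in the 'mx' element.
# If stratification took effect, the two results will usually differ.
same_result <- isTRUE(all.equal(res_plain$mx, res_strata$mx,
                                check.attributes = FALSE))
same_result
```

The same comparison is a quick way to see whether a strata setting has any effect on a given dataset.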
Hello
When microaggregating with the MDAV method and strata_variables, I see no effect on the results, whether or not strata_variables is provided.
Is it a bug, or where am I going wrong?
The version is 5.3.0, executed on Windows 8 / RStudio version 1.1.442.
Here is the exact command line: SDC_LS_MAGG <- microaggregation(SDC_LS,variables = SDC_nV,strata_variables = SDC_magg_stV , method='mdav',measure='median', aggr=7);
Here is the full script:
---------------------------------INITIALIZE PARAMETERS
SDC_kV = c("kV_1","kV_2","kV_3","kV_4");
SDC_kV = gsub(" ","", SDC_kV);
SDC_sV = NULL;
SDC_nV = c("nV_1","nV_2");
SDC_nV = gsub(" ", "", SDC_nV);
SDC_gV = NULL;
SDC_stV = NULL;
SDC_pV = NULL;
SDC_wV = NULL;
SDC_eV = NULL;
SDC_magg_stV = c("kV_1","kV_2");
SDC_magg_stV = gsub(" ","", SDC_magg_stV);
---------------------------------CREATE THE SDC OBJECT FROM A DATAFRAME
SDC <- createSdcObj(DF_SDC,keyVars=SDC_kV,numVars = SDC_nV,excludeVars =SDC_eV,sensibleVar = SDC_sV,ghostVars = SDC_gV ,strataVar = SDC_stV,pramVars =SDC_pV ,weightVar = SDC_wV);
---------------------------------PRINT THE SDC OBJECT
SDC;
---------------------------------LOCAL SUPPRESSION K=7 AND IMPORTANCE SET
SDC_importance = c(1,2,3,4);
SDC_LS <- localSuppression(SDC, k = 7, importance = SDC_importance);
---------------------------------PRINT THE SDC_LS OBJECT
print(SDC_LS, "ls");
---------------------------------MICROAGGREGATION AGG=7 MEASURE=MEDIAN METHOD=MDAV
SDC_LS_MAGG <- microaggregation(SDC_LS,variables = SDC_nV,strata_variables = SDC_magg_stV , method='mdav',measure='median', aggr=7);
---------------------------------WRITE A CSV FILE - READABLE FOR TESTING PURPOSE
datafile_out = "SDC_LS_MAGG.csv"
path_datafile_out = paste(paste(getwd(),"SDC/microagg/work",sep='/'),datafile_out,sep='/')
writeSafeFile(obj=SDC_LS_MAGG, format="csv", randomizeRecords="no", sep="^", dec=".", col.names=TRUE, row.names=FALSE, quote = FALSE, fileOut=path_datafile_out);
Here is the description of the initial data file:
str(DF_SDC)
'data.frame': 330 obs. of 6 variables:
 $ kV_1: chr "C" "C" "C" "C" ...
 $ kV_2: chr "B" NA "F" NA ...
 $ kV_3: chr "E" "D" "B" "D" ...
 $ kV_4: chr "C" "H" "H" "G" ...
 $ nV_1: num 17 19 14 18 16 16 15 15 14 17 ...
 $ nV_2: num 0 34 0 33 0 35 0 0 39 0 ...
The input file (DF_SDC) is attached: DF_SDC.zip
The output file (SDC_LS_MAGG) is attached: SDC_LS_MAGG.zip