statistikat / simPop

Simulation of Synthetic Populations for Survey Data Considering Auxiliary Information

Handling a single value of level variable in a psu in simRelation and allied functions. #12

Closed manab-prakash closed 3 years ago

manab-prakash commented 3 years ago

This is more a suggestion than a bug report. I was trying to predict caste~age+sex for living standard survey data. Some of the PSUs contain only a single caste group, and as a result I got a rather abstruse error from the simRelation function:

Error in multinom(caste ~ age + sex, weights = weights, data = dataSampleWork, : need two or more classes to fit a multinom model

(In hindsight, the message is perfectly reasonable and explains the problem quite well.) Although the dataset as a whole had a lot of variation in caste, some PSUs were homogeneous. It would be helpful if there were a warning of some sort, for example by inserting something like the following in the simulateValues function after line number 37, dataSampleWork <- dataSample[indSampleHead, ]:

if (length(unique(dataSampleWork[[cur.var]])) == 1) {
  # only one level of cur.var in this stratum, so multinom() cannot be fitted
  warning("single level in ", x, " ", curStrata, " of sample")
}
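Until something like that exists in the package, the same check can be run on the sample before calling simRelation. The following is only a rough sketch in base R: single_level_strata and the toy data frame are hypothetical names made up for illustration and are not part of simPop.

```r
# Hypothetical helper (not part of simPop): return the strata in which
# `var` takes exactly one distinct value in the sample.
single_level_strata <- function(data, var, strata) {
  # split the variable by stratum, then count distinct values per stratum
  groups <- split(data[[var]], data[[strata]])
  n_distinct <- vapply(groups, function(v) length(unique(v)), integer(1))
  names(n_distinct)[n_distinct == 1]
}

# Toy sample (made-up data): PSU "B" contains a single caste group
toy <- data.frame(
  psu   = c("A", "A", "A", "B", "B", "B"),
  caste = c("x", "y", "x", "z", "z", "z")
)

homogeneous_psus <- single_level_strata(toy, "caste", "psu")
print(homogeneous_psus)
```

Any PSU returned here would trigger the multinom error above, so it can be inspected or handled separately beforehand.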

Also, @bernhard-da, would it be a good idea to fill in values from the sample in the population when a stratum has a homogeneous value? Would it be possible to implement this inside the package? In my case, that would mean using the single caste value observed in a PSU to fill all households in that PSU.
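To make the idea concrete, here is a minimal sketch of the fallback I have in mind, written in base R outside the package. The function fill_homogeneous and the toy data are hypothetical and only illustrate the logic; this is not how simPop handles it internally.

```r
# Hypothetical helper (not part of simPop): for every population row whose
# stratum is homogeneous in the sample, return the single observed value;
# all other rows get NA and would still need a fitted model.
fill_homogeneous <- function(sample_data, pop_data, var, strata) {
  groups <- split(as.character(sample_data[[var]]), sample_data[[strata]])
  n_distinct <- vapply(groups, function(v) length(unique(v)), integer(1))
  # named vector: stratum -> its single observed value
  constant <- vapply(groups[n_distinct == 1], function(v) v[1], character(1))
  # look up each population row's stratum; unmatched strata yield NA
  unname(constant[as.character(pop_data[[strata]])])
}

# Toy data (made up): PSU "B" is homogeneous in the sample, "C" is unsampled
toy_sample <- data.frame(
  psu   = c("A", "A", "A", "B", "B", "B"),
  caste = c("x", "y", "x", "z", "z", "z")
)
toy_pop <- data.frame(psu = c("A", "B", "B", "C"))

filled <- fill_homogeneous(toy_sample, toy_pop, "caste", "psu")
print(filled)
```

Only the households in PSU "B" are filled; PSU "A" has more than one caste in the sample and PSU "C" does not appear in it, so both stay NA.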