American-Institutes-for-Research / WeMix

WeMix public repository
GNU General Public License v2.0

NA rows are omitted based on all variables, not just variables needed for model (#2)

Closed BernhardClemm closed 2 years ago

BernhardClemm commented 2 years ago

As expected, the mix call omits observations with missing values. However, it seems to drop every observation that is missing on any variable in the supplied data set, not just on the variables used in the model.

For example, if these are the data:

dt <- structure(list(person_id = c(25L, 25L, 25L, 36L, 36L, 36L, 2L, 
2L, 2L, 27L, 27L, 27L, 47L, 47L, 47L, 15L, 15L, 15L, 11L, 11L, 
11L, 42L, 42L, 42L, 39L, 39L, 39L, 40L, 40L, 40L, 43L, 43L, 43L, 
21L, 21L, 21L, 16L, 16L, 16L, 5L, 5L, 5L, 46L, 46L, 46L, 13L, 
13L, 13L, 34L, 34L, 34L, 14L, 14L, 14L, 28L, 28L, 28L, 22L, 22L, 
22L, 35L, 35L, 35L, 19L, 19L, 19L, 41L, 41L, 41L, 24L, 24L, 24L, 
38L, 38L, 38L, 30L, 30L, 30L, 12L, 12L, 12L, 26L, 26L, 26L, 45L, 
45L, 45L, 33L, 33L, 33L, 10L, 10L, 10L, 49L, 49L, 49L, 23L, 23L, 
23L, 3L, 3L, 3L, 37L, 37L, 37L, 31L, 31L, 31L, 32L, 32L, 32L, 
29L, 29L, 29L, 17L, 17L, 17L, 8L, 8L, 8L, 18L, 18L, 18L, 4L, 
4L, 4L), ft_party_opp = c(95, 85, 98, 91, 98, 93, 89, 70, 56, 
73, 46, 40, 100, 100, 100, 99, 99, 99, 15, 55, 49, 70, 48, 85, 
79, 87, 55, 70, 91, 62, 91, 76, 95, 99, 99, 99, 92, 93, 77, 97, 
89, 94, 100, 100, 100, 73, 64, 62, 66, 76, NA, 83, 81, 41, 88, 
89, 85, 59, 70, 91, 64, 0, 22, 100, 100, 88, 99, 99, 99, 60, 
76, 53, 95, 95, 33, 86, 100, 99, 40, 75, 50, 60, 81, 81, 81, 
75, 60, 80, 49, 72, 96, 99, 52, 71, 49, 46, 89, 69, 80, 99, 98, 
96, 92, 58, 99, 95, 98, 100, 73, 44, 15, 85, 97, 80, 98, 97, 
95, 94, 81, 91, 51, 100, 34, NA, 65, 76), news_u_log = c(1.38629436111989, 
2.07944154167984, 1.79175946922805, 4.29045944114839, 3.40119738166216, 
7.09672137849476, 4.52178857704904, 5.12396397940326, 5.53733426701854, 
2.94443897916644, 5.14166355650266, 4.9416424226093, 5.19849703126583, 
5.65248918026865, 3.3322045101752, 0.693147180559945, 4.29045944114839, 
4.59511985013459, 2.70805020110221, 3.71357206670431, 3.09104245335832, 
0, 0, 3.09104245335832, NA, 4.78749174278205, 4.85981240436167, 
3.80666248977032, 4.72738781871234, 6.289715570909, 3.2188758248682, 
2.94443897916644, 2.484906649788, 5.66988092298052, 5.74939298590825, 
5.66642668811243, 2.83321334405622, 3.71357206670431, 2.56494935746154, 
4.53259949315326, 4.57471097850338, 4.77068462446567, 5.19295685089021, 
5.59842195899838, 5.2257466737132, 3.87120101090789, 4.43081679884331, 
2.77258872223978, 3.46573590279973, 1.09861228866811, 1.6094379124341, 
5.11799381241676, 5.36597601502185, 5.2040066870768, 4.96284463025991, 
7.60489448081162, 6.16331480403464, 6.04500531403601, 4.49980967033027, 
5.23110861685459, 2.99573227355399, 6.11589212548303, 4.27666611901606, 
NA, 2.63905732961526, 2.63905732961526, 7.00669522683704, 6.89871453432999, 
6.06842558824411, 1.38629436111989, 4.14313472639153, 4.52178857704904, 
5.17048399503815, 5.63835466933375, 4.98360662170834, 5.29330482472449, 
2.77258872223978, 5.25227342804663, 5.75890177387728, 7.32251043399739, 
8.59822003005861, 5.67675380226828, 5.04342511691925, 5.30826769740121, 
6.72503364216684, 7.36454701425564, 8.64979915596426, 2.56494935746154, 
2.07944154167984, 4.77912349311153, 8.12829017160705, 8.2358907259285, 
6.82328612235569, 2.63905732961526, 5.96100533962327, 1.79175946922805, 
3.61091791264422, 3.73766961828337, 3.8286413964891, 6.07764224334903, 
6.60934924316738, 6.54678541076052, 3.91202300542815, 3.89182029811063, 
3.66356164612965, 2.19722457733622, 3.09104245335832, 3.09104245335832, 
3.61091791264422, 2.77258872223978, 1.6094379124341, 3.66356164612965, 
2.07944154167984, 2.56494935746154, 3.17805383034795, 3.49650756146648, 
3.55534806148941, 1.6094379124341, 1.79175946922805, 1.09861228866811, 
3.36729582998647, 3.66356164612965, 3.25809653802148, 2.83321334405622, 
0, 1.6094379124341), u_visits_log = c(9.57948721741024, 10.3630619862913, 
9.21283725217477, 9.25320827220336, 9.14163317396639, 11.059361953371, 
9.60777330838708, 10.1300652005846, 10.1758783064189, 10.5054784140542, 
9.99410492011974, 10.1530393171266, 8.69399956752208, 9.89726825272221, 
8.10621290261996, 8.08147504013705, 8.00235954625271, 8.71177264560569, 
8.19478163844336, 10.0943558152985, 8.96852355539635, 8.00436556497957, 
7.1929342212158, 8.78293635634926, NA, 9.12826226094058, 9.24067557177226, 
9.70570712067963, 7.69893619981345, 8.10892415597534, 8.36474106822456, 
7.85205020726589, 8.21256839823415, 10.3358217647463, 10.0170840591187, 
10.5443145958964, 9.39499254410842, 9.76634966431749, 9.30455904721516, 
9.31388924760474, 9.78667279794429, 9.71359742624222, 10.8417354517623, 
10.7610468693465, 10.6929217084076, 9.46846489218564, 9.55577234996887, 
9.15122710748368, 9.46436224293533, 7.45645455517621, 9.28191657046541, 
10.7124825641179, 11.0173332218448, 10.7947271075434, 9.08658956454001, 
10.9463052028819, 10.0362686637699, 9.72454029638112, 8.9182485910357, 
10.1051214203625, 9.06912237006065, 9.04511189260841, 9.48637993264389, 
NA, 10.0558222988394, 8.38753998318937, 11.7695502129567, 11.6227748903803, 
11.5969540645657, 8.43446354381724, 9.57463620304628, 9.92568926051981, 
9.01152351265303, 8.22710823434815, 8.25035895147729, 9.30273722124215, 
8.29903718161307, 9.92054130710196, 9.46039845583127, 9.66839793020352, 
10.2900416357525, 10.7574772582129, 9.33899740418445, 10.5668969148321, 
9.29109052166129, 9.35513311687772, 10.4307277175702, 9.19542975924043, 
8.56483984488359, 9.93842023907615, 10.7684639371695, 10.6115479758284, 
8.85295088709958, 7.91095738284559, 8.53601494565683, 5.39362754635236, 
7.40062057737113, 7.50163445788341, 7.60090245954208, 10.724456104534, 
10.8393455899028, 10.6343158000638, 9.28256800597306, 8.36683530982767, 
6.60258789218934, 8.27868216297091, 8.68794811183873, 8.84548923675327, 
9.02701831484864, 9.04735074348172, 8.55140136274597, 8.90463009700501, 
8.2398574110186, 6.21060007702465, 8.06400734709666, 8.21148291644507, 
8.60940767540405, 8.30251371851416, 8.74193546409414, 8.10107150311954, 
7.83676478326407, 9.29587566008245, 9.33034316437088, 9.17730051789798, 
6.3578422665081, 7.96970358327866), weight = c(0.437515288638368, 
0.437515288638368, 0.437515288638368, 1.88537623578068, 1.88537623578068, 
1.88537623578068, 0.534343902248553, 0.534343902248553, 0.534343902248553, 
0.437515288638368, 0.437515288638368, 0.437515288638368, 4.28096628920089, 
4.28096628920089, 4.28096628920089, 1.67728197359262, 1.67728197359262, 
1.67728197359262, 0.443869769696834, 0.443869769696834, 0.443869769696834, 
0.437515288638368, 0.437515288638368, 0.437515288638368, 0.43055025116271, 
0.43055025116271, 0.43055025116271, 0.579762810877004, 0.579762810877004, 
0.579762810877004, 0.437515288638368, 0.437515288638368, 0.437515288638368, 
0.533307285159432, 0.533307285159432, 0.533307285159432, 0.606609491055402, 
0.606609491055402, 0.606609491055402, 1.67728197359262, 1.67728197359262, 
1.67728197359262, 0.436803571873503, 0.436803571873503, 0.436803571873503, 
0.437515288638368, 0.437515288638368, 0.437515288638368, 0.742068296743283, 
0.742068296743283, 0.742068296743283, 0.436803571873503, 0.436803571873503, 
0.436803571873503, 0.402457843831888, 0.402457843831888, 0.402457843831888, 
2.36315078579042, 2.36315078579042, 2.36315078579042, 1.34429397469877, 
1.34429397469877, 1.34429397469877, 0.581431050826477, 0.581431050826477, 
0.581431050826477, 0.396050904377531, 0.396050904377531, 0.396050904377531, 
0.57310722962594, 0.57310722962594, 0.57310722962594, 0.402457843831888, 
0.402457843831888, 0.402457843831888, 0.689923879462086, 0.689923879462086, 
0.689923879462086, 0.396050904377531, 0.396050904377531, 0.396050904377531, 
0.606609491055402, 0.606609491055402, 0.606609491055402, 0.43055025116271, 
0.43055025116271, 0.43055025116271, 0.579762810877004, 0.579762810877004, 
0.579762810877004, 0.563983632267003, 0.563983632267003, 0.563983632267003, 
0.396050904377531, 0.396050904377531, 0.396050904377531, 0.57310722962594, 
0.57310722962594, 0.57310722962594, 0.967620261131432, 0.967620261131432, 
0.967620261131432, 1.06998665548284, 1.06998665548284, 1.06998665548284, 
0.699944348089522, 0.699944348089522, 0.699944348089522, 0.436803571873503, 
0.436803571873503, 0.436803571873503, 0.518792468498778, 0.518792468498778, 
0.518792468498778, 0.43055025116271, 0.43055025116271, 0.43055025116271, 
0.742068296743283, 0.742068296743283, 0.742068296743283, 0.742068296743283, 
0.742068296743283, 0.742068296743283, 1.76090896855322, 1.76090896855322, 
1.76090896855322), wave_weight = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), ft_govopp = c(NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_
)), class = "data.frame", row.names = c(NA, -126L))

The function will fail:

summary(mix(ft_party_opp ~ news_u_log + u_visits_log + (1 | person_id), dt, weights=c("wave_weight", "weight")))

Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'object' in selecting a method for function 'summary': 0 (non-NA) cases
In addition: Warning message:
In mix(formula1_visits, dt3, weights = c("wave_weight", "weight")) :
  There were 126 rows with missing data. These have been removed.
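
A quick way to check this diagnosis is to count missing values per column and compare the number of complete cases across all columns with the number across only the model's columns. The sketch below uses a small made-up data frame (`toy`, with an all-NA column named `unused`) rather than the data above, and relies only on base R:

```r
# Hypothetical toy data: `unused` plays the role of an all-NA column
# that the model never references.
toy <- data.frame(
  id     = rep(1:3, each = 2),
  y      = c(1.0, 2.0, NA, 4.0, 5.0, 6.0),
  x      = c(0.5, NA, 1.5, 2.0, 2.5, 3.0),
  unused = NA_integer_
)

# Number of missing values in each column
na_per_column <- colSums(is.na(toy))

# Rows that survive listwise deletion when every column is checked
n_complete_all <- sum(complete.cases(toy))

# Rows that survive when only the model's columns are checked
n_complete_model <- sum(complete.cases(toy[, c("id", "y", "x")]))

print(na_per_column)
print(c(all_columns = n_complete_all, model_columns = n_complete_model))
```

If the first count is zero while the second is not, a column outside the model is responsible for the dropped rows.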

However, after dropping the one column that is NA for all observations, it works:

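library(dplyr)  # provides %>% and select()
# Drop the all-NA column, then fit the model on the reduced data set
dt2 <- dt %>% select(-ft_govopp)
summary(mix(ft_party_opp ~ news_u_log + u_visits_log + (1 | person_id), dt2, weights=c("wave_weight", "weight")))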

I suspect this is caused by line 124 in the file adaptiveQuad.R. I don't know whether it is intended, but other similar packages, e.g. lme4, check only the relevant variables for NA.
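
Until the package checks only the relevant variables, one possible workaround is to trim the data to the columns the model needs before fitting. The helper below is a hypothetical sketch (`keep_model_columns` is not part of WeMix); it uses base R's `all.vars()` to collect the variable names in the formula and adds the weight columns. The toy data and column names are made up for illustration:

```r
# Hypothetical helper (not part of WeMix): keep only the columns a model
# needs, so that unrelated columns cannot trigger NA-based row removal.
keep_model_columns <- function(formula, data, weights = character(0)) {
  # all.vars() returns every variable name in the formula, including
  # grouping variables inside random-effect terms such as (1 | g)
  needed <- unique(c(all.vars(formula), weights))
  missing_cols <- setdiff(needed, names(data))
  if (length(missing_cols) > 0) {
    stop("Columns not found in data: ", paste(missing_cols, collapse = ", "))
  }
  data[, needed, drop = FALSE]
}

# Made-up data with one all-NA column that the model does not use
toy <- data.frame(
  g      = rep(1:3, each = 2),
  y      = c(1, 2, NA, 4, 5, 6),
  x      = c(0.5, 1, 1.5, 2, 2.5, 3),
  w1     = 1,
  w2     = 2,
  unused = NA_real_
)

f <- y ~ x + (1 | g)
trimmed <- keep_model_columns(f, toy, weights = c("w1", "w2"))

print(names(trimmed))
print(nrow(na.omit(trimmed)))
```

The trimmed data frame could then be passed to mix() in place of the full data, which should have the same effect as removing the offending column by hand.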

pdbailey0 commented 2 years ago

@BernhardClemm thanks for the excellent, reproducible example!

I updated the version on GitHub to call model.frame with just the relevant variables. You can install that version using the instructions on the main page, and your example should then run.
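
The actual change in WeMix is not shown in this thread. As a rough illustration of the idea only, base R's `model.frame()` evaluates just the variables named in the formula, so its NA handling ignores unrelated columns. The data frame and the fixed-effects-only formula below are made up for the example:

```r
# Illustration only: this is not the WeMix source code.
toy <- data.frame(
  y      = c(1, 2, NA, 4),
  x      = c(0.5, 1, 1.5, NA),
  unused = NA_real_
)

# model.frame() builds a data frame from the formula's variables only,
# so na.omit is applied to y and x and never sees `unused`.
mf <- model.frame(y ~ x, data = toy, na.action = na.omit)

print(names(mf))
print(nrow(mf))            # rows kept when only y and x are checked
print(nrow(na.omit(toy)))  # rows kept when every column is checked
```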

Also, please confirm that I spelled your name correctly in the NEWS.