American-Institutes-for-Research / WeMix

WeMix public repository
GNU General Public License v2.0
10 stars 1 forks source link

WeMix error #8

Closed Emiliefranck closed 11 months ago

Emiliefranck commented 1 year ago

Dear all,

I am trying to run a multilevel model with students nested into schools. I am using PISA data. The WeMix package is especially interesting since it allows for several weight-level variables.

Everything runs smoothly, however when running the final multilevel model I get an error. Please find below the code and the error.

Many thanks in advance!

Kind regards,

Emilie Franck

CODE:

install.packages("lme4")
install.packages("misty")
install.packages("ggplot2")
install.packages("tidyverse")
install.packages("foreign")
install.packages("mice")
install.packages("Hmisc")
install.packages("dplyr")
install.packages("haven")
# Load misty, lme4, nlme, and ggplot2 package
library(misty)
library(lme4)
library(nlme)
library(ggplot2)
library(tidyverse)
library(foreign)
library(mice)
library(Hmisc)
library(dplyr)
library(haven)

#get names
names(edataa)

# Cluster mean centering, center() from the misty package
edataa$SCH.ESCS.c <- center(edataa$SCH.ESCS, type = "CGM",
                            cluster = edataa$CNTSCHID)

edataa$SCH.IMMIG.c <- center(edataa$SCH.IMMIG, type = "CGM",
                             cluster = edataa$CNTSCHID)

edataa$SCH.LANG.c <- center(edataa$SCH.lang, type = "CGM",
                            cluster = edataa$CNTSCHID)

edataa$ESCS.c <- center(edataa$ESCS, type = "CWC",
                        cluster = edataa$CNTSCHID)

install.packages("WeMix")
library(WeMix)

#prepare weights
edataa$sqw <- edataa$W.FSTUWT^2
sumsqw <- aggregate(sqw ~ CNTSCHID, data = edataa, sum)
sumw <- aggregate(W.FSTUWT ~ CNTSCHID, data = edataa, sum)
edataa$sumsqw <- sapply(edataa$CNTSCHID, function(s) sumsqw$sqw[sumsqw$CNTSCHID == s])
edataa$sumw <- sapply(edataa$CNTSCHID, function(s) sumw$W.FSTUWT[sumw$CNTSCHID == s])
edataa$pwt1 <- edataa$W.FSTUWT * (edataa$sumw/edataa$sumsqw)
edataa$pwt2 <- edataa$W.SCHGRNRABWT

# Estimate multilevel model using the wemix package
mod1a <- mix(PV1MATH ~ ESCS.c + immig.D + language + SCH.LANG.c+ SCH.IMMIG.c+ SCH.ESCS.c+(1| CNTSCHID), data = edataa,
             weights=c("pwt1","pwt2"))

Error: not-yet-implemented method for *(<labelled>, <dgCMatrix>).
 ->>  Ask the package authors to implement the missing feature.

[edited by PB to put code in code block]

pdbailey0 commented 1 year ago

Thank you @Emiliefranck can you share where edataa comes from?

pdbailey0 commented 1 year ago

Have you looked at the vignette (in WeMix) "Introduction to Weighted Mixed-Effects Models With WeMix"? It uses PISA as an example and has the R code for doing so in it. You are using a different weighting scheme (and probably year) but it could be a good place to start.

Emiliefranck commented 1 year ago

Dear Paul,

Many thanks for your reply.

Yes i tried it first from there, afterwards i tried changing mt weights since i thought this might be the problem.

My original syntax was:


#get names
names(edataa)

# Cluster mean centering, center() from the misty package
edataa$SCH.ESCS.c <- center(edataa$SCH.ESCS, type = "CGM",
                            cluster = edataa$CNTSCHID)

edataa$SCH.IMMIG.c <- center(edataa$SCH.IMMIG, type = "CGM",
                             cluster = edataa$CNTSCHID)

edataa$SCH.LANG.c <- center(edataa$SCH.lang, type = "CGM",
                            cluster = edataa$CNTSCHID)

edataa$ESCS.c <- center(edataa$ESCS, type = "CWC",
                        cluster = edataa$CNTSCHID)

install.packages("WeMix")
library(WeMix)

# Estimate multilevel model using the WeMix package
mod1a <- mix(PV1MATH ~ ESCS.c + immig.D + language + SCH.LANG.c+ SCH.IMMIG.c+ SCH.ESCS.c+(1| CNTSCHID), data = edataa,
             weights=c("W.FSTUWT","W.SCHGRNRABWT"))

In this syntax I did not rescale my weight variables.

I will send you the edataa!

Many thanks in advance,

Emilie Franck

[edited by PB to put code in code blocks.]

pdbailey0 commented 1 year ago

Let's keep it simple and keep all the information we need here. What year and country is this? How did you prep these columns?

Can you simplify the model some to see if the error is still there? You could try dropping covariates, for example. Here's a helpful guide.

Emiliefranck commented 1 year ago

Dear Paul,

Many thanks for this suggestion! I have tried a simple model with solely the dependent PVmath variable and the ESCS variable created by PISA (without centering them or any other data manipulation). I still received the same error. This specific country was Albania, however, I'm running it for 80 countries and I get the same error for all countries.

The date is from 2018!

Kind regards,

Emilie Franck

pdbailey0 commented 1 year ago

@Emiliefranck I'm using EdSurvey to download and read the data, and I cannot reproduce this issue.

require(EdSurvey)
require(WeMix)
downloadPISA("~/", years=2018)
pisaalb <- readPISA("~/PISA/2018/", countries="alb")
# Found cached data for country code “alb”
alb18 <- EdSurvey::getData(data=pisaalb, varnames=c("math", "escs", "cntschid", "w_fstuwt", "w_schgrnrabwt"), omittedLevels=FALSE)

m1 <- mix(pv1math ~ escs + (1|cntschid), data=alb18, weights=c("w_fstuwt", "w_schgrnrabwt"))
# Warning message:
# In mix(pv1math ~ escs + (1 | cntschid), data = alb18, weights = c("w_fstuwt",  :
#   There were 82 rows with missing data. These have been removed.
summary(m1)
# Call:
# mix(formula = pv1math ~ escs + (1 | cntschid), data = alb18, 
#     weights = c("w_fstuwt", "w_schgrnrabwt"))

# Variance terms:
#  Level    Group        Name Variance Std. Error Std.Dev.
#      2 cntschid (Intercept)    985.4      108.0    31.39
#      1 Residual               5440.4      125.7    73.76
# Groups:
#  Level    Group n size mean wgt sum wgt
#      2 cntschid    327    4.922    1609
#      1      Obs   6277    4.391   27564

# Fixed Effects:
#             Estimate Std. Error t value
# (Intercept)  439.180      3.024 145.208
# escs           5.931      1.405   4.221

# lnl= -158496.78 
# Intraclass Correlation= 0.1534 

Can you help me out with what you did?

[updated by PB to add require calls]

pdbailey0 commented 1 year ago

Another note is that the error looks like it is in the Matrix package. You could update that by closing all R sessions and installing Matrix again to see if that fixes it.

pdbailey0 commented 1 year ago

@Emiliefranck any more info on this error? Ideally, you would share code so we can reproduce it.

pdbailey0 commented 1 year ago

@Emiliefranck can you help me reproduce this, starting with downloaded data?

Emiliefranck commented 1 year ago

Dear Paul,

My apologies for my absence.

I have tried again by remaking the data. It worked once, although I get the same error again. I am assuming the mistake is somewhere in the changing of files from spss to R. I am going to look into this and keep you updated.

Kind regards,

Emilie Franck

pdbailey0 commented 1 year ago

@Emiliefranck another user reported problems with haven did you use that for the import?

pdbailey0 commented 1 year ago

@Emiliefranck I maybe fixed it with the latest WeMix. Can you try installing it and see if the problem is gone?

install.packages("devtools")
devtools::install_github("American-Institutes-for-Research/WeMix")

I'm planning on releasing a new version and would like to make sure I have this fixed.

Emiliefranck commented 1 year ago

Dear Paul,

I have tried this but now I get another error:

mod1a <- mix(PV1MATH ~ ESCS+ immig.D + language + SCH.lang+ SCH.IMMIG+ SCH.ESCS + (1| CNTSCHID), data = edataa, weights=c("W.FSTUWT","W.SCHGRNRABWT")) Error in Diagonal(x = (W0)) : 'x' has unsupported class "labelled"

Kind regards,

Emilie Franck

pdbailey0 commented 1 year ago

@Emiliefranck that is from the Matrix package Diagonal constructor here.

I tried to make WeMix robust to this. You can again try

devtools::install_github("American-Institutes-for-Research/WeMix")

to see if it works now.

pdbailey0 commented 11 months ago

@Emiliefranck if you have any follow ups, please reopen this issue or just reply here. Otherwise, it seems this has gone cold. I also believe we solved the problem you identified related to data frames that are incompatible with the Matrix package.