rvlenth / emmeans

Estimated marginal means
https://rvlenth.github.io/emmeans/
340 stars 30 forks source link

How to qdrg() + multgee + multiple imputation: Error in X[ii, ii, drop = FALSE] %*% y[ii] : non-conformable arguments #498

Closed Generalized closed 1 day ago

Generalized commented 3 days ago

Dear @rvlenth

I'm trying to git GEE-fit ordinal regression from the multgee package and follow it with emmeans under multiple imputation. Neither multgee is supported, nor itself it "talks" with mice... So a number of workarounds are necessary, but still I cannot make it work.

I use the data provided here. It's R RDS file, as the imputed dataset is too big to paste it in a textual form and it takes too much time to program it from scratch.

imp_data_mids.zip or here - with an easier structure converted to just list, will be required for qdrg(): imp_data_list.zip

Description: It's a longitudinal study with 2 arms (HD, SoC), 6 visits (week3...month 20), some ordinal data ODIPain and a few covariates. Analyses: 1) between-arm comparisons at each timepoint - this works 2) within-arm trend over time - this fails

Between-arm comparisons at each timepoint

/ This one works, but I wanted to make some context /

I started with the simplest approach using function with(), but the ordLORgee doesn't accept this way: (I asked the author to provide integration with MICE and emmeans: https://github.com/AnestisTouloumis/multgee/issues/7 and https://github.com/AnestisTouloumis/multgee/issues/8 )

> ord_gee_imp <- with(imp_data_longit,
+   ordLORgee(formula  = ODIPain ~ Visit_nOrd * Arm + Age_centered + Age_centered : Visit_nOrd,
+             id       = PatientId, 
+             repeated = as.numeric(Visit_nOrd), 
+             LORstr   = "category.exch",
+             link = "logit")
+ )
Error in eval(predvars, data, env) : object 'ODIPain' not found

So I went this way:

imp_data_longit_list <- complete(imp_data_longit, "all")

  ord_gee_imp <- lapply(imp_data_longit_list, 
                        function(dat) {
                          ordLORgee(formula  = ODIPain ~ Visit_nOrd * Arm + Age_centered + Age_centered : Visit_nOrd,
                                    data     = dat, 
                                    id       = PatientId, 
                                    repeated = as.numeric(Visit_nOrd), 
                                    LORstr   = "category.exch",
                                    link = "logit")
                        })

# This function ignores degrees of freedom, here I work under infinite DF
pooled_params <- pool_estimates_for_qdrg(k=20, 
                        coefficients = lapply(ord_gee_imp, function(analysis) analysis$coefficients),
                        varcov       = lapply(ord_gee_imp, function(analysis) analysis$robust.variance))

ord_gee_imp_grid <- with(pooled_params ,
                         qdrg(formula = ODIPain ~ Visit_nOrd * Arm + Age_centered + Age_centered : Visit_nOrd,
                              data=imp_data_longit_list[[1]], # fake data in any case 
                              coef = pooled_coef * c(rep(1, 5), rep(-1, 8)), # sign reversed
                              vcov = pooled_VC, 
                              df = Inf, 
                              ordinal = list(dim=6)))

(ord_gee_imp_em <- emmeans(ord_gee_imp_grid, specs = ~Arm * Visit_nOrd))

#and finally:
 Arm Visit_nOrd emmean    SE  df asymp.LCL asymp.UCL
 HD  Month 6    -0.992 0.247 Inf     -1.48   -0.5090
 SoC Month 6    -1.055 0.522 Inf     -2.08   -0.0319
 HD  Month 12   -1.149 0.399 Inf     -1.93   -0.3679
 SoC Month 12   -0.704 0.536 Inf     -1.76    0.3473
 HD  Month 20   -1.402 0.430 Inf     -2.25   -0.5588
 SoC Month 20   -0.565 0.534 Inf     -1.61    0.4809

Confidence level used: 0.95 

> update(contrast(ord_gee_imp_em ,
+                 list(                        #M6     M12    M20
+                   "Month 6  : SoC vs. HD" = c(-1,1,   0,0,   0,0),
+                   "Month 12 : SoC vs. HD" = c(0,0,  -1,1,   0,0),
+                   "Month 20 : SoC vs. HD" = c(0,0,   0,0,  -1,1)
+                 )),
+        adjust="mvt", level = 0.95, infer = c(TRUE, TRUE))
 contrast              estimate    SE  df asymp.LCL asymp.UCL z.ratio p.value
 Month 6  : SoC vs. HD  -0.0629 0.332 Inf   -0.8440     0.718  -0.190  0.9954
 Month 12 : SoC vs. HD   0.4449 0.377 Inf   -0.4414     1.331   1.182  0.5062
 Month 20 : SoC vs. HD   0.8368 0.356 Inf   -0.0015     1.675   2.350  0.0507

Confidence level used: 0.95 
Conf-level adjustment: mvt method for 3 estimates 
P value adjustment: mvt method for 3 tests 

OK. So this part of analysis works. Now I wanted to perform the within-arm analysis of trends:


Within-arm trends

I refit the model now with Visit being an ordered factor:

ord_gee_trend_imp <- lapply(imp_data_longit_list, 
                            function(dat) {

                                ordLORgee(formula  = ODIPain ~ Visit * Arm + Age_centered + Age_centered : Visit,
                                          data     = dat, 
                                          id       = PatientId, 
                                          repeated = as.numeric(Visit), 
                                          LORstr   = "category.exch",
                                          link = "logit")
                            })

(pooled_coefficients <- pool_estimates_for_qdrg(k=20,
                                                coefficients = lapply(ord_gee_trend_imp, function(analysis) analysis$coefficients),
                                                varcov       = lapply(ord_gee_trend_imp, function(analysis) vcov(analysis, type = "robust"))))

> pooled_coefficients$pooled_coef
              beta10               beta20               beta30               beta40               beta50 
         -1.14894182          -0.15798862           1.23883563           2.24774986           3.72561468 
             Visit.L              Visit.Q               ArmSoC         Age_centered       Visit.L:ArmSoC 
          0.28984189           0.03919969          -0.40627726           0.03240021          -0.63618847 
      Visit.Q:ArmSoC Visit.L:Age_centered Visit.Q:Age_centered 
          0.04732131           0.01766259          -0.01329275 
> 
> head(pooled_coefficients$pooled_VC)
             beta10      beta20      beta30      beta40      beta50     Visit.L       Visit.Q       ArmSoC Age_centered
beta10  0.047549419 0.038277906 0.030243589 0.028676557 0.026499953 0.003128325 -0.0040737307 -0.036122643 0.0002089025
beta20  0.038277906 0.046871768 0.037963592 0.036005979 0.032474648 0.002684273 -0.0065353294 -0.040875379 0.0001037114
beta30  0.030243589 0.037963592 0.050362772 0.048894056 0.047319362 0.004115861 -0.0053274347 -0.040906619 0.0004579626
beta40  0.028676557 0.036005979 0.048894056 0.070326648 0.071557656 0.002618757 -0.0038463331 -0.042795163 0.0006407785
beta50  0.026499953 0.032474648 0.047319362 0.071557656 0.196631280 0.002941239 -0.0054336792 -0.036282135 0.0002631419
Visit.L 0.003128325 0.002684273 0.004115861 0.002618757 0.002941239 0.033275488  0.0000139466 -0.003191944 0.0001063440
        Visit.L:ArmSoC Visit.Q:ArmSoC Visit.L:Age_centered Visit.Q:Age_centered
beta10    -0.001256201    0.004730693         1.988158e-05         1.119622e-04
beta20    -0.002172895    0.006734410         1.330267e-04        -5.919053e-05
beta30    -0.005490518    0.005421506         1.183854e-04        -1.047165e-04
beta40    -0.011313042    0.002700929         1.469726e-04        -2.351656e-04
beta50    -0.006681569    0.006889773        -1.905667e-04        -8.356725e-05
Visit.L   -0.033722643    0.001071289         8.616870e-04         9.072889e-05

So far so good. Please note that the "L", "Q" and "C" components are present in the coefficients.

(mmrm_gee_imp_trend_grid <- with(pooled_coefficients,
                          qdrg(formula = ODIPain ~ Visit * Arm + Age_centered + Age_centered : Visit,
                               data=imp_data_longit_list[[1]], # fake data in any case
                               coef = pooled_coef,
                               vcov = pooled_VC)))

'emmGrid' object with variables:
    Visit = Month 6, Month 12, Month 20
    Arm = HD, SoC
    Age_centered = -2.4336e-15

(m_longit_em_trend <- emmeans(mmrm_gee_imp_trend_grid, specs = ~ Visit | Arm, adjust="none"))

Error in X[ii, ii, drop = FALSE] %*% y[ii] : non-conformable arguments

So it failed.

But when I did this analysis using GEE for numerical data (treating ordinal response as numeric) there was no problem and the ordered Visit factor was properly recognized:

> m_longit_trend <- with(imp_data_longit, 
                        glmtoolbox::glmgee(ODIPain ~ Visit * Arm + Age_centered + Age_centered : Visit,
                                           family = gaussian(link = "identity"),
                                           id = PatientId, 
                                           corstr = "unstructured"))

(m_longit_em_trend <- emmeans(as.mira(m_longit_trend), regrid="response", 
                             specs = ~ Visit | Arm, adjust="none", df=Inf))
Arm = HD:
 Visit    emmean    SE  df asymp.LCL asymp.UCL
 Month 6    1.79 0.169 Inf      1.46      2.12
 Month 12   1.68 0.183 Inf      1.32      2.04
 Month 20   1.49 0.173 Inf      1.15      1.83

Arm = SoC:
 Visit    emmean    SE  df asymp.LCL asymp.UCL
 Month 6    1.74 0.177 Inf      1.40      2.09
 Month 12   2.00 0.202 Inf      1.60      2.39
 Month 20   2.08 0.191 Inf      1.71      2.46

Covariance estimate used: robust 
Degrees-of-freedom method: user-specified 
Confidence level used: 0.95 

> update(contrast(m_longit_em_trend, "poly", max.degree=2, by = "Arm"), adjust="none", level = 0.95, infer = c(TRUE, TRUE))
Arm = HD:
 contrast  estimate    SE  df asymp.LCL asymp.UCL z.ratio p.value
 linear     -0.2978 0.187 Inf   -0.6643    0.0688  -1.592  0.1114
 quadratic  -0.0762 0.282 Inf   -0.6298    0.4775  -0.270  0.7874

Arm = SoC:
 contrast  estimate    SE  df asymp.LCL asymp.UCL z.ratio p.value
 linear      0.3386 0.205 Inf   -0.0627    0.7399   1.654  0.0981
 quadratic  -0.1724 0.341 Inf   -0.8409    0.4960  -0.506  0.6131

Degrees-of-freedom method: user-specified 
Confidence level used: 0.95 

Let's try with GEE for numerical data via qdrg()

m_longit_trend <- lapply(imp_data_longit_list, 
                      function(dat) {
                        glmtoolbox::glmgee(ODIPain ~ Visit * Arm + Age_centered + Age_centered : Visit,
                                           family = gaussian(link = "identity"),
                                           id = PatientId, 
                                           data=dat,
                                           corstr = "unstructured")})

(pooled_coef <- pool_estimates_for_qdrg(k=20,
                                        coefficients = lapply(m_longit_trend, function(analysis) analysis$coefficients),
                                        varcov       = lapply(m_longit_trend, function(analysis) vcov(analysis, type = "robust"))))

ord_gee_imp_grid <- with(pooled_coefficients ,
                         qdrg(formula = ODIPain ~ Visit * Arm + Age_centered + Age_centered : Visit,,
                              data=imp_data_longit_list[[1]], # fake data in any case 
                              coef = pooled_coef,
                              vcov = pooled_VC, 
                              df = Inf))

m_longit_em_trend <- emmeans(ord_gee_imp_grid, 
                             specs = ~ Visit | Arm, 
                             adjust="none",
                             df=Inf,
                             coef = pooled_coef$pooled_coef,
                             vcov = pooled_coef$pooled_VC) 

update(contrast(m_longit_em_trend, "poly", max.degree=2, by = "Arm"), adjust="none", level = 0.95, infer = c(TRUE, TRUE))
Arm = HD:
 contrast  estimate SE  df asymp.LCL asymp.UCL z.ratio p.value
 linear           0  0 Inf         0         0     NaN     NaN
 quadratic        0  0 Inf         0         0     NaN     NaN

Arm = SoC:
 contrast  estimate SE  df asymp.LCL asymp.UCL z.ratio p.value
 linear           0  0 Inf         0         0     NaN     NaN
 quadratic        0  0 Inf         0         0     NaN     NaN

Degrees-of-freedom method: user-specified 
Confidence level used: 0.95 

It fails.

PS: the pooling function - adopted from your code


pool_estimates_for_qdrg <- function(k, coefficients, varcov) {

  V <- 1/k * varcov[[1]]
  allb <- cbind(coefficients[[1]], matrix(0, nrow = length(coefficients[[1]]), ncol = k - 1))

  for (i in 1 + seq_len(k - 1)) {
    V <- V + 1/k * varcov[[i]]
    allb[, i] <- coefficients[[i]]
  }

  pooled_coef <- apply(allb, 1, mean)
  notna <- which(!is.na(pooled_coef))
  pooled_VC <- V + (k + 1)/k * cov(t(allb[notna, , drop = FALSE])) 

  return(list(pooled_coef = pooled_coef, pooled_VC = pooled_VC))
}
Generalized commented 3 days ago

It seems that qrdg() struggles wih ordinal factors: In the example it did not take the trends, but rather levels:

In the model: obraz

> mmrm_gee_imp_trend_grid@grid
     Visit Arm  Age_centered .wgt.
1  Month 6  HD -2.433632e-15    56
2 Month 12  HD -2.433632e-15    56
3 Month 20  HD -2.433632e-15    56
4  Month 6 SoC -2.433632e-15    55
5 Month 12 SoC -2.433632e-15    55
6 Month 20 SoC -2.433632e-15    55

> mmrm_gee_imp_trend_grid@bhat
               (Intercept)              VisitMonth 12              VisitMonth 20                     ArmSoC 
                        NA                         NA                         NA                -0.40627726 
              Age_centered       VisitMonth 12:ArmSoC       VisitMonth 20:ArmSoC VisitMonth 12:Age_centered 
                0.03240021                         NA                         NA                         NA 
VisitMonth 20:Age_centered 
                        NA 

So I guess I should tell somehow qdrg() that Visit is ordinal. But how?

rvlenth commented 3 days ago

I have no clue, as I don't know this package nor what code you used. You have a bunch of issues open and I'm getting them mixed up as I am traveling. This one does need a reproducible example.

Generalized commented 3 days ago

I understand. Yes, doing it via email will be difficult, I will wait and we can continue when you return and have the access to github. I provided a fully reproducible example with lots of details and attached data in the post (ZIP files). The 2 other topics don't need any examples, they are not about any code.

jpiaskowski commented 2 days ago

Please provide a reproducible example that does not require us to download external data. I am sorry this will take time, but so does package maintenance and answering issues. Here are some guidelines.

Generalized commented 2 days ago

The raw data:

d <- structure(list(ID = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 
3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 
9L, 9L, 9L, 10L, 10L, 10L, 11L, 11L, 11L, 12L, 12L, 12L, 13L, 
13L, 13L, 14L, 14L, 14L), levels = c("1", "2", "3", "4", "5", 
"6", "7", "8", "9", "10", "11", "12", "13", "14"), class = "factor"), 
    Arm = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L), levels = c("A", "B"), class = "factor"), Visit = structure(c(1L, 
    2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
    2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
    2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), levels = c("V1", 
    "V2", "V3"), class = c("ordered", "factor")), Visit_non_ord = structure(c(1L, 
    2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
    2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
    2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), levels = c("V1", 
    "V2", "V3"), class = "factor"), Painscore = c(1L, 2L, 4L, 
    5L, 6L, NA, 4L, 2L, 3L, 4L, 3L, 4L, 5L, NA, 5L, 6L, 7L, 6L, 
    4L, 3L, 5L, 1L, NA, 4L, 5L, 4L, 5L, 4L, 6L, NA, 0L, 4L, 3L, 
    4L, NA, 4L, 3L, 4L, NA, 6L, 7L, 8L)), row.names = c(NA, -42L
), class = "data.frame")

Imputation:

library(mice)
library(glmtoolbox)

imp <- mice(d, m=5)
imp$predictorMatrix[1:4, 1:5] <- 0
imp$predictorMatrix[5, 1] <- 0
imp <- mice(d, m=5, predictorMatrix = imp$predictorMatrix)
imp_list <- complete(imp, "all")

Analysis.

Fitting GEE

(trend_fit <- lapply(imp_list, 
                     function(dat) {
                       glmgee(Painscore ~ Visit * Arm,
                              family = gaussian(link = "identity"),
                              id = ID, 
                              data=dat,
                              corstr = "unstructured")}))

$`1`

        Variance function:  gaussian
                     Link:  identity
    Correlation structure:  Unstructured 

$`2`

        Variance function:  gaussian
                     Link:  identity
    Correlation structure:  Unstructured 

$`3`

        Variance function:  gaussian
                     Link:  identity
    Correlation structure:  Unstructured 

$`4`

        Variance function:  gaussian
                     Link:  identity
    Correlation structure:  Unstructured 

$`5`

        Variance function:  gaussian
                     Link:  identity
    Correlation structure:  Unstructured 

Warning messages:
1: Estimate of correlation matrix is not positive definite 
2: Estimate of correlation matrix is not positive definite 
3: Estimate of correlation matrix is not positive definite 
4: Estimate of correlation matrix is not positive definite 
5: Estimate of correlation matrix is not positive definite 

Pooling:

(pooled_coef <- pool_estimates_for_qdrg(k=5,
                                        coefficients = lapply(trend_fit, 
                                                              function(analysis) analysis$coefficients),
                                        varcov       = lapply(trend_fit, 
                                                              function(analysis) vcov(analysis, type = "bias-corrected"))))

$pooled_coef
 (Intercept)      Visit.L      Visit.Q         ArmB Visit.L:ArmB Visit.Q:ArmB 
  4.15238095   0.24243661   0.39658405   0.07619048   0.82832509  -0.85148929 

$pooled_VC
             (Intercept)       Visit.L     Visit.Q         ArmB Visit.L:ArmB  Visit.Q:ArmB
(Intercept)   0.31654321 -0.0763653945 -0.14233455 -0.292597632  0.121677951  0.1621637503
Visit.L      -0.07636539  0.1855328798  0.03003269  0.102917976 -0.131043084 -0.0006938677
Visit.Q      -0.14233455  0.0300326875  0.17368103  0.129670519 -0.046646236 -0.1714361300
ArmB         -0.29259763  0.1029179757  0.12967052  0.826087176  0.002528054 -0.0012003672
Visit.L:ArmB  0.12167795 -0.1310430839 -0.04664624  0.002528054  0.626938776  0.0707090432
Visit.Q:ArmB  0.16216375 -0.0006938677 -0.17143613 -0.001200367  0.070709043  0.4825547997

Making the grid:

 (emm_grid <- with(pooled_coef,
                  qdrg(formula = Painscore ~ Visit * Arm,
                       data=imp_list[[1]],
                       coef = pooled_coef,
                       vcov = pooled_VC, 
                       df = Inf)))

'emmGrid' object with variables:
    Visit = V1, V2, V3
    Arm = A, B

emmeans()

(emm_trends <- emmeans(emm_grid, 
                       specs = ~ Visit | Arm, 
                       adjust="none",
                       df=Inf,
                       coef = pooled_coef$pooled_coef,
                       vcov = pooled_coef$pooled_VC))

Arm = A:
 Visit emmean    SE  df asymp.LCL asymp.UCL
 V1      4.15 0.563 Inf      3.05      5.26
 V2      4.15 0.563 Inf      3.05      5.26
 V3      4.15 0.563 Inf      3.05      5.26

Arm = B:
 Visit emmean    SE  df asymp.LCL asymp.UCL
 V1      4.23 0.747 Inf      2.77      5.69
 V2      4.23 0.747 Inf      2.77      5.69
 V3      4.23 0.747 Inf      2.77      5.69

Degrees-of-freedom method: user-specified 
Confidence level used: 0.95 

Contrast:

(contrast(emm_trends, "poly", max.degree=2, by = "Arm"))

Arm = A:
 contrast  estimate SE  df z.ratio p.value
 linear           0  0 Inf     NaN     NaN
 quadratic        0  0 Inf     NaN     NaN

Arm = B:
 contrast  estimate SE  df z.ratio p.value
 linear           0  0 Inf     NaN     NaN
 quadratic        0  0 Inf     NaN     NaN

Degrees-of-freedom method: user-specified 

It failed.

Without qdrg() it works. But since the package for GEE ordinal logistic regression isn't supported by emmeans, I need to go through qdrg().

qdrg() is also necessary if I want to specify special features for GEE, e.g. using specific covariance estimator. For example, it doesn't support fully the glmtoolbox::glmgee() - I cannot specify the var-cov estimator I need.

Fitting GEE

(trend_fit <- with(imp,
                   glmgee(Painscore ~ Visit * Arm,
                          family = gaussian(link = "identity"),
                          id = ID, 
                          corstr = "unstructured")))  # OK, but I cannot specify the bias-corrected covariance this way :/

call :
with.mids(data = imp, expr = glmgee(Painscore ~ Visit * Arm, 
    family = gaussian(link = "identity"), id = ID, corstr = "unstructured"))

call1 :
mice(data = d, m = 5, predictorMatrix = imp$predictorMatrix)

nmis :
           ID           Arm         Visit Visit_non_ord     Painscore 
            0             0             0             0             6 

analyses :
[[1]]

        Variance function:  gaussian
                     Link:  identity
    Correlation structure:  Unstructured 

[[2]]

        Variance function:  gaussian
                     Link:  identity
    Correlation structure:  Unstructured 

[[3]]

        Variance function:  gaussian
                     Link:  identity
    Correlation structure:  Unstructured 

[[4]]

        Variance function:  gaussian
                     Link:  identity
    Correlation structure:  Unstructured 

[[5]]

        Variance function:  gaussian
                     Link:  identity
    Correlation structure:  Unstructured 

Warning messages:
1: Estimate of correlation matrix is not positive definite 
2: Estimate of correlation matrix is not positive definite 
3: Estimate of correlation matrix is not positive definite 
4: Estimate of correlation matrix is not positive definite 
5: Estimate of correlation matrix is not positive definite 

emmeans()

 (emm_trends <- emmeans(trend_fit, 
                        specs = ~ Visit | Arm, 
                        adjust="none",
                        df=Inf))

Arm = A:
 Visit emmean    SE  df asymp.LCL asymp.UCL
 V1      4.14 0.551 Inf      3.06      5.22
 V2      3.83 0.702 Inf      2.45      5.20
 V3      4.49 0.426 Inf      3.65      5.32

Arm = B:
 Visit emmean    SE  df asymp.LCL asymp.UCL
 V1      3.29 0.748 Inf      1.82      4.75
 V2      4.60 0.647 Inf      3.33      5.87
 V3      4.80 1.034 Inf      2.77      6.83

Covariance estimate used: robust 
Degrees-of-freedom method: user-specified 
Confidence level used: 0.95 

Contrast:

(contrast(emm_trends, "poly", max.degree=2, by = "Arm"))

Arm = A:
 contrast  estimate    SE  df z.ratio p.value
 linear       0.343 0.533 Inf   0.643  0.5202
 quadratic    0.971 0.879 Inf   1.105  0.2693

Arm = B:
 contrast  estimate    SE  df z.ratio p.value
 linear       1.514 0.980 Inf   1.545  0.1222
 quadratic   -1.114 1.207 Inf  -0.923  0.3560

Degrees-of-freedom method: user-specified 

OK, it works this way.

( Though the covariance estimator is not the one I want, so this example doesn't solve my problem; qdrg() is necessary. )

rvlenth commented 2 days ago

This is a whole lot to wade through, and at the end ("OK, it works this way") it sounds like it doesn't reproduce the error. If there's more to it than that, maybe I can look at it again after I get back from a trip in late July.

rvlenth commented 2 days ago

@jpiaskowski Thanks for pointing us to that repro example advice!

Generalized commented 2 days ago

No, it does not work: I wrote "It failed." where it failed. Kindly please look at the whole post when you return. I just showed two scenarios help you figure out what is going on. First scenario, using qdrg() which is flexible and allows me to do what I need to do - fails. Second scenarios, which does not use qdrg() and not flexible - works.

rvlenth commented 1 day ago

From the help for qdrg:

If ordinal is provided, the intercept terms are modified appropriate to predicting an ordinal response, as described in vignette("models"), Section O, using ordinal$mode as the mode argument (if not provided, "latent" is assumed). (All modes are supported except 'scale')

Moreover,

For this to work, we expect the first ordinal$dim - 1 elements of coef to be the estimated threshold parameters, followed by the coefficients for the linear predictor.

This means that somehow you must rearrange the coefficients and the formula so that the ordinal thresholds come first, followed by the coefficients for the linear predictor implied by formula. The name of the ordinal response should not be included in the formula.

[I wrote this on July 5, but realized I didn't click the "comment" button]

rvlenth commented 1 day ago

Kindly please look at the whole post when you return.

Just the one comment with the example is 277 lines long, still over 250 lines without the data. I'd say you are asking for a lot. It's not like anybody is paying me to do this. It's not like you paid anything for this software. I am 5 years younger than the guy that a lot of people are asking to step down for cognitive decline -- and unlike him, I am not looking for a big job.

The qdrg() function (for "quick and dirty reference grid") is offered as a possible solution for use with models that are not supported by emmeans. It works well in cases where you can give the coefficients and their covariance matrix, as well as a formula that can be used to reproduce the model matrix. There is a provision also for ordinal models and a clear statement is offered in the help as to how it expects things to be arranged for such models.

But qdrg() is not guaranteed to work for all models, and it could well be that this is one of them. Obviously, you have tried a lot of manipulations to try to make it work. If qdrg() does not work here, that is unfortunate, but this does not mean that there is anything wrong with the function. Really, this is your model, not mine, and I'm just not willing to try to navigate this long narrative of successes and failures. The fact that you seem to be wanting to estimate trends adds additional complications.

For a package that produces complex model objects, it is the developer of that package who would know what is needed to make predictions from that model. If that person is interested in providing emmeans support for those models, then that is what would help you. And documentation if provided that package developers can use to help set that up.

rvlenth commented 1 day ago

One thought. Consider making a list of reference grids for the imputations, rather than trying to make one. Presumably each of those RGs will have the same linfct slot, but different bhats and Vs, that you can combine together using Rubin's rules.