SE and GPCM - Githubissues

beamgau commented 10 years ago

Hi Phil,

i tried to post this to the google-group but somehow my posts doesnt get posted :(

i have some troubles estimating SE in a GPCM where estimation of the information matrix fails. i have no idea, i don't think the model is somewhat uncommon and i use a pars-table. can you have a look at this?

thanks a lot, felix

philchalmers commented 10 years ago

Hi Felix,

Re: the Google forum, I think you were attaching files into a new topic, which to Google is probably a security hazard (might be an automatic attack trying to make people download malicious binary files). They are currently pending, but since you opened this here instead I'll just discard them.

I think the reason the information matrix isn't PD is because the parameters in pars are quite different from those in the ML estimate (your second mirt(...) run). Some of the intercepts are as far as -13, which are huge, and probably are causing numerical difficulties when trying to determine their 'curvature' about the log-likelihood; which is what the information matrix is.

The ML estimate is better, but looking at the condition of the information matrix it's clear even that is pretty unstable (higher values indicate more instability when you invert the matrix). Additionally, there are some really sparse categories in the data, which will also contribute to this issue (use lapply(df, table))). Make sense?

Phil

beamgau commented 10 years ago

mh, ok, i see. however, i only try to shift the model over the theta-continuum by fixing mean and sd beforehand and i wouldn't have expected that this effects the calculation of SEs. could providing start values or something similar help to circumvent that issue? or, maybe, are SEs the same regardless of prior mean/sd?

best, felix

edit: uh, i found an error, i guess. d0 probably should NOT be estimated in a GPCM, but fixed to zero, right? When i change that, the SEs can be calculated for most of the SE.types. Anyway, how would i want to transform those CIs to the IRTpars parameter? Can i just put the lower bound into the transformation formula we discussed in the googlegroup?

philchalmers commented 10 years ago

I was going to say that the two models don't have the same number of parameters, but it looks like you figured out why :). I don't think they will transform the same any more since the latent variable parameters now become part of the transformation, and some kind of rescaling needs to be done. Working out exactly how to transform them is tricky but should be possible.

beamgau commented 10 years ago

Thanks a lot. I am not sure whether I made clear what I want to transform… would it be possible to get CIs for the b1, b2, b3 parameter?

coef(x)[[1]]
           a1 ak0 ak1 ak2 ak3 d0    d1     d2     d3
par     1.935   0   1   2   3  0 1.189 -1.564 -6.259
CI_2.5  1.463  NA  NA  NA  NA NA 0.811 -2.144 -7.707
CI_97.5 2.407  NA  NA  NA  NA NA 1.567 -0.983 -4.812

coef(x, IRTpars = TRUE)[[1]]
        a     b1    b2    b3
par 1.935 -0.615 1.423 2.427

Thanks a lot, Felix

Von: Phil Chalmers [mailto:notifications@github.com] Gesendet: Montag, 5. Mai 2014 18:58 An: philchalmers/mirt Cc: Fischer, Felix Betreff: Re: [mirt] SE and GPCM (#32)

I was going to say that the two models don't have the same number of parameters, but it looks like you figured out why :). I don't think they will transform the same any more since the latent variable parameters now become part of the transformation, and some kind of rescaling needs to be done (similar to how the Rasch model is transformed). Working out exactly how to transform them is tricky but should be possible.

— Reply to this email directly or view it on GitHubhttps://github.com/philchalmers/mirt/issues/32#issuecomment-42210161.

philchalmers commented 10 years ago

Not easily. You would need to transform the standard errors via the delta method first to get some approximation, then build the CIs from that. It's probably just easier to roll your own custom item with the original gpcm parametrization using createItem() and estimation the information matrix with the 'BL' method. It's going to be slow to do it that way if you don't supply a gradient function, but might be the best work around to get the exact CIs.

beamgau commented 10 years ago

Ok, that is helpful advice, thanks a lot! I managed to write a item_function with the other parametrization, but it seems too picky about the seed. Could you have a brief look over this code:

library(mirt)
set.seed(1223)
name <- 'gpcm2'
par <- c(a = 1.5, b1 = -1, b2 = 0, b3 = 1)
est <- c(TRUE, TRUE, TRUE, TRUE)

P.gpcm2 <- function(par,Theta, ncat){
    a <- par[1]
    b1 <- par[2]
    b2 <- par[3]
    b3 <- par[4]
    P1 <- exp(1.7 * a * (Theta - b1))/(1 +  exp(1.7 * a * (Theta - b1)))
    P2 <- exp(1.7 * a * (Theta - b2))/(1 +  exp(1.7 * a * (Theta - b2)))
    P3 <- exp(1.7 * a * (Theta - b3))/(1 +  exp(1.7 * a * (Theta - b3)))
    cbind(1-P1, P1 - P2, P2 - P3, P3)
}

x <- createItem(name, par=par, est=est, P=P.gpcm2)

dat <- Science
mod <- mirt(dat, 1, "gpcm2", customItems=list(gpcm2=x), verbose = TRUE, SE = TRUE, SE.type = "complete")

philchalmers commented 10 years ago

Your code looked fine, but be sure to bound the returned probability to be within some reasonable tolerance. The log of the probabilities is taken during estimation, so you really want 1e-10 <= P <= (1 - 1e-10), otherwise log(P) will fail.

beamgau commented 10 years ago

How can i do so? Inserting

 P1 = ifelse(P1 < 0.0001, 0.0001, P1)
    P1 = ifelse(P1 > 0.9999, 0.9999, P1)
    P2 = ifelse(P2 < 0.0001, 0.0001, P2)
    P2 = ifelse(P2 > 0.9999, 0.9999, P2)
    P3 = ifelse(P3 < 0.0001, 0.0001, P3)
    P3 = ifelse(P3 > 0.9999, 0.9999, P3)

made everything worse :(

philchalmers commented 10 years ago

That's too high, you can be much more stringent (1e-10 is pretty good).

P1 <- exp(1.7 * a * (Theta - b1))/(1 + exp(1.7 * a * (Theta - b1)))
P2 <- exp(1.7 * a * (Theta - b2))/(1 + exp(1.7 * a * (Theta - b2)))
P3 <- exp(1.7 * a * (Theta - b3))/(1 + exp(1.7 * a * (Theta - b3)))
ret <- cbind(1-P1, P1 - P2, P2 - P3, P3)
ret <- ifelse(ret > (1-1e-10), (1-1e-10), ret)
ret <- ifelse(ret < 1e-10, 1e-10, ret)
return(ret)

On Tue, May 6, 2014 at 10:53 AM, beamgau notifications@github.com wrote:

How can i do so? Inserting

P1 = ifelse(P1 < 0.0001, 0.0001, P1) P1 = ifelse(P1 > 0.9999, 0.9999, P1) P2 = ifelse(P2 < 0.0001, 0.0001, P2) P2 = ifelse(P2 > 0.9999, 0.9999, P2) P3 = ifelse(P3 < 0.0001, 0.0001, P3) P3 = ifelse(P3 > 0.9999, 0.9999, P3)

made everything worse :(

— Reply to this email directly or view it on GitHubhttps://github.com/philchalmers/mirt/issues/32#issuecomment-42312540 .

philchalmers commented 10 years ago

This is a good exercise btw, probably worth opening a topic in the forum and just point to this issue :-)

beamgau commented 10 years ago

I agree. by constructing the item-function i understood a lot more. :+1: however, my function and "gpcm" result in quite different values, is this reasonable?

library(mirt)
set.seed(1223)
name <- 'gpcm2'
par <- c(a = 1.5, b1 = -1, b2 = 0, b3 = 1)
est <- c(TRUE, TRUE, TRUE, TRUE)

P.gpcm2 <- function(par,Theta, ncat){
    a <- par[1]
    b1 <- par[2]
    b2 <- par[3]
    b3 <- par[4]
    P1 <- exp( 1.7 * a * (Theta - b1))/(1 +  exp(1.7 * a * (Theta - b1)))
    P2 <- exp( 1.7 *  a * (Theta - b2))/(1 +  exp(1.7 * a * (Theta - b2)))
    P3 <- exp( 1.7 * a * (Theta - b3))/(1 +  exp(1.7 * a * (Theta - b3)))
    ret <- cbind(1-P1, P1 - P2, P2 - P3, P3) 
    ret <- ifelse(ret > (1-1e-10), (1-1e-10), ret) 
    ret <- ifelse(ret < 1e-10, 1e-10, ret) 
    return(ret) 

}

x <- createItem(name, par=par, est=est, P=P.gpcm2)

dat <- Science
mod <- mirt( dat, 1, "gpcm2", customItems=list(gpcm2=x), SE = TRUE, SE.type = "complete")
mod2 <- mirt( dat, 1, "gpcm", SE = TRUE, SE.type = "complete")

coef(mod)[[1]]
coef(mod2, IRTpars = TRUE)[[1]]

philchalmers commented 10 years ago

That's very interesting, you got a lower log-likelihood doing it this way than either mirt or ltm::gpcm (which are equivalent). I wonder why..... I'll have to look into why this is happing since it's interesting (the likelihood is exactly the same as a graded response model, and that is very strange indeed).

philchalmers commented 10 years ago

Oh, silly me, this is the graded response model not the generalized partial credit model. That's why!

philchalmers commented 10 years ago

You want this one: https://www.psych.umn.edu/psylabs/catcentral/pdf%20files/da03-01.pdf

beamgau commented 10 years ago

ups. i have some difficulties translating the mathematical notation :P i'll try if i can come up with a correct version. thanks a lot

beamgau commented 10 years ago

Ok, i was able to write a very slow, but probably correct gpcm. Thanks a lot!

library(mirt)
set.seed(1223)
name <- 'gpcm2'
par <- c(a = 1.5, b1 = -1, b2 = .3, b3 = 1)
est <- c(TRUE, TRUE, TRUE, TRUE)

summation = function(upto, Theta = seq(-4,4,.1), a, b){
  x = rep(0,length(Theta))
  if (upto > 0) x = apply(sapply(1:upto, function(x) (a*(Theta - b[x]))),1,sum)
  return(exp(x))
}

P.gpcm2 <- function(par,Theta, ncat){
    a <- par[1]
    b <- par[2:length(par)]
    nominator = sapply(0:(ncat-1), summation, Theta = Theta, a = a, b = b)
    denominator = apply(nominator,1,sum)
    ret <- nominator/denominator
    ret <- ifelse(ret > (1-1e-10), (1-1e-10), ret) 
    ret <- ifelse(ret < 1e-10, 1e-10, ret) 
    return(ret) 
}

x <- createItem(name, par=par, est=est, P=P.gpcm2)

dat <- Science
mod <- mirt( dat, 1, "gpcm2", customItems=list(gpcm2=x), SE = TRUE, SE.type = "complete")

philchalmers commented 10 years ago

Looks good to me :). It would be tones faster if you supplied the gradient function as well, but if speed isn't really an issue then this is fine and should work for a large number of problems. Cheers.

beamgau commented 10 years ago

Great, feels like i accomplished something :)

Von: Phil Chalmers [notifications@github.com] Gesendet: Dienstag, 6. Mai 2014 20:39 An: philchalmers/mirt Cc: Fischer, Felix Betreff: Re: [mirt] SE and GPCM (#32)

Looks good to me :). It would be tones faster if you supplied the gradient function as well, but if speed isn't really an issue then this is fine and should work for a large number of problems. Cheers.

— Reply to this email directly or view it on GitHubhttps://github.com/philchalmers/mirt/issues/32#issuecomment-42341736.

philchalmers / mirt

SE and GPCM #32