estimating error in function pgenlogis

ellaninomiya commented 9 months ago

hi, I am using the pgenlogis and xxirt funcion to estimate generalized logistic item response model (with the fisher tranformation you suggested), and something strange happened. Using the same data, the estimating progress was fine on my own computer, but shows error as in if ((alpha1 > 0)) { :missing value where TRUE/FALSE needed, when I using another computer. This message indicates this parameter is NA. It is strange because I defined the parameter with xxirt_createDiscItem already. And when I set verbose = T in xxirt, this error jump out in the first iteration. When I increase the sample size to test this model, this error shows almost everytime. I am really confuesd and can't find where the problem is for two days. Thank you!

alexanderrobitzsch commented 9 months ago

Please provide reproducible script and data.

ellaninomiya commented 9 months ago

hi, I upload the data as Rdata in this zip file. (Sample Size is 5000 and Test Length is 20) dat.zip And the script is

P_PGenlogis <- function( par, Theta, ncat){
  a <- par[1]
  b <- par[2]
  talpha1 <- par[3]
  talpha2 <- par[4]
  TP <- nrow(Theta)
  P <- matrix( NA, nrow=TP, ncol=ncat)
  P[,2] <- sirt::pgenlogis((a*(Theta[,1]- b)), alpha1=(exp(2*talpha1)-1)/(exp(2*talpha1)+1), alpha2=(exp(2*talpha2)-1)/(exp(2*talpha2)+1))
  P[,1] <- 1 - P[,2]
  return(P)
}

par <- c("a" = 1, "b" = 0, "talpha1" = 0, "talpha2" = 0)
item_1PGenlogis <- xxirt_createDiscItem( name = "1PGenlogis" , par = par, est = c(FALSE,TRUE,TRUE,TRUE), P = P_PGenlogis)
item_2PGenlogis <- xxirt_createDiscItem( name = "2PGenlogis" , par = par, est = c(TRUE,TRUE,TRUE,TRUE), P = P_PGenlogis)
customItems <- list(item_1PGenlogis, item_2PGenlogis)

mod.2PGenlogis.estimate <- function(dat, j){
  itemtype = rep( "2PGenlogis", j)
  partable = xxirt_createParTable(dat, itemtype = itemtype, customItems = customItems)
  parindex1 <- partable[ partable$parname=="talpha1","parindex"]
  parindex2 <- partable[ partable$parname=="talpha2","parindex"]
  lambda <- 1
  penalty_fun_item <- function(x){
    val <- (sum(sapply(x[parindex1], "-", x[parindex1])^2) + sum(sapply(x[parindex2], "-", x[parindex2])^2)) * lambda
    return(val)
  }
  smod2.2PGenlogis <- xxirt( dat=dat, Theta=Theta, partable=partable, maxit=3000, customItems=customItems, customTheta=customTheta, conv = 0.001, penalty_fun_item=penalty_fun_item, verbose = F)
  return(smod2.2PGenlogis)
}

Theta <- matrix( seq(-6,6,len=61) , ncol=1 )
P_Theta1 <- function( par , Theta , G){
  mu <- par[1]
  sigma <- max( par[2] , .01 )
  TP <- nrow(Theta)
  pi_Theta <- matrix( 0 , nrow=TP , ncol=G)
  pi1 <- dnorm( Theta[,1] , mean = mu , sd = sigma )
  pi1 <- pi1 / sum(pi1)
  pi_Theta[,1] <- pi1
  return(pi_Theta)
}

par_Theta <- c( "mu"=0, "sigma" = 1 )  
customTheta  <- xxirt_createThetaDistribution( par=par_Theta , est=c(F,T), P=P_Theta1 )

set.seed(2)
mod.2PGenlogis <- mod.2PGenlogis.estimate(dat = dat, j = 20)

And when I estimated this data without regularized estimation, same error shows.

mod.2PGenlogis.estimate <- function(dat, j){
  itemtype = rep( "2PGenlogis", j)
  partable = xxirt_createParTable(dat, itemtype = itemtype, customItems = customItems)
  smod1.2PGenlogis <- xxirt( dat=dat, Theta=Theta, partable=partable, maxit=10000, customItems=customItems, customTheta=customTheta, conv = 0.001, verbose = T)
  return(smod1.2PGenlogis)
}

alexanderrobitzsch commented 9 months ago

A slightly more condensed script would be more helpful in order to focus on the primary issue. I do want to dive into your mod.2PGenlogis.estimate function. Could you please ensure that the dimensions of itemtype and dat are properly chosen?

ellaninomiya commented 9 months ago

This data is simulated using the generation process of the second simulation study mentioned in the atricle [An Alternative to the 3PL: Using Asymmetric Item Characteristic Curves to Address Guessing Effects - Lee - 2018 - Journal of Educational Measurement - Wiley Online Library](https://onlinelibrary.wiley.com/doi/10.1111/jedm.12165), expect that I rearrange item order from easier to moer difficult. According to this generation process, I think it might be some multidimensional problem.(This estimation focuses on unidimensional and dichotomous item) The generation model from this article is : And I followed your excellent work in [Information | Free Full-Text | Regularized Generalized Logistic Item Response Model (mdpi.com)](https://www.mdpi.com/2078-2489/14/6/306) with fisher transformation as: (hope I didn't get these wrong)

alpha1=(exp(2*talpha1)-1)/(exp(2*talpha1)+1), alpha2=(exp(2*talpha2)-1)/(exp(2*talpha2)+1)

and regularized estimation with lambda = 1 as:

partable = xxirt_createParTable(dat, itemtype = itemtype, customItems = customItems)
parindex1 <- partable[ partable$parname=="talpha1","parindex"]
parindex2 <- partable[ partable$parname=="talpha2","parindex"]
lambda <- 1
penalty_fun_item <- function(x){
   val <- (sum(sapply(x[parindex1], "-", x[parindex1])^2) + sum(sapply(x[parindex2], "-", x[parindex2])^2)) * lambda
   return(val)
  }

The full simulation script is quite long and comments are kind of messy to read, so I have to appolzise to take some time to tidy it. I will send it as soon as I finish organize the comments. But all the script lines involving the mod.2PGenlogis.estimate function are posted. Thank you！

alexanderrobitzsch commented 9 months ago

I played a bit with script and data. I strongly recommend using lower and upper bounds for talpha1 and talpha2 (e.g., -4 and 4).

ellaninomiya commented 9 months ago

It finally works! And adding lower and upper bounds doesn't cost much extra time to converge. Thank you so much for your responsiveness and time!

alexanderrobitzsch commented 9 months ago

As an alternative, one can prevent numerical overflow in the inverse Fisher transformation (exp(2*talpha)-1)/(exp(2*talpha)+1). This formula can be safely used for talpha <= 0. For talpha > 0, one can compute (1-exp(-2*talpha)-1)/(1+exp(-2*talpha)).

alexanderrobitzsch / sirt

estimating error in function pgenlogis #21