shokoohi / fmrs

fmrs package provides estimation and variable selection in Finite Mixture of Accelerated Failure Time Regression (FMAFTR) and Finite Mixture of Regression (FMR) models.
3 stars 2 forks source link

There will be some groups with exactly the same estimates, why did this happen? #1

Open wanmeijianchi opened 2 years ago

wanmeijianchi commented 2 years ago

When I use fmrs.mle to fit an FMR model, I find that sometimes there will be some groups with exactly the same estimates (coefficients, dispersion, and even mixProp). For example, here the data are generated by fmrs.gendata with 3 components.

set.seed(1980)
nComp = 3
nCov = 5
nObs = 300
dispersion = c(1, 1, 1)
mixProp <- c(1/3, 1/3, 1/3)
rho = 0 #  the correlation between covariates of design matrix
coeff1 = 2 * c(0,1,1,1,0,0) 
coeff2 = 2 * c(0,-1,-1,-1,0,0)
coeff3 = 2 * c(0,0,0,0,1,1)
umax = 40 # the upper bound in Uniform distribution for censoring
eg.dat0 <- fmrs.gendata(nObs = nObs, nComp = nComp, nCov = nCov,
                        coeff = c(coeff1, coeff2, coeff3), dispersion = dispersion,
                        mixProp = mixProp, rho = rho, umax = umax,
                        disFamily = 'norm') # The options are 'norm' for FMR models

Then I fit a FMR model res.fmr0 using fmrs.mle, fmrs.tunsel and fmrs.varsel. The result shows that, the 1st and 2nd (the same for 3rd and 4th) compnent has exactly the same estimates of coefficients, dispersion, and even mixProp.

I wonder why did this happen?

Does your algorithm have the function of adaptively selecting the number of groups? For example, there are 3 groups here in reality. When there are at most 4 groups (set nComp = 4, and nComp is just an upper bound of the order of FMR), the number of groups estimated by your algorithm is 2?

Or is it just because of the problem of unidentifiability, and has nothing to do with your algorithm implementation?

The result of res.fmr0 is

t(round(coefficients(res.fmr0), 3)) 

       Intercept    X.1    X.2    X.3   X.4   X.5
Comp.1         0 -1.975 -2.134 -1.684 0.423 0.000
Comp.2         0 -1.975 -2.134 -1.684 0.423 0.000
Comp.3         0  1.054  0.993  0.821 1.033 0.975
Comp.4         0  1.054  0.993  0.821 1.033 0.975

slot(res.fmr0, "dispersion") 

       Comp.1   Comp.2   Comp.3   Comp.4
[1,] 1.109152 1.109152 2.466677 2.466677

head(slot(res.fmr0, "weights"))

           Comp.1       Comp.2    Comp.3    Comp.4
[1,] 2.959903e-01 2.959903e-01 0.2040097 0.2040097
[2,] 3.784626e-08 3.784626e-08 0.5000000 0.5000000
[3,] 2.332525e-08 2.332525e-08 0.5000000 0.5000000
[4,] 4.978730e-07 4.978730e-07 0.4999995 0.4999995
[5,] 2.270094e-08 2.270094e-08 0.5000000 0.5000000
[6,] 3.014612e-01 3.014612e-01 0.1985388 0.1985388

slot(res.fmr0, "mixProp") 

        Comp.1    Comp.2    Comp.3    Comp.4
[1,] 0.1823885 0.1823885 0.3176115 0.3176115

The code for fitting FMR model is as following:

res.fmr0 <- fmr.ini(eg.dat0$x, eg.dat0$y, num_groups = 4) 

fmr.ini <- function(cova, response, num_groups){

  # Description: 
  #   Generating the fitted FMR model containing the initial values for N1.
  # Required preceding functions or packages: fmrs

  n = length(response)
  p = as.integer(dim(cova)[2])
  # Initial coefficients for fmr.mle (with an addtional intercept)
  # 初始值设得比较随意,前半部分是-1,后半部分是0
  b00 = c(rep(-1, (num_groups * (p + 1))/2 ),
          rep(1, (num_groups * (p + 1)) - floor((num_groups * (p + 1))/2)))
  # Set active set : eliminate intercept
  activeset <- matrix(c(rep(0, num_groups), rep(1, p * num_groups)), 
                      nrow = p + 1, ncol = num_groups, byrow = TRUE)
  res.mle <- fmrs.mle(y = response, delta = rep(0, n), x = cova,
                      nComp = num_groups, disFamily = "norm", 
                      initCoeff = b00, initDispersion = rep(1, num_groups), 
                      initmixProp = rep(1/num_groups, num_groups), 
                      activeset = activeset, nIterNR = 200 )
  res.lam <- fmrs.tunsel(y = response, delta = rep(0, n), x = cova,
                         nComp = num_groups, disFamily = "norm",
                         initCoeff = c(coefficients(res.mle)),
                         initDispersion = dispersion(res.mle),
                         initmixProp = mixProp(res.mle), penFamily = "adplasso",
                         activeset = activeset, nIterNR = 200 )
  res.var <- fmrs.varsel(y = response, delta = rep(0, n), x = cova,
                         nComp = num_groups, disFamily = "norm",
                         initCoeff = c(coefficients(res.mle)),
                         initDispersion = dispersion(res.mle),
                         initmixProp = mixProp(res.mle), penFamily = "adplasso", 
                         lambPen = slot(res.lam, "lambPen"),
                         activeset = activeset, nIterNR = 200 )
  return(res.var)
}
shokoohi commented 2 years ago

Hi This is not an issue. You are over-stating the order of the mixture. The data are generated from a mixture of 3-component but you are setting the order to 4. Therefore, in the best-case scenario, one of the components will split and the estimated coefficients are going to be similar. In other scenarios, either dispersion, mixing proportion, or the coefficients are going to be very small. There are techniques to penalize the order of mixture models such as BIC, AIC, or modern ones which are based on the lasso.