There will be some groups with exactly the same estimates, why did this happen?

When I use fmrs.mle to fit an FMR model, I find that sometimes there will be some groups with exactly the same estimates (coefficients, dispersion, and even mixProp). For example, here the data are generated by fmrs.gendata with 3 components.

set.seed(1980)
nComp = 3
nCov = 5
nObs = 300
dispersion = c(1, 1, 1)
mixProp <- c(1/3, 1/3, 1/3)
rho = 0 #  the correlation between covariates of design matrix
coeff1 = 2 * c(0,1,1,1,0,0) 
coeff2 = 2 * c(0,-1,-1,-1,0,0)
coeff3 = 2 * c(0,0,0,0,1,1)
umax = 40 # the upper bound in Uniform distribution for censoring
eg.dat0 <- fmrs.gendata(nObs = nObs, nComp = nComp, nCov = nCov,
                        coeff = c(coeff1, coeff2, coeff3), dispersion = dispersion,
                        mixProp = mixProp, rho = rho, umax = umax,
                        disFamily = 'norm') # The options are 'norm' for FMR models

Then I fit a FMR model res.fmr0 using fmrs.mle, fmrs.tunsel and fmrs.varsel. The result shows that, the 1st and 2nd (the same for 3rd and 4th) compnent has exactly the same estimates of coefficients, dispersion, and even mixProp.

I wonder why did this happen?

Does your algorithm have the function of adaptively selecting the number of groups? For example, there are 3 groups here in reality. When there are at most 4 groups (set nComp = 4, and nComp is just an upper bound of the order of FMR), the number of groups estimated by your algorithm is 2?

Or is it just because of the problem of unidentifiability, and has nothing to do with your algorithm implementation?

The result of res.fmr0 is

t(round(coefficients(res.fmr0), 3)) 

       Intercept    X.1    X.2    X.3   X.4   X.5
Comp.1         0 -1.975 -2.134 -1.684 0.423 0.000
Comp.2         0 -1.975 -2.134 -1.684 0.423 0.000
Comp.3         0  1.054  0.993  0.821 1.033 0.975
Comp.4         0  1.054  0.993  0.821 1.033 0.975

slot(res.fmr0, "dispersion") 

       Comp.1   Comp.2   Comp.3   Comp.4
[1,] 1.109152 1.109152 2.466677 2.466677

head(slot(res.fmr0, "weights"))

           Comp.1       Comp.2    Comp.3    Comp.4
[1,] 2.959903e-01 2.959903e-01 0.2040097 0.2040097
[2,] 3.784626e-08 3.784626e-08 0.5000000 0.5000000
[3,] 2.332525e-08 2.332525e-08 0.5000000 0.5000000
[4,] 4.978730e-07 4.978730e-07 0.4999995 0.4999995
[5,] 2.270094e-08 2.270094e-08 0.5000000 0.5000000
[6,] 3.014612e-01 3.014612e-01 0.1985388 0.1985388

slot(res.fmr0, "mixProp") 

        Comp.1    Comp.2    Comp.3    Comp.4
[1,] 0.1823885 0.1823885 0.3176115 0.3176115

The code for fitting FMR model is as following:

res.fmr0 <- fmr.ini(eg.dat0$x, eg.dat0$y, num_groups = 4) 

fmr.ini <- function(cova, response, num_groups){

  # Description: 
  #   Generating the fitted FMR model containing the initial values for N1.
  # Required preceding functions or packages: fmrs

  n = length(response)
  p = as.integer(dim(cova)[2])
  # Initial coefficients for fmr.mle (with an addtional intercept)
  # 初始值设得比较随意，前半部分是-1，后半部分是0
  b00 = c(rep(-1, (num_groups * (p + 1))/2 ),
          rep(1, (num_groups * (p + 1)) - floor((num_groups * (p + 1))/2)))
  # Set active set : eliminate intercept
  activeset <- matrix(c(rep(0, num_groups), rep(1, p * num_groups)), 
                      nrow = p + 1, ncol = num_groups, byrow = TRUE)
  res.mle <- fmrs.mle(y = response, delta = rep(0, n), x = cova,
                      nComp = num_groups, disFamily = "norm", 
                      initCoeff = b00, initDispersion = rep(1, num_groups), 
                      initmixProp = rep(1/num_groups, num_groups), 
                      activeset = activeset, nIterNR = 200 )
  res.lam <- fmrs.tunsel(y = response, delta = rep(0, n), x = cova,
                         nComp = num_groups, disFamily = "norm",
                         initCoeff = c(coefficients(res.mle)),
                         initDispersion = dispersion(res.mle),
                         initmixProp = mixProp(res.mle), penFamily = "adplasso",
                         activeset = activeset, nIterNR = 200 )
  res.var <- fmrs.varsel(y = response, delta = rep(0, n), x = cova,
                         nComp = num_groups, disFamily = "norm",
                         initCoeff = c(coefficients(res.mle)),
                         initDispersion = dispersion(res.mle),
                         initmixProp = mixProp(res.mle), penFamily = "adplasso", 
                         lambPen = slot(res.lam, "lambPen"),
                         activeset = activeset, nIterNR = 200 )
  return(res.var)
}

shokoohi / fmrs

There will be some groups with exactly the same estimates, why did this happen? #1