soichiroy / emlogit

ECM algorithm for the multinomial logit model
GNU General Public License v3.0

initial_value should be dimension K by J #5

Closed: kuriwaki closed this issue 4 years ago

kuriwaki commented 4 years ago

not J - 1, I think?

Here is a reprex with the existing version showing that an initial value with J columns is rejected.

library(emlogit)
library(dplyr)  # provides %>% and select() used below
data(japan, package = "MNP")

# example ---
Y <- japan %>% select(LDP, NFP, SKG, JCP) %>% data.matrix()
X <- japan %>% select(gender, age) %>% data.matrix()

set.seed(1234)
fit <- emlogit(Y = Y, X = X)
dim(fit$coef)
#> [1] 3 4

# try with initial values ----
# 2 covariates + 1 intercept (rows) and 4 choices (columns)
init_mat <- matrix(c(0, 0.5, -0.2, 0.1,
                     0, 0.2,  0.2, 0.2,
                     0, 0.3,  0.3, 0.4),
                   byrow = TRUE, nrow = 3)

# these match. J - 1 does not seem necessary.
identical(dim(init_mat), dim(fit$coef))
#> [1] TRUE

# try it out. 
fit_2 <- emlogit(Y = Y, X = X, 
                 control = list(initial_value = init_mat))
#> Error in emlogit(Y = Y, X = X, control = list(initial_value = init_mat)): Dimension of initial value does not match.

Created on 2020-05-04 by the reprex package (v0.3.0)

but with the change in bfc88fe, this works.

library(emlogit)
library(dplyr)  # provides %>% and select() used below
data(japan, package = "MNP")

# example ---
Y <- japan %>% select(LDP, NFP, SKG, JCP) %>% data.matrix()
X <- japan %>% select(gender, age) %>% data.matrix()

set.seed(1234)
fit <- emlogit(Y = Y, X = X)
dim(fit$coef)
#> [1] 3 4

# try with initial values ----
# 2 covariates + 1 intercept (rows) and 4 choices (columns)
init_mat <- matrix(c(0, 0.5, -0.2, 0.1,
                     0, 0.2,  0.2, 0.2,
                     0, 0.3,  0.3, 0.4),
                   byrow = TRUE, nrow = 3)

# try it out. 
fit_2 <- emlogit(Y = Y, X = X, 
                 control = list(initial_value = init_mat))
fit_2$coef
#>      [,1]         [,2]         [,3]         [,4]
#> [1,]    0  0.505046601 -0.037569740 -0.503925933
#> [2,]    0 -0.171375636 -0.051898286  0.088159440
#> [3,]    0 -0.005205218 -0.003313376 -0.009005319

Created on 2020-05-04 by the reprex package (v0.3.0)

sou412 commented 4 years ago

Thanks. Merging this. Do you think it would be better to add the 0's internally rather than asking users to supply them, since for now only the first column can be the baseline?
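For reference, adding the 0's internally could look roughly like the sketch below. `pad_baseline()` and its `n_choices` argument are hypothetical names used only for illustration; this is not code from emlogit, and it assumes the baseline is always the first column.

```r
# Hypothetical helper, not part of emlogit: prepend a zero column for the
# baseline category when the user supplies a K x (J - 1) matrix.
pad_baseline <- function(init, n_choices) {
  stopifnot(is.matrix(init), is.numeric(init))
  if (ncol(init) == n_choices) {
    return(init)  # baseline column already supplied by the user
  }
  if (ncol(init) != n_choices - 1) {
    stop("initial_value must have J or J - 1 columns.")
  }
  cbind(0, init)  # first column is the baseline, fixed at zero
}

# 3 rows (intercept + 2 covariates), 3 non-baseline choices
init_short <- matrix(c(0.5, -0.2, 0.1,
                       0.2,  0.2, 0.2,
                       0.3,  0.3, 0.4),
                     byrow = TRUE, nrow = 3)
init_full <- pad_baseline(init_short, n_choices = 4)
dim(init_full)
```

With this approach users could pass either shape, at the cost of the baseline no longer being visible in what they type.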

kuriwaki commented 4 years ago

I think making the baseline explicit throughout is good, though either is ok. One way to guard against errors would be to add a check that every entry in the first column equals zero.
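A minimal sketch of such a check, assuming the initial value is a K by J matrix with the baseline in the first column. `check_initial_value()`, `n_coef`, and `n_choices` are hypothetical names for illustration, not existing emlogit functions or arguments.

```r
# Hypothetical validator, not part of emlogit: confirm the K x J shape and
# that the baseline (first) column is all zeros.
check_initial_value <- function(init, n_coef, n_choices) {
  expected <- as.integer(c(n_coef, n_choices))
  if (!is.matrix(init) || !identical(dim(init), expected)) {
    stop("Dimension of initial value does not match.")
  }
  # isTRUE() also rejects NA entries in the baseline column
  if (!isTRUE(all(init[, 1] == 0))) {
    stop("First column of initial value (the baseline) must be all zeros.")
  }
  invisible(init)
}

# the matrix from the reprex above passes
init_mat <- matrix(c(0, 0.5, -0.2, 0.1,
                     0, 0.2,  0.2, 0.2,
                     0, 0.3,  0.3, 0.4),
                   byrow = TRUE, nrow = 3)
check_initial_value(init_mat, n_coef = 3, n_choices = 4)

# a nonzero baseline entry is caught
bad_mat <- init_mat
bad_mat[2, 1] <- 1
res <- try(check_initial_value(bad_mat, n_coef = 3, n_choices = 4), silent = TRUE)
inherits(res, "try-error")
```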