gsimchoni / lmmnn


ValueError: dimension mismatch #14

saschagobel opened this issue 2 years ago (status: Open)

saschagobel commented 2 years ago

Dear Giora,

Stunning work on combining neural networks with mixed effects models! I'm interested in applying this to a binary classification task.

Since I couldn't find an example that uses mode 'glmm' among your notebooks, I worked through the imdb.ipynb notebook with the following adjustments applied throughout the script:

# make y binary
imdb = imdb.assign(
    score=lambda dataframe: dataframe['score'].map(lambda score: 1 if score >= 7 else 0)
)

# specify mode
mode = 'glmm'

# model adjustments, though this is not critical here
y_pred_output = Dense(1, activation='sigmoid')(out_hidden)

optimizer = keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=optimizer)

Running the notebook, I receive the following error:

ValueError: dimension mismatch

I think the problem stems from the calculation of 'b_hat'. In 'calc_b_hat.py', lines 99 to 129, 'b_hat' seems to be calculated only for 'z0', not for 'z1' as well.

Then in 'nn.py', lines 532 to 533 produce the error message:

y_pred = model.predict([X_test[x_cols], dummy_y_test] + X_test_z_cols).reshape(
    X_test.shape[0]) + Z_test @ b_hat

Since there is more than one random effect variable here (len(qs) == 2), Z_test stacks the levels/categories of both z0 and z1, so it has qs[0] + qs[1] columns. However, b_hat has length qs[0], i.e., only the number of levels in z0. Shouldn't the length of b_hat be qs[0] + qs[1]? Hence the dimension mismatch; a minimal reproduction is sketched below.
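Here is a small standalone sketch of the mismatch (the sizes n, q0, q1 are made up for illustration, and Z_test merely stands in for the combined sparse design matrix):

import numpy as np
from scipy import sparse

n, q0, q1 = 100, 10, 5  # hypothetical sample size and RE cardinalities
Z_test = sparse.random(n, q0 + q1, density=0.1, format='csr')  # columns span z0 and z1
b_hat = np.zeros(q0)    # b_hat as currently computed: covers z0 only

y_re = Z_test @ b_hat   # raises ValueError: dimension mismatch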

Should lines 99 to 129 in 'calc_b_hat.py' be adjusted to include an outer loop over range(len(qs)), stacking the b_hats of all variables at the end? Something like the sketch below.
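Purely as a sketch of the idea (the wrapper name and the per-variable helper are hypothetical, not the actual calc_b_hat API):

import numpy as np

def calc_b_hat_all(qs, calc_b_hat_for_variable):
    # Hypothetical wrapper: compute b_hat per RE variable and concatenate,
    # so that len(b_hat) == sum(qs) matches Z_test's column count.
    b_hats = [calc_b_hat_for_variable(k) for k in range(len(qs))]  # each of length qs[k]
    return np.concatenate(b_hats)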

Appreciate any help on this! Do you happen to have a working example with mode 'glmm'?

gsimchoni commented 2 years ago

Thanks @saschagobel. If I understand correctly, you're trying to use LMMNN for binary classification with more than one categorical feature used as a random effect variable? If so, I'm afraid LMMNN currently supports only a single high-cardinality categorical feature in a classification setting (glmm), see our paper. The reason is that this is a relatively easy scenario: the exact likelihood can be approximated with Gauss-Hermite quadrature in the spirit of LMMNN. I'm currently working on a different project, but perhaps later I will resume the work on LMMNN. Improving its classification abilities would definitely be the first thing to look into.
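For readers unfamiliar with the trick, here is a minimal NumPy sketch of Gauss-Hermite quadrature (not LMMNN's actual code), approximating the kind of Gaussian integral a single-RE glmm likelihood requires:

import numpy as np

def gauss_hermite_expectation(g, sigma, n_points=20):
    # Approximate E[g(b)] for b ~ N(0, sigma^2).
    # hermgauss nodes/weights satisfy: integral exp(-x^2) f(x) dx ~ sum_k w_k f(x_k);
    # substituting b = sqrt(2)*sigma*x turns the Gaussian expectation into that form.
    x, w = np.polynomial.hermite.hermgauss(n_points)
    return np.sum(w * g(np.sqrt(2.0) * sigma * x)) / np.sqrt(np.pi)

# e.g. the marginal success probability of a random-intercept logistic model,
# E[sigmoid(eta + b)] with b ~ N(0, sigma^2):
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))
print(gauss_hermite_expectation(lambda b: sigmoid(0.5 + b), sigma=1.0))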

saschagobel commented 2 years ago

Oh, I see, that's unfortunate.

Could you give this issue a 'feature request' label and maybe point to some directions for implementing the necessary changes? There may be others interested in binary classification with more than one random effect variable who would like to contribute (this exceeds my expertise). Thanks!

gsimchoni commented 2 years ago

Sure, I labelled the issue. To anyone interested: our work is indeed more relevant to a regression setting, and what we did in a classification setting was more of a proof of concept. LMMNN uses the exact negative log-likelihood (NLL) as a loss function optimized with SGD. In a regression setting the NLL can be written and computed explicitly, while in a classification setting this is more of a challenge due to intractable integrals. If one wishes to extend this work to classification settings or to non-Gaussian distributions in general ("GLMMNN"), I suspect they would need to go through other optimization schemes, e.g. variational inference or sampling. I really hope I'll get there; currently I'm in the middle of a different project for my PhD.
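For concreteness, in the Gaussian (regression) case the marginal NLL has a closed form along these lines (a paraphrase with notation loosely following the paper: f is the network output, D(psi) the random effects covariance):

\mathrm{NLL}(f, \theta) = \tfrac{1}{2}\,(y - f(X))^\top V(\theta)^{-1}\,(y - f(X)) + \tfrac{1}{2}\log\det V(\theta) + \tfrac{n}{2}\log 2\pi, \qquad V(\theta) = Z D(\psi) Z^\top + \sigma_e^2 I_n

In a GLMM, by contrast, the marginal likelihood integrates a non-Gaussian conditional likelihood over the random effects, and that integral has no closed form, which is why quadrature, variational inference, or sampling enters.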