Open saschagobel opened 2 years ago
Thanks @saschagobel. If I understand correctly, you're trying to use LMMNN for binary classification with more than one categorical feature used as a random effect variable? If so, I'm afraid LMMNN currently only supports a single high-cardinality categorical feature in a classification setting (mode 'glmm'); see our paper. The reason is that this is a relatively easy scenario: the exact likelihood can be approximated with Gauss-Hermite quadrature in the spirit of LMMNN. I'm currently working on a different project, but perhaps later I will resume work on LMMNN. Improving its classification abilities would definitely be the first thing to look into.
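For readers unfamiliar with the quadrature trick mentioned above, here is a rough numerical sketch of how Gauss-Hermite quadrature approximates the marginal likelihood of a random-intercept logistic model for a single cluster. This is an illustration only, not LMMNN's actual implementation; the function and variable names are hypothetical:

```python
import numpy as np
from scipy.special import roots_hermite, expit

def marginal_nll_gh(y, f, sigma_b, n_quad=20):
    """Approximate the marginal NLL of one cluster in a random-intercept
    logistic model: -log ∫ Π_i p(y_i | f_i + b) N(b; 0, sigma_b^2) db,
    via Gauss-Hermite quadrature with the substitution b = sqrt(2)*sigma_b*x.

    y: binary responses of the cluster, f: fixed (network) predictions.
    """
    x, w = roots_hermite(n_quad)      # nodes/weights for ∫ e^{-x^2} g(x) dx
    b = np.sqrt(2.0) * sigma_b * x    # change of variables to N(0, sigma_b^2)
    # likelihood of the whole cluster at each quadrature node
    lik = np.array([
        np.prod(expit(f + bi) ** y * (1.0 - expit(f + bi)) ** (1 - y))
        for bi in b
    ])
    return -np.log(np.sum(w * lik) / np.sqrt(np.pi))
```

With `sigma_b = 0` this collapses to the plain Bernoulli NLL, which is a handy sanity check; this one-dimensional integral is exactly why a single random effect variable is tractable while several are not.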
Oh, I see, that's unfortunate.
Could you give this issue a 'feature request' label and maybe point to some directions for implementing the necessary changes? There may be others interested in binary classification with more than one random effect variable who would like to contribute (this exceeds my expertise). Thanks!
Sure, I labelled the issue. To anyone interested: our work is indeed more relevant to a regression setting, and what we did in a classification setting was more of a proof of concept. LMMNN uses the exact negative log-likelihood (NLL) as a loss function optimized with SGD. In a regression setting the NLL can be written and computed explicitly, while in a classification setting this is more of a challenge due to intractable integrals. If one wishes to extend this work to classification settings or non-Gaussian distributions in general ("GLMMNN"), I suspect they would need to go through other optimization schemes, e.g. variational inference or sampling. I really hope I'll get there; currently I'm in the middle of a different project for my PhD.
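For intuition, the intractable integral mentioned above can also be approximated by sampling the random effect, which is the crudest version of the sampling-based schemes suggested. A naive Monte-Carlo sketch (not LMMNN code; names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_nll(y, f, sigma_b, n_samples=10_000):
    """Monte-Carlo estimate of -log ∫ Π_i p(y_i | f_i + b) N(b; 0, sigma_b^2) db
    for one cluster of a random-intercept logistic model."""
    b = rng.normal(0.0, sigma_b, size=n_samples)          # b ~ N(0, sigma_b^2)
    p = 1.0 / (1.0 + np.exp(-(f[None, :] + b[:, None])))  # (n_samples, n_obs)
    lik = np.prod(p ** y * (1.0 - p) ** (1 - y), axis=1)  # per-sample likelihood
    return -np.log(lik.mean())
```

Unlike quadrature, this scales to several random effect variables (sample each `b` jointly), at the cost of gradient noise; that trade-off is presumably why variational inference is the more attractive route for a "GLMMNN".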
Dear Giora,
Stunning work on combining neural networks with mixed effects models! I'm interested in applying this to a binary classification task.
Since I couldn't find an example that uses mode 'glmm' among your notebooks, I worked through the imdb.ipynb notebook with the following adjustments added throughout the script.
Running the notebook, I receive the following error:
I think the problem stems from the calculation of 'b_hat'. In 'calc_b_hat.py', lines 99 to 129, 'b_hat' seems to be calculated only for 'z0', not for 'z1' as well.
Then in 'nn.py', lines 532 to 533 produce the error message:
Since there is more than one random effect variable (qs has length 2 in this case), Z_test combines the levels/categories of both z0 and z1. However, the length of b_hat is qs[0], i.e., the number of levels/categories in z0. Shouldn't the length of b_hat be qs[0] + qs[1]? Hence the dimension mismatch.
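A toy NumPy illustration of the shape mismatch described above (shapes only; this is not the actual lmmnn code):

```python
import numpy as np

qs = [3, 2]                        # levels of z0 and z1
n_test = 5
# Z_test one-hot encodes both variables side by side -> qs[0] + qs[1] columns
Z_test = np.zeros((n_test, sum(qs)))
b_hat = np.random.randn(qs[0])     # b_hat computed for z0 only -> length qs[0]
try:
    Z_test @ b_hat                 # (5, 5) @ (3,) -> shapes don't align
except ValueError as e:
    print(e)                       # dimension mismatch, as in the reported error
```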
Should lines 99 to 129 in 'calc_b_hat' be adjusted to include an outermost loop over range(len(qs)), stacking the b_hats of both variables at the end?
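The loop-and-stack idea could look roughly like this hypothetical wrapper (assuming a per-variable routine exists; `calc_b_hat_single` is an invented name, not a function in lmmnn):

```python
import numpy as np

def calc_b_hat_multi(calc_b_hat_single, qs, *args):
    """Compute b_hat separately for each random effect variable and stack
    the results, so that len(b_hat) == sum(qs) and it conforms with Z_test.

    calc_b_hat_single(k, *args) is assumed to return a vector of length qs[k].
    """
    b_hats = [calc_b_hat_single(k, *args) for k in range(len(qs))]
    return np.concatenate(b_hats)
```

The ordering of the concatenation would of course have to match the column order of z0 and z1 in Z_test.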
Appreciate any help on this! Do you happen to have a working example with mode 'glmm'?