Open Jeffrothschild opened 1 year ago
ML workflows with clustered data are delicate. A clean train/test split (a grouped split on subject), followed by evaluating the model on the test data, is often a good choice. Then you won't have this problem.
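As a rough illustration of a grouped split on subject, here is a minimal sketch in plain Python. The function name `grouped_train_test_split`, its parameters, and the toy `subjects` list are all made up for this example and are not part of any package discussed in this thread. The idea is to shuffle the unique subject IDs and hold out whole subjects, so that no subject contributes rows to both train and test.

```python
import random


def grouped_train_test_split(groups, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) such that no group appears in both sets.

    Hypothetical helper for illustration only.
    """
    unique_groups = sorted(set(groups))
    if len(unique_groups) < 2:
        raise ValueError("need at least two distinct groups to split")

    # Shuffle the group labels (not the rows) with a reproducible seed.
    rng = random.Random(seed)
    rng.shuffle(unique_groups)

    # Hold out at least one group, but never all of them.
    n_test = max(1, round(test_fraction * len(unique_groups)))
    n_test = min(n_test, len(unique_groups) - 1)
    test_groups = set(unique_groups[:n_test])

    # Assign every row to train or test according to its group.
    train_idx = [i for i, g in enumerate(groups) if g not in test_groups]
    test_idx = [i for i, g in enumerate(groups) if g in test_groups]
    return train_idx, test_idx


# Made-up data: 4 subjects with 3 repeated measurements each.
subjects = ["s1"] * 3 + ["s2"] * 3 + ["s3"] * 3 + ["s4"] * 3
train_idx, test_idx = grouped_train_test_split(subjects, test_fraction=0.25, seed=42)
```

The model would then be fitted on the rows in `train_idx` and evaluated (for example, by RMSE) on the rows in `test_idx` only. If scikit-learn is available, `GroupShuffleSplit` and `GroupKFold` implement the same idea.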
Hi, I'm wondering if it would be possible (or even make sense) to have the option to specify random effects in the model explainer?
I thought about this because when looking at feature importance, the RMSE of the full model is quite different from that of a model that accounts for random effects. For example...