Add `id_column` to all algos, grid & AutoML for proper CV fold stratification

Let's add an `id_column` argument to all our algorithms, grid search and AutoML functions. Right now, if you have pooled repeated-measures data (one ID/person/cluster contributes multiple rows to the training set), the only way to guarantee that all rows belonging to a single ID end up in the same fold is to use the `fold_column` argument. If the fold partitioning does not keep each ID's rows together, we get data leakage across folds. The `fold_column` approach requires users to code the stratification-by-ID themselves, which is a pain.
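
For reference, the manual workaround described above can be sketched in plain Python. This is only an illustration of what users currently have to write themselves, not the proposed implementation; the helper name `assign_folds_by_id` and the round-robin assignment over shuffled IDs are assumptions made for this example.

```python
import random


def assign_folds_by_id(ids, nfolds, seed=42):
    """Return one fold index per row so that all rows sharing an ID
    land in the same fold (hypothetical helper, not part of H2O)."""
    # Shuffle the unique IDs reproducibly, then deal them out round-robin.
    unique_ids = sorted(set(ids))
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    fold_of_id = {uid: i % nfolds for i, uid in enumerate(unique_ids)}
    # Map every row back to the fold of its ID.
    return [fold_of_id[row_id] for row_id in ids]


# Toy repeated-measures data: each ID contributes one or more rows.
ids = ["a", "a", "b", "c", "c", "c", "d", "e", "e", "f"]
folds = assign_folds_by_id(ids, nfolds=3)
```

The resulting list would then be attached to the training frame as a new column and passed through the existing `fold_column` argument. Note that dealing out IDs round-robin balances the number of IDs per fold, not the number of rows.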

Currently, there is a "Stratified" option in `fold_assignment`, but it only stratifies by the response column (classification only) so that each fold gets roughly the same class distribution.

When `id_column` is specified, it will automatically trigger stratification-by-ID whenever cross-validation is used. Let's think about whether we want to require the user to also specify fold_assignment = "Stratified", or whether specifying `id_column` should be enough. We will also need to handle the case where `id_column` is specified and `fold_assignment` is set to something other than "AUTO" or "Stratified".

Notes:

- `id_column` defaults to NULL/None.
- This column should be automatically excluded from the set of predictors, even if it's included in the `x` argument.
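
A minimal sketch of how these notes might translate into argument handling. Everything here is hypothetical: the function names are invented for illustration, and rejecting `fold_assignment` values other than "AUTO" or "Stratified" is just one possible answer to the open question above, not a decided behavior.

```python
def resolve_predictors(x, id_column=None):
    """Drop the ID column from the predictor list, even if the user
    listed it in x (hypothetical helper)."""
    if id_column is None:
        return list(x)
    return [col for col in x if col != id_column]


def check_cv_args(id_column=None, fold_assignment="AUTO"):
    """One possible policy (an assumption): when id_column is set, only
    "AUTO" or "Stratified" are accepted for fold_assignment."""
    if id_column is None:
        return
    if fold_assignment not in ("AUTO", "Stratified"):
        raise ValueError(
            "id_column requires fold_assignment 'AUTO' or 'Stratified', "
            "got %r" % fold_assignment
        )


predictors = resolve_predictors(["age", "dose", "patient_id"], id_column="patient_id")
```

An alternative policy would be to warn and override `fold_assignment` rather than raise an error; the sketch only shows where such a check could live.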

A request for a more generic version of this (stratify on any column) exists here: https://0xdata.atlassian.net/browse/PUBDEV-1848
