ccao-data / model-res-avm

Automated valuation model for all class 200 residential properties in Cook County (except vacant land and condos)
GNU Affero General Public License v3.0

Add logic for `num_iterations` usage #172

Closed: dfsnow closed 8 months ago

dfsnow commented 8 months ago

Quick PR to clarify the behavior of the `num_iterations` hyperparameter and how it interacts with cross-validation (CV). It can follow one of three paths:

  1. If CV and early stopping are both enabled, the training pipeline uses the upper bound of the CV search range (set in `params.yaml`) as the maximum possible number of iterations before stopping. Using a holdout validation set, CV could plausibly discover an optimal number of iterations anywhere between 0 and that maximum.
  2. If CV is enabled but early stopping is disabled, the pipeline sets the search range to the standard CV range specified in `params.yaml`. CV then iterates through the `num_iterations` range and tests different values.
  3. If CV is disabled, the pipeline uses the default, static parameter value from `params.yaml`.
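
The three paths above can be sketched as a small decision function. This is only an illustration in Python, not the pipeline's actual code: the function name `plan_num_iterations`, the `IterationPlan` fields, and the example numbers are all hypothetical, and the real values would come from `params.yaml`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class IterationPlan:
    """Hypothetical container describing how num_iterations is handled."""
    path: int                                # which of the three paths applies
    max_iterations: Optional[int]            # single value passed to training
    search_range: Optional[Tuple[int, int]]  # range explored by CV, if any


def plan_num_iterations(
    cv_enabled: bool,
    early_stopping_enabled: bool,
    cv_range: Tuple[int, int],
    default_value: int,
) -> IterationPlan:
    """Choose how num_iterations is set, following the three paths above.

    cv_range and default_value stand in for values read from params.yaml.
    """
    lower, upper = min(cv_range), max(cv_range)

    if cv_enabled and early_stopping_enabled:
        # Path 1: cap training at the top of the CV range and let early
        # stopping (on a holdout validation set) decide where to stop.
        return IterationPlan(path=1, max_iterations=upper, search_range=None)

    if cv_enabled:
        # Path 2: no early stopping, so CV tests values across the range.
        return IterationPlan(path=2, max_iterations=None,
                             search_range=(lower, upper))

    # Path 3: no CV, so fall back to the static default.
    return IterationPlan(path=3, max_iterations=default_value,
                         search_range=None)


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the repository.
    for cv, es in [(True, True), (True, False), (False, False)]:
        print(plan_num_iterations(cv, es, cv_range=(100, 2500),
                                  default_value=1000))
```

In this sketch the early stopping flag is ignored when CV is off, since the description only says that the static default is used in that case.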