ccao-data / model-res-avm

Automated valuation model for all class 200 residential properties in Cook County (except vacant land and condos)
GNU Affero General Public License v3.0

Make comps count and price bins configurable #182

Closed jeancochrane closed 5 months ago

jeancochrane commented 5 months ago

This PR builds on top of #106 to define workflow config values for the number of comps (num_comps) and the number of price bins into which we group comps (num_comp_price_bins). Together, these values let us speed up the comps pipeline when accuracy is not the priority (by decreasing num_comps and increasing num_comp_price_bins) or slow it down to get better accuracy (by increasing num_comps and decreasing num_comp_price_bins).
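To illustrate the tradeoff, here is a minimal sketch, not the pipeline's actual code. It assumes comps are searched only within an observation's own price bin, that bins are equal-count quantile bins, and that similarity is a caller-supplied `score` function. Only `num_comps` and `num_comp_price_bins` come from this PR; every other name (`make_price_bins`, `assign_bin`, `top_comps`, the `id`/`price` keys) is hypothetical.

```python
from bisect import bisect_right


def make_price_bins(prices, num_comp_price_bins):
    """Return the interior cut points of roughly equal-count price bins."""
    ordered = sorted(prices)
    n = len(ordered)
    return [ordered[(k * n) // num_comp_price_bins]
            for k in range(1, num_comp_price_bins)]


def assign_bin(price, edges):
    """Index of the bin a price falls into (0-based)."""
    return bisect_right(edges, price)


def top_comps(observations, candidates, num_comps, num_comp_price_bins, score):
    """Pick the num_comps best-scoring candidates from each observation's bin.

    Returns the chosen comp ids per observation and the number of
    observation/candidate comparisons performed.
    """
    edges = make_price_bins([c["price"] for c in candidates], num_comp_price_bins)

    # Group candidates by bin once so each observation scans only its own bin.
    by_bin = {}
    for c in candidates:
        by_bin.setdefault(assign_bin(c["price"], edges), []).append(c)

    result, comparisons = {}, 0
    for obs in observations:
        pool = by_bin.get(assign_bin(obs["price"], edges), [])
        comparisons += len(pool)
        ranked = sorted(pool, key=lambda c: score(obs, c), reverse=True)
        result[obs["id"]] = [c["id"] for c in ranked[:num_comps]]
    return result, comparisons


# Toy data: ten candidates priced 100..1000, two observations.
candidates = [{"id": i, "price": 100 * (i + 1)} for i in range(10)]
observations = [{"id": "a", "price": 260}, {"id": "b", "price": 910}]
closest_price = lambda o, c: -abs(o["price"] - c["price"])

_, one_bin = top_comps(observations, candidates, 2, 1, closest_price)
_, two_bins = top_comps(observations, candidates, 2, 2, closest_price)
print(one_bin, two_bins)  # 20 10
```

Under these assumptions, doubling the number of bins halves the candidates scanned per observation. The cost is that a strong comp sitting just across a bin boundary is never considered, which is where the accuracy loss would come from.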

Evidence from the logs that this works, taken from a workflow run where num_comps = 10 and num_comp_price_bins = 20:

Getting top 10 comps for price bin 1/20 ($-2,147,483,647 to $115,011) - 8654/1098990 observations, 35965/359641 possible comps

Connects #41.