thekingofkings / chicago-partition

Automatically partition Chicago into Community Areas (CAs) while minimizing the CA-level crime prediction error.
MIT License

Add house_price prediction task #9

Closed: thekingofkings closed this issue 6 years ago

thekingofkings commented 6 years ago

Motivation

We need another prediction task to evaluate our partition model. Ideally, different tasks lead to different partition boundaries, and we should be able to explain the differences in the boundaries from the data of each prediction task.

Design

  1. The evaluation requires a train-test split. We pick a temporal threshold that splits the sales data into old and new. The average price of the old data is used to train the partition, while the new data is used to test it.
  2. Within each tract we need to keep track of the total number of houses and the sum of all unit prices. Later we sum both quantities over the tracts in a CA to get the CA-level values, and the ratio between them (price sum divided by house count) is the CA's average unit price.
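
The bookkeeping in design point 2 could look roughly like the sketch below. It is only an illustration, not the repository's actual code: the function name `ca_average_prices`, the dictionary layouts, and the tract and CA ids are all assumptions made up for the example. The point it shows is that counts and sums are added up first and divided once per CA, rather than averaging per-tract averages.

```python
from collections import defaultdict


def ca_average_prices(tract_stats, tract_to_ca):
    """Aggregate tract-level (house_count, unit_price_sum) pairs to CA level.

    tract_stats: {tract_id: (house_count, unit_price_sum)}  (hypothetical layout)
    tract_to_ca: {tract_id: ca_id}, i.e. the current partition (hypothetical layout)
    Returns {ca_id: average unit price, or None if the CA has no houses}.
    """
    totals = defaultdict(lambda: [0, 0.0])  # ca_id -> [house_count, unit_price_sum]
    for tract, (count, price_sum) in tract_stats.items():
        ca = tract_to_ca[tract]
        totals[ca][0] += count
        totals[ca][1] += price_sum
    # Divide once per CA; a CA without any sale has no defined average.
    return {ca: (s / n if n > 0 else None) for ca, (n, s) in totals.items()}


# Made-up numbers: t1 has 2 houses summing to 300, t2 has 1 house at 300.
tract_stats = {"t1": (2, 300.0), "t2": (1, 300.0), "t3": (0, 0.0)}
tract_to_ca = {"t1": "A", "t2": "A", "t3": "B"}
averages = ca_average_prices(tract_stats, tract_to_ca)
```

Keeping the two running totals per tract means a tract can be moved from one CA to another by adding and subtracting its pair, without going back to the individual sales.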

Key points:

  1. Find a good temporal split, ideally one that gives a 1:1 train-test ratio, so that we don't need to worry about sparseness in the average house price.
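
One simple way to get close to a 1:1 split is to use the median sale date as the threshold. The sketch below assumes each sale is a `(sale_date, tract_id, unit_price)` tuple; that record layout and the name `temporal_split` are hypothetical, chosen only for the example. If many sales share the median date, the two halves will not be exactly equal in size.

```python
from datetime import date


def temporal_split(sales):
    """Split sales into old (train) and new (test) at the median sale date.

    sales: list of (sale_date, tract_id, unit_price) tuples (hypothetical layout).
    Returns (threshold, old, new), where old has sale_date < threshold.
    """
    if not sales:
        raise ValueError("cannot split an empty list of sales")
    dates = sorted(s[0] for s in sales)
    threshold = dates[len(dates) // 2]  # median date, upper one for even counts
    old = [s for s in sales if s[0] < threshold]
    new = [s for s in sales if s[0] >= threshold]
    return threshold, old, new


# Made-up sales records, for illustration only.
sales = [
    (date(2015, 1, 1), "t1", 100.0),
    (date(2016, 1, 1), "t1", 110.0),
    (date(2017, 1, 1), "t2", 120.0),
    (date(2018, 1, 1), "t2", 130.0),
]
threshold, old, new = temporal_split(sales)
```

A 1:1 ratio over all sales does not guarantee that every tract has sales on both sides of the threshold, so per-tract counts are still worth checking after the split.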