Fair evaluation scheme - Githubissues

How to fairly evaluate the partition?

A spatial train-test split is difficult, because we need every tract to maximize the unsupervised objective. The only viable solution is to split the data along the temporal dimension.

Crime prediction

We use crime data from 2010 as training data to learn the optimal partition. Note that the MCMC process does not need a leave-one-out error. We can simply use the training fitness measure to search for the optimal partition.

Given the optimal partition, we test on the 2011 crime data.
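
The temporal split above can be sketched roughly as follows. This is only an illustration: the record layout `(tract_id, year, count)`, the function names, and the toy partition are assumptions, not the repository's actual data format or code.

```python
from collections import defaultdict

def split_by_year(records, train_year=2010, test_year=2011):
    """Sum crime counts per tract, separately for the training and test years."""
    train, test = defaultdict(int), defaultdict(int)
    for tract_id, year, count in records:
        if year == train_year:
            train[tract_id] += count
        elif year == test_year:
            test[tract_id] += count
    return dict(train), dict(test)

def aggregate_by_area(tract_counts, partition):
    """Roll tract-level counts up to the areas of a partition (tract -> area)."""
    totals = defaultdict(int)
    for tract_id, count in tract_counts.items():
        totals[partition[tract_id]] += count
    return dict(totals)

# Hypothetical toy records: (tract_id, year, crime count).
records = [
    ("t1", 2010, 5), ("t1", 2010, 1), ("t1", 2011, 7),
    ("t2", 2010, 3), ("t2", 2011, 2),
]
train_counts, test_counts = split_by_year(records)

# Stand-in for a partition learned from train_counts only.
partition = {"t1": "A", "t2": "A"}
test_by_area = aggregate_by_area(test_counts, partition)
```

The point of the design is that the partition is searched using `train_counts` alone, and the 2011 counts are touched only after the partition is fixed.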

House price

The house price dataset has a sold date field. We can split at a chosen date and calculate two average house prices, one from sales before that date and one from sales after it.
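
A minimal sketch of the date split, assuming each sale is a `(sold_date, price)` pair; the cutoff date, the function name, and the sample values are hypothetical.

```python
from datetime import date

def average_prices_by_cutoff(sales, cutoff):
    """Return (mean price before cutoff, mean price on or after cutoff)."""
    before = [price for sold, price in sales if sold < cutoff]
    after = [price for sold, price in sales if sold >= cutoff]

    def mean(xs):
        # None signals that no sale fell on that side of the cutoff.
        return sum(xs) / len(xs) if xs else None

    return mean(before), mean(after)

# Hypothetical sales records.
sales = [
    (date(2010, 3, 1), 200000.0),
    (date(2010, 9, 15), 220000.0),
    (date(2011, 2, 1), 250000.0),
]
avg_before, avg_after = average_prices_by_cutoff(sales, date(2011, 1, 1))
```

In practice the same computation would be done per tract or per area, so that the "before" average serves as a training feature and the "after" average as the test target.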

Significance of the optimal partition

With the optimal partition, we can use a permutation test to calculate the p-value. One permutation is defined as randomly selecting one tract and flipping its CA assignment.
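
The permutation test could look roughly like the sketch below. Several pieces are assumptions rather than the issue's stated method: the fitness measure (within-area sum of squared deviations, lower is better) is a stand-in, each permutation is applied independently to the optimal partition rather than cumulatively, at least two areas are assumed to exist, and the add-one correction in the p-value is a common convention chosen here for illustration.

```python
import random

def within_area_sse(partition, values):
    """Hypothetical fitness: within-area sum of squared deviations (lower is better)."""
    groups = {}
    for tract, area in partition.items():
        groups.setdefault(area, []).append(values[tract])
    total = 0.0
    for vals in groups.values():
        m = sum(vals) / len(vals)
        total += sum((v - m) ** 2 for v in vals)
    return total

def flip_one_tract(partition, rng):
    """One permutation: pick a random tract and move it to a different area."""
    permuted = dict(partition)
    tract = rng.choice(sorted(permuted))
    other_areas = sorted(set(permuted.values()) - {permuted[tract]})
    permuted[tract] = rng.choice(other_areas)
    return permuted

def permutation_p_value(partition, values, n_permutations=999, seed=0):
    """Share of single-flip permutations that fit at least as well as the original."""
    rng = random.Random(seed)
    observed = within_area_sse(partition, values)
    as_good = sum(
        within_area_sse(flip_one_tract(partition, rng), values) <= observed
        for _ in range(n_permutations)
    )
    # Add-one correction keeps the estimate strictly positive.
    return (as_good + 1) / (n_permutations + 1)

# Hypothetical toy data: tract-level values and a partition into two areas.
values = {"a": 1.0, "b": 1.2, "c": 5.0, "d": 5.1}
optimal = {"a": 0, "b": 0, "c": 1, "d": 1}
p_value = permutation_p_value(optimal, values)
```

A small p-value here means that almost no single-tract flip matches the fitness of the optimal partition, which is the sense in which the partition is significant.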

thekingofkings / chicago-partition

Fair evaluation scheme #7

The 9 average house price features also follow the temporal train-test split.