BayAreaMetro / bayarea_urbansim

Bay Area Version of the UrbanSim Model
http://bayareametro.github.io/bayarea_urbansim

Create consistent model runs #334

Closed theocharides closed 10 months ago

theocharides commented 10 months ago

The goal of this PR was to identify any causes of run-to-run variation in model results. While some BAUS sub-models are stochastic, when the random seed is set, model results should be identical.
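As a minimal illustration of that expectation (not BAUS code), the sketch below uses a hypothetical `toy_stochastic_step` function built on Python's standard `random` module. With the same seed, a stochastic step should return exactly the same draws on every run.

```python
import random


def toy_stochastic_step(seed, n=5):
    """Hypothetical stand-in for a stochastic sub-model: n seeded random draws."""
    rng = random.Random(seed)  # private generator, so no global state leaks in
    return [rng.random() for _ in range(n)]


# Two runs with the same seed produce identical draws.
run_a = toy_stochastic_step(seed=42)
run_b = toy_stochastic_step(seed=42)
```

If `run_a` and `run_b` ever differed, the variation would have to come from something other than the seeded random draws, which is the kind of cause this PR was looking for.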

The following cause of variation was found and remedied: BAUS reads in a JSON file that specifies parcels to mark as nodev. A nearest neighbor function was being used to translate between the locations in this file and the corresponding Parcel IDs in the BAUS parcels table. This function sometimes returned the full set of locations from the JSON file but other times dropped most of them. Because the JSON file already contains a Parcel ID column, the nearest neighbor step was removed from the model run. It was confirmed that the JSON Parcel IDs and the nearest neighbor Parcel IDs match one another.
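A rough sketch of the direct-ID approach, not the actual BAUS code: the field name `parcel_id`, the record layout, and the helper `nodev_parcel_ids` are all assumptions made for illustration. The point is that reading the ID straight from the file is deterministic and fails loudly if an ID is missing from the parcels table, rather than silently dropping it.

```python
import json


def nodev_parcel_ids(json_text, parcel_index):
    """Return sorted parcel IDs to mark as nodev, matched directly by ID.

    json_text: JSON array of records, each with a "parcel_id" key (assumed name).
    parcel_index: iterable of parcel IDs present in the parcels table.
    """
    records = json.loads(json_text)
    ids = {int(rec["parcel_id"]) for rec in records}
    missing = ids - set(parcel_index)
    if missing:
        # Fail loudly instead of silently dropping parcels.
        raise KeyError(f"IDs not in parcels table: {sorted(missing)}")
    return sorted(ids)


# Hypothetical sample data: two nodev records and a four-parcel table.
sample_json = (
    '[{"parcel_id": 3, "x": 1.0, "y": 2.0},'
    ' {"parcel_id": 1, "x": 0.5, "y": 0.1}]'
)
parcels = [1, 2, 3, 4]
nodev = nodev_parcel_ids(sample_json, parcels)
```

Sorting the result keeps the output order stable from run to run, regardless of the order of records in the file.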

To test the changes, the run logs of five BAUS runs were compared. The logs were identical, indicating that each BAUS model step was producing the same results.
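One way such a comparison could be automated is sketched below. This is not the procedure actually used in the PR. It assumes each run's log is available as a string and that the logs contain no run-specific content such as timestamps, which would need to be stripped first. The helper names are hypothetical.

```python
import hashlib


def log_digest(text):
    """SHA-256 digest of one run log's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def logs_identical(log_texts):
    """True if every log in the sequence has the same digest."""
    digests = {log_digest(t) for t in log_texts}
    return len(digests) == 1


# Hypothetical log contents standing in for five runs.
same_runs = ["step a: 10 parcels\nstep b: 5 units\n"] * 5
drifted = same_runs[:4] + ["step a: 9 parcels\nstep b: 5 units\n"]

all_match = logs_identical(same_runs)
drift_detected = not logs_identical(drifted)
```

Hashing tells you only whether the logs differ. To find the first model step where two runs diverge, a line-by-line diff (for example with Python's `difflib`) would be the natural follow-up.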


akselx commented 10 months ago

Impressive you could pin it down to this step - and, interesting / surprising: So nearest neighbor returned different sets from run to run?

theocharides commented 10 months ago

@akselx yes, it was mostly a digging exercise, but luckily it popped up pretty early in the model run, so it was fast to test and debug. I have not yet looked into what was happening when nearest neighbor was dropping parcels, however.

Will need to do some more runs now to see whether this solves it or whether there are other sources of inconsistency in the rest of the run.