Dealing with estimation datasets within the orca framework

psrc / urbansim2

Dealing with estimation datasets within the orca framework #103

Open hanase opened 6 years ago

hanase commented 6 years ago

Figure out how a variable or model defined for one dataset, say persons, can be used on a dataset with a different name, say persons_for_estimation.

Maybe by injecting the alternate dataset under the original name, or by joining the two datasets?
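A minimal sketch of the first idea (registering the alternate dataset under the original name) could look like the following. It uses a toy, pure-Python registry as a stand-in; `TABLES`, `VARIABLES`, `add_table`, `variable`, `compute`, `registered_as` and the `is_adult` variable are all hypothetical names for illustration and are not orca's API.

```python
from contextlib import contextmanager

# Toy stand-in for a table registry. All names here are hypothetical;
# this is NOT orca's actual API.
TABLES = {}      # table name -> list of row dicts
VARIABLES = {}   # (table name, variable name) -> function(row) -> value


def add_table(name, rows):
    """Register a table (a list of row dicts) under a name."""
    TABLES[name] = rows


def variable(table_name, var_name):
    """Decorator that registers a computed variable for a table name."""
    def decorator(func):
        VARIABLES[(table_name, var_name)] = func
        return func
    return decorator


def compute(table_name, var_name):
    """Evaluate a variable on whatever is currently registered as table_name."""
    func = VARIABLES[(table_name, var_name)]
    return [func(row) for row in TABLES[table_name]]


@contextmanager
def registered_as(alias, source):
    """Temporarily expose the table `source` under the name `alias`."""
    missing = object()
    previous = TABLES.get(alias, missing)
    TABLES[alias] = TABLES[source]
    try:
        yield
    finally:
        # Restore whatever was registered under the alias before.
        if previous is missing:
            del TABLES[alias]
        else:
            TABLES[alias] = previous


# The variable is defined once, for "persons" only.
@variable("persons", "is_adult")
def is_adult(row):
    return row["age"] >= 18


add_table("persons", [{"age": 40}, {"age": 10}, {"age": 70}])
add_table("persons_for_estimation", [{"age": 15}, {"age": 33}])

simulation_values = compute("persons", "is_adult")
with registered_as("persons", "persons_for_estimation"):
    estimation_values = compute("persons", "is_adult")
restored_values = compute("persons", "is_adult")
```

The variable definition stays untouched; only the name-to-data binding changes for the duration of the estimation run, and the simulation table is restored afterwards.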

stefancoe commented 6 years ago

I wonder if it makes sense to have separate versions of datasources.py and models.py for estimation and simulation. The reason is that datasources.py imports all the data, and that data differs depending on whether we are running in estimation or simulation mode. I think this would be analogous to having two different configuration XML files in UrbanSim 1.

stefancoe commented 6 years ago

Peter is going to write up a description of how the lag variables are created, so we'll have a better understanding of how the estimation dataset differs from the one used for simulation.

hanase commented 6 years ago

Thanks Stefan! I wonder how much code duplication this would require. But yes, for the estimation we need to figure out how to use lag variables (in both simulation and estimation) before doing any major surgery.
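On the duplication question, one possible alternative to two copies of datasources.py is a single mode-keyed mapping from the name the models use to the table that is actually read. This is only a sketch: `COMMON_TABLES`, `MODE_TABLES`, `tables_to_load`, the table lists and `households_for_estimation` are assumptions for illustration, not what the repository does.

```python
# Hypothetical table lists, for illustration only.
COMMON_TABLES = ["buildings", "parcels"]

# Name used by models -> name of the table to read, per run mode.
MODE_TABLES = {
    "simulation": {
        "persons": "persons",
        "households": "households",
    },
    "estimation": {
        "persons": "persons_for_estimation",
        "households": "households_for_estimation",  # assumed name
    },
}


def tables_to_load(mode):
    """Return {name models refer to: name of the table to read} for a mode."""
    if mode not in MODE_TABLES:
        raise ValueError("unknown mode: %r" % (mode,))
    # Tables shared by both modes are read under their own name.
    mapping = {name: name for name in COMMON_TABLES}
    # Mode-specific tables may be read from a differently named source.
    mapping.update(MODE_TABLES[mode])
    return mapping


estimation_mapping = tables_to_load("estimation")
simulation_mapping = tables_to_load("simulation")
```

With something like this, only the mapping differs between modes, and the loading code itself is written once.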

hanase commented 6 years ago

@stefancoe - the estimation of REPM (and probably other models) is not working due to missing datasets. That is because these datasets are in the 2000 cache and not in the 2014 one. I suggest consolidating the estimation cache into two years:

2014 has all base year tables and datasets for estimation
2009 has:
- lag households table of all HHs + HHs for estimation with their previous locations
- lag buildings table

I can work on it.
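To illustrate what the 2009 lag households table would be used for, here is a small sketch that looks up each estimation household's previous location and flags households absent from the lag table. The function name, the dict layout (household id mapped to building id) and the sample ids are made up for illustration.

```python
def previous_locations(estimation_households, lag_households):
    """Look up previous building ids for the estimation households.

    Both arguments map household_id -> building_id (assumed layout).
    Returns (household_id -> previous building_id, ids missing from the lag table).
    """
    previous = {}
    missing = []
    for household_id in estimation_households:
        if household_id in lag_households:
            previous[household_id] = lag_households[household_id]
        else:
            # Household has no record in the lag year.
            missing.append(household_id)
    return previous, missing


# Made-up sample data: 2014 locations and 2009 (lag) locations.
households_2014 = {1: 101, 2: 102, 3: 103}
lag_households_2009 = {1: 201, 2: 102}

previous, missing = previous_locations(households_2014, lag_households_2009)
```

A non-empty `missing` list would indicate that the lag table does not yet cover all households used in estimation.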