centreformicrosimulation / SimPaths

SimPaths is an open-source microsimulation framework for life course analysis, developed and maintained by CeMPA at the University of Essex
European Union Public License 1.2

Clean-up of country variants #67

Open justin-ven opened 2 months ago

justin-ven commented 2 months ago

The current approach to distinguishing between countries appears fragile and introduces computational difficulties. Two obvious examples are:

  1. Country parameters: At present, country-specific parameters are defined in separate tabs of shared Excel files, distinguished by tab names ending in "_CC". This will become unmanageable if the number of countries increases even moderately.
  2. Database administration: We currently follow a pattern whereby country-specific data provided in CSV format are loaded into "preliminary" database tables, with countries distinguished by appending "_CC" to the table name. Each simulation then makes a "live" copy of the table for the desired country, saved under the same name without the "_CC" suffix. Model objects are designed to integrate with the live copies. This pattern seems highly inefficient: it limits pre-processing of input data and necessitates repeated read-write activity to disk.

Recommendations:

  1. Country parameters: Move to a pattern whereby country-specific parameters are saved in dedicated subdirectories of the model "input" directory.
  2. Database administration: Make better use of relational database functionality by adding attributes to model objects so that the database can be filtered as desired. Where the number of attributes is small (e.g. just country and year), these could be added directly to the associated objects (e.g. household objects). More complex restrictions could be managed by adding new "administrator" objects.
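
To make the two recommendations concrete, here is a rough sketch of what they might look like in Java. Everything in it is hypothetical: the class name `CountryVariantSketch`, the `Country` codes, the `Household` fields, the file name `parameters.xlsx` and the `input/<CC>/` layout are illustrative assumptions, not taken from the SimPaths code base. The in-memory `select` method only stands in for a database filter; in a relational database the same selection would be a `WHERE` clause on country and year columns of a single shared table.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: class, method and file names are illustrative,
// not taken from the SimPaths code base.
public class CountryVariantSketch {

    // Illustrative country codes standing in for the "_CC" suffix.
    enum Country { UK, IT }

    // Recommendation 1: resolve a parameter file inside a per-country
    // subdirectory of the input directory, i.e. <inputDir>/<CC>/<fileName>.
    static Path parameterFile(Path inputDir, Country country, String fileName) {
        return inputDir.resolve(country.name()).resolve(fileName);
    }

    // Recommendation 2: a household that carries country and year as
    // attributes, so one shared table can hold all countries and years.
    static final class Household {
        final long id;
        final Country country;
        final int year;

        Household(long id, Country country, int year) {
            this.id = id;
            this.country = country;
            this.year = year;
        }
    }

    // In-memory stand-in for a database filter on (country, year).
    static List<Household> select(List<Household> all, Country country, int year) {
        List<Household> out = new ArrayList<>();
        for (Household h : all) {
            if (h.country == country && h.year == year) {
                out.add(h);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Path file = parameterFile(Paths.get("input"), Country.UK, "parameters.xlsx");
        System.out.println("Parameter file: " + file);

        List<Household> all = List.of(
                new Household(1L, Country.UK, 2019),
                new Household(2L, Country.IT, 2019),
                new Household(3L, Country.UK, 2020));
        System.out.println("Selected households: " + select(all, Country.UK, 2019).size());
    }
}
```

With this shape, adding a country would mean adding a subdirectory and rows tagged with the new code, rather than new Excel tabs and new "_CC" tables, and no "live" copy of a table would need to be written before each run.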