ActivitySim / activitysim

An Open Platform for Activity-Based Travel Modeling
https://activitysim.github.io
BSD 3-Clause "New" or "Revised" License

Improve Data Model to Effectively Store, Query, Visualize, and QAQC core ABM Inputs and Outputs #728

Open joecastiglione opened 9 months ago

joecastiglione commented 9 months ago

Develop a simple data model that allows effective storage, querying, visualization, and QAQC of core ABM inputs and outputs. The data model should be expandable so that each agency can add region-specific inputs and outputs.

Roadmap text: Version 1.4 of ActivitySim began using pandera to validate input settings and Pydantic to validate model system inputs. In Version 1.5, the Consortium will publish a data model template that includes ActivitySim inputs, annotations, and outputs. The template will create documentation that will allow model owners to easily document the specifics of their model. Pydantic “property” methods will be used to replace some of the expressions that currently live in ActivitySim’s component “annotations”. In future releases, the data model will be integrated into the example models maintained by the Consortium.
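
To illustrate the Pydantic "property" idea, here is a minimal sketch, not the Consortium's template. The `Household` model, its field names, and the two derived variables are hypothetical and chosen only for illustration. It assumes that an annotation expression such as `hhsize - num_workers` could instead be documented and computed as a property on a data model class.

```python
from pydantic import BaseModel


class Household(BaseModel):
    """Hypothetical household record; field names are illustrative only."""

    household_id: int
    hhsize: int
    num_workers: int
    auto_ownership: int

    @property
    def num_non_workers(self) -> int:
        # Stands in for an annotation expression like: hhsize - num_workers
        return self.hhsize - self.num_workers

    @property
    def autos_lt_workers(self) -> bool:
        # Stands in for an annotation expression like: auto_ownership < num_workers
        return self.auto_ownership < self.num_workers


hh = Household(household_id=1, hhsize=3, num_workers=2, auto_ownership=1)
print(hh.num_non_workers)   # 1
print(hh.autos_lt_workers)  # True
```

One possible benefit of this style is that the derived variable's name, type, and definition live in one place next to the fields it depends on, so generated documentation could pick them up together.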

May also consider a Data Model Auditing Utility. To facilitate the integration of the data model into the ActivitySim model specifications, a utility will be created to audit an ActivitySim modeling system against a user-defined data model. The tool will identify variables that are used in ActivitySim but not defined in the data model. The goal of this tool is to motivate greater discipline around the use of variable names in model specifications, which should result in improved documentation of the modeling system via the data model.
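
A rough sketch of what such an audit could look like, using only the Python standard library. This is not an existing ActivitySim utility. The function names (`names_used`, `audit`), the example expressions, and the field list are all hypothetical, and the sketch assumes that spec expressions are plain Python expressions that `ast` can parse.

```python
import ast


def names_used(expression):
    """Return the set of bare variable names referenced in a Python expression."""
    tree = ast.parse(expression, mode="eval")
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}


def audit(expressions, data_model_fields, known_helpers=frozenset({"np"})):
    """Map each name missing from the data model to the expressions that use it."""
    undefined = {}
    for expr in expressions:
        missing = names_used(expr) - set(data_model_fields) - known_helpers
        for name in sorted(missing):
            undefined.setdefault(name, []).append(expr)
    return undefined


# Hypothetical data model fields and spec expressions, for illustration only.
DATA_MODEL_FIELDS = {"hhsize", "num_workers", "income"}
SPEC_EXPRESSIONS = [
    "hhsize - num_workers",
    "np.log(income + 1)",
    "auto_ownership < num_workers",
]

report = audit(SPEC_EXPRESSIONS, DATA_MODEL_FIELDS)
print(report)  # {'auto_ownership': ['auto_ownership < num_workers']}
```

A real utility would also need to handle spec-specific syntax and temporary variables defined within a spec, which this sketch ignores.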

Additional Description:

Partially complete: saving intermediate results to the pipeline after each submodel step is now optional. Additional pipeline-oriented improvements can be incorporated into the reporting task.

joecastiglione commented 9 months ago

Agency Comments:

SANDAG: agreed. Maybe we can think about a data model with storage, effective querying, visualization, reporting and QAQC in mind (see item 39).