nismod / smif

Simulation Modelling Integration Framework
http://www.itrc.org.uk
MIT License

smif prepare-convert will ignore combined datasets #389

Open tomalrussell opened 5 years ago

tomalrussell commented 5 years ago

An unintended feature of our method for reading data arrays from CSVs is that multiple data variables can be stored in extra columns in a single file.

E.g. population and GVA might share a region dimension and be defined over the same timesteps, so a CSV with timestep,region,pop,gva as a header could be read to load a pop data array or a gva data array.
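To make the layout concrete, here is a minimal sketch of reading either variable out of one combined CSV. It uses only the standard library, not smif's own reader; the function name read_variable, the sample values and the region names are made up for illustration.

```python
import csv
import io

# Made-up sample data using the header from the example above.
CSV_TEXT = """timestep,region,pop,gva
2020,north,100,1.5
2020,south,200,2.5
2025,north,110,1.7
2025,south,210,2.9
"""


def read_variable(csv_text, variable, dims=("timestep", "region")):
    """Return {(timestep, region): value} for one variable column.

    Hypothetical helper: any other variable columns in the file are ignored.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {tuple(row[d] for d in dims): float(row[variable]) for row in reader}


# The same file yields two separate data arrays.
pop = read_variable(CSV_TEXT, "pop")
gva = read_variable(CSV_TEXT, "gva")
```

Each call picks out the dimension columns plus one value column, which is why a combined file works on read even though it was never designed for.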

The smif prepare-convert command reads all data arrays associated with a model run and writes them to parquet, one by one. When a CSV file contains more than one data array, the corresponding parquet file is overwritten once per data array, so it ends up containing only the last data array to be read and re-written; the others are lost from the converted output.
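The overwrite can be sketched as follows. This is a simulation under stated assumptions, not smif's conversion code: a dict stands in for the filesystem, the output path is assumed to be the CSV path with its extension swapped, and the file name data/economy.csv is hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical model-run inputs: (variable name, source CSV path).
# Both data arrays live in the same combined CSV.
data_arrays = [("pop", "data/economy.csv"), ("gva", "data/economy.csv")]

written = {}      # stand-in for the filesystem: output path -> columns stored
write_count = {}  # how many times each output path was written

for variable, csv_path in data_arrays:
    # Assumption: one parquet file per CSV file, same name, new extension.
    out_path = str(PurePosixPath(csv_path).with_suffix(".parquet"))
    # Each write replaces whatever was stored at that path before.
    written[out_path] = ["timestep", "region", variable]
    write_count[out_path] = write_count.get(out_path, 0) + 1

print(written)
print(write_count)
```

Because both data arrays map to the same output path, the second write replaces the first and only the gva column survives in this sketch.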

Approaches: