OpenSenseAction / OS_data_format_conventions

Code and example files to illustrate standard data formats and conventions derived from the OpenSense Action
BSD 3-Clause "New" or "Revised" License

Make helper function to transform standard csv format to netcdf #16

Open bwalraven opened 3 months ago

bwalraven commented 3 months ago

Most users with CML data in csv files have a separate metadata file. The CML data itself is typically stored either as one large file for the entire network (possibly split into multiple csv files, one per time interval) or as multiple csv files, one per sublink.

Function: The idea is to create a function that helps users transform their csv files to netcdf, following the OPENSENSE data format conventions. The function would require the metadata to be in a standard format that can be linked to the actual CML data. Putting the metadata into this standard format would be the user's own responsibility.

Metadata: Proposed csv metadata format (includes all the required, recommended, and optional fields from the WG1 paper): (image: metadata_format)

This format treats the sublink as the lowest level, i.e. each row refers to one sublink. The required parameters from the white paper will be hard-coded. Recommended and optional parameters from the white paper should be read by the function when present. Parameters not included in the white paper will not be supported, to limit the complexity of the function.
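To make the required/optional split concrete, here is a minimal sketch of how the metadata reader could behave, using only the Python standard library. The function name `read_metadata` and the field lists `REQUIRED` and `OPTIONAL` are placeholders for illustration; the authoritative field lists are the ones in the white paper, and treating `couple_id` as required is an assumption.

```python
import csv
import io

# Placeholder field lists for illustration only; the authoritative lists
# are the ones defined in the white paper.
REQUIRED = ("cml_id", "sublink_id", "couple_id")
OPTIONAL = ("polarization", "length")


def read_metadata(csv_file, required=REQUIRED, optional=OPTIONAL):
    """Read sublink-level metadata rows from an open csv file.

    Required columns must be present, optional columns are kept when
    present, and any other column is silently ignored.
    """
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    missing = [col for col in required if col not in header]
    if missing:
        raise ValueError(f"metadata is missing required columns: {missing}")
    keep = list(required) + [col for col in optional if col in header]
    return [{col: row[col] for col in keep} for row in reader]


# Example: 'vendor' is not a supported field, so it is dropped.
example = io.StringIO(
    "cml_id,sublink_id,couple_id,length,vendor\n"
    "cml_1,sublink_1,A001,1200,acme\n"
    "cml_1,sublink_2,A002,1200,acme\n"
)
rows = read_metadata(example)
```

Each entry in `rows` corresponds to one sublink. Raising on a missing required column, rather than filling it with a default, keeps the responsibility for the metadata with the user, as proposed above.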

CML data and coupling: The function should then couple the metadata to the CML data files to create one large netcdf. We propose that this coupling relies on the column 'couple_id' in the metadata file. If, for example, the sublink_id already appears in the file name of each CML data file, it can be used as the couple_id without having to rename many csv files. The function should support the two most common data layouts: 1) CML data stored in one file for the entire network, possibly with multiple files for different time intervals, and 2) data stored in separate csv files per sublink.
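The two coupling modes could look roughly like the sketch below, again using only the standard library. The function names, the `id_column` parameter, and the example file names are hypothetical. In particular, it is an assumption that the network-wide csv carries the coupling key in a column; the proposal does not specify this.

```python
import csv
import io


def couple_by_filename(couple_ids, file_names):
    """Layout 2: one csv file per sublink.

    A file is matched when the couple_id occurs in its file name.
    """
    matches = {}
    for cid in couple_ids:
        hits = [name for name in file_names if cid in name]
        if len(hits) != 1:
            raise ValueError(
                f"couple_id {cid!r} matched {len(hits)} files, expected 1"
            )
        matches[cid] = hits[0]
    return matches


def couple_by_column(couple_ids, csv_file, id_column="couple_id"):
    """Layout 1: one csv file for the whole network.

    Rows are grouped by the value found in id_column (assumed column name).
    """
    groups = {cid: [] for cid in couple_ids}
    for row in csv.DictReader(csv_file):
        cid = row[id_column]
        if cid in groups:
            groups[cid].append(row)
    return groups


# Layout 2: hypothetical per-sublink file names.
files = ["rsl_A001_2023.csv", "rsl_A002_2023.csv"]
by_file = couple_by_filename(["A001", "A002"], files)

# Layout 1: hypothetical network-wide file, held in memory here.
network = io.StringIO(
    "time,couple_id,rsl,tsl\n"
    "2023-01-01T00:00,A001,-45.1,10\n"
    "2023-01-01T00:00,A002,-46.3,10\n"
    "2023-01-01T00:01,A001,-45.4,10\n"
)
by_column = couple_by_column(["A001", "A002"], network)
```

Plain substring matching is naive: a couple_id such as `A1` would also match a file named for `A10`, which is why the sketch raises when a key does not match exactly one file. Writing the coupled result to netcdf is left out here; it could be done with a library such as xarray once the data is grouped per sublink.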

cchwala commented 3 months ago

For reference, this is related to https://github.com/OpenSenseAction/poligrain/issues/34