UtrechtUniversity / autumn-fair

Follow up on Summer-Fair.
MIT License

Flatten _validation.toml #2

Open chStaiger opened 1 month ago

chStaiger commented 1 month ago

Reading in the validation.toml produces a deeply nested structure of dictionaries and lists of dictionaries. We need to discuss how to flatten it a bit. @StefanoRapisarda, could you please look at how I use the TOML in the code and suggest ways to flatten the data structure?

chStaiger commented 1 month ago

Here is how I read in the toml file: https://github.com/UtrechtUniversity/autumn-fair/blob/200f234f2fb1496f195cc1fa515411c114134076/src/check_data.py#L95

And this is the code that turns the information into a workable Python data structure: https://github.com/UtrechtUniversity/autumn-fair/blob/200f234f2fb1496f195cc1fa515411c114134076/src/check_data.py#L12

StefanoRapisarda commented 1 month ago

I made a new version of metadata for validation ("_validation_schema_v2.toml") and renamed the old one (v1).

This is an example of the v2 file structure:

The levels are file --> column --> attribute. So, to access the type of the host_id column in the events file, you should use metadata["events"]["host_id"]["type"].

Not all columns have all the possible attributes, so I added the complete list of possible attributes at the very beginning of the file (metadata_keys). Compared to the previous version, this structure removes a layer, with the caveat that you now need to check for the keywords listed in metadata_keys.
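
One possible way to handle that caveat is sketched below. It assumes metadata_keys sits at the top level next to the file tables; the helper names get_attribute and iter_columns are hypothetical, and the dictionary stands in for the parsed TOML rather than reproducing the real file.

```python
from typing import Any, Iterator

# Hypothetical stand-in for the parsed v2 metadata; attribute names other
# than "type" are assumptions for illustration.
metadata: dict[str, Any] = {
    "metadata_keys": ["type", "required", "unit"],
    "events": {
        "host_id": {"type": "str", "required": True},
        "weight": {"type": "float", "unit": "kg"},
    },
}


def get_attribute(metadata: dict[str, Any], file_name: str, column: str, key: str) -> Any:
    """Return the attribute value, or None if this column does not define it."""
    if key not in metadata["metadata_keys"]:
        raise KeyError(f"{key!r} is not listed in metadata_keys")
    return metadata[file_name][column].get(key)


def iter_columns(metadata: dict[str, Any]) -> Iterator[tuple[str, str, dict[str, Any]]]:
    """Yield (file, column, attributes), skipping the top-level metadata_keys entry."""
    for file_name, columns in metadata.items():
        if file_name == "metadata_keys":
            continue
        for column, attributes in columns.items():
            yield file_name, column, attributes


for file_name, column, attributes in iter_columns(metadata):
    print(file_name, column, sorted(attributes))
```

Returning None for a missing attribute keeps the calling code free of repeated membership checks, while the KeyError catches misspelled attribute names early.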