dined-io / dyned

BSD 3-Clause "New" or "Revised" License

Export the DINED database as standard-format CSV files and share them on the 4TU repository. #21

toonhuysmans closed this issue 2 years ago

toonhuysmans commented 2 years ago

This would allow us to work with the DINED data more easily for the time being. Once we have an open API for accessing the DINED database, this will no longer be necessary.

jurra commented 2 years ago

What could the dataset look like?

What should a study look like?

Each row is an individual:

| individual_id | sex | units | age | study_id | measure_n | measure_n+1 | measure_n_unit |
| --- | --- | --- | --- | --- | --- | --- | --- |
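The one-row-per-individual layout above can be sketched with Python's standard `csv` module. This is only an illustration: `stature` and `stature_unit` are hypothetical stand-ins for `measure_n` and `measure_n_unit`, `write_study_csv` is a made-up helper, and the row values are placeholders, not real DINED data.

```python
import csv
import io

# Hypothetical column names: "stature" stands in for measure_n and
# "stature_unit" for measure_n_unit from the proposed layout.
FIELDNAMES = ["individual_id", "sex", "age", "study_id", "stature", "stature_unit"]


def write_study_csv(rows, handle):
    """Write one row per individual, preceded by a header row."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


# Placeholder values for illustration only, not real DINED data.
rows = [
    {
        "individual_id": 1,
        "sex": "f",
        "age": 30,
        "study_id": "example_study",
        "stature": 1650.0,
        "stature_unit": "mm",
    },
]

buffer = io.StringIO()
write_study_csv(rows, buffer)
csv_text = buffer.getvalue()
```

Writing to an in-memory buffer keeps the sketch self-contained; passing an open file handle instead would produce the per-study CSV file.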

Ideas and options:

  1. Separate each study, with its name and metadata (this can be in a folder or at the root).
  2. Use separate files for each study, with one for the metadata. Here, metadata means the descriptions of the measures.
  3. Each study also has a description, which could be a separate metadata or summary file. (It is currently in JSON; perhaps it could be turned into a more README-style format.)

measures.json could be a specification, perhaps a DINED specification. Think of it as a specification that could itself be described with JSON Schema, for instance.

measure_n could be typed as well. For instance, we could type-check certain measures: the value can only be positive, or a float, and it needs to be of a certain class, such as a unit like mm?
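A minimal sketch of the type-checking idea, assuming a measure is described by a name and an expected unit. `MeasureSpec` and `check_measure` are hypothetical names, not part of any existing DINED code, and the rules shown (numeric, positive, matching unit) are just the ones mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeasureSpec:
    """Hypothetical description of one measure: its name and expected unit."""
    name: str
    unit: str


def check_measure(spec, value, unit):
    """Return a list of problems found; an empty list means the value passes."""
    problems = []
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        problems.append(f"{spec.name}: value must be a number")
    elif value <= 0:
        problems.append(f"{spec.name}: value must be positive")
    if unit != spec.unit:
        problems.append(f"{spec.name}: expected unit {spec.unit!r}, got {unit!r}")
    return problems


# "stature" in millimetres is an assumed example measure.
STATURE = MeasureSpec(name="stature", unit="mm")

ok = check_measure(STATURE, 1650.0, "mm")
bad = check_measure(STATURE, -1.0, "cm")
```

Returning a list of problems rather than raising on the first one makes it possible to report every issue in a row at once.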

jurra commented 2 years ago

This has been solved, along with the generation of metadata that describes the set of standardized files as a data package, using the Frictionless standards and the Frictionless framework.
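The thread does not show the generated metadata, so the following is only a rough sketch of what a Frictionless Data Package descriptor for one study file might contain. It is built as a plain dictionary with the standard library rather than with the Frictionless framework itself, and the package name, file path, and field list are all assumptions.

```python
import json

# Hypothetical descriptor: the names, path, and fields are illustrative only.
descriptor = {
    "name": "dined-example",
    "resources": [
        {
            "name": "example-study",
            "path": "studies/example_study.csv",
            "schema": {
                "fields": [
                    {"name": "individual_id", "type": "integer"},
                    {"name": "sex", "type": "string"},
                    {"name": "age", "type": "integer"},
                    {"name": "study_id", "type": "string"},
                    # A Table Schema constraint expressing "non-negative".
                    {
                        "name": "stature",
                        "type": "number",
                        "constraints": {"minimum": 0},
                    },
                    {"name": "stature_unit", "type": "string"},
                ]
            },
        }
    ],
}

# Serialized form, as it might be stored in a datapackage.json file.
descriptor_json = json.dumps(descriptor, indent=2)
```

Each resource pairs a CSV path with a Table Schema, so the column types and constraints travel with the data files.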