Metastring / HealthHeatMap

Database schema for health data #9

Closed Varnita-Metastring closed 4 years ago

Varnita-Metastring commented 4 years ago

Design and build a database schema for uploading health data, with flexible column annotations for framework and metadata.

asdofindia commented 4 years ago

Blocked by #3

At the moment, I think the following kind of MongoDB schema can hold the data collection:

{
  dataset: <oid of dataset in dataset collection>,
  entity: <oid of entity in entity collection>,
  indicator: <oid of indicator in indicator collection>,
  value: <value of the indicator for this entity in this dataset>
}
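
To make the shape concrete, here is a minimal in-memory sketch of a few such records and a lookup over them. Everything in it is an assumption for illustration: the placeholder string ids stand in for real ObjectIds, the values are made up, and `values_for` is a hypothetical helper rather than anything in the codebase.

```python
# Placeholder string ids stand in for MongoDB ObjectIds; all values are made up.
records = [
    {"dataset": "ds1", "entity": "state_a", "indicator": "imr", "value": 10.0},
    {"dataset": "ds1", "entity": "state_b", "indicator": "imr", "value": 32.5},
    {"dataset": "ds1", "entity": "state_a", "indicator": "mmr", "value": 46.0},
    {"dataset": "ds2", "entity": "state_a", "indicator": "imr", "value": 7.0},
]


def values_for(records, dataset, indicator):
    """Return {entity id: value} for one indicator within one dataset."""
    return {
        r["entity"]: r["value"]
        for r in records
        if r["dataset"] == dataset and r["indicator"] == indicator
    }


print(values_for(records, "ds1", "imr"))
```

With one record per (dataset, entity, indicator) triple, a heat map for a single indicator reduces to a filter on two fields.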

The three collections that this will depend on are:

1. dataset collection

{
  "name": <Name of the dataset>,
  "year": <year in which it was released>,
  "sourceUrl": <most accurate url to this dataset online>,
  "metadataUrl": <URL of a page that describes this dataset>,
  "parent": <oid of parent dataset (if any)>
}

The World Bank microdata repository uses a more complicated metadata schema, and we can learn from it as we go.

See sample: https://microdata.worldbank.org/index.php/metadata/export/2949/json

2. entity collection

{
  name: "Canonical name of entity",
  alternative_names: "array of alternative names" (optional),
  belongs_to: <array of oid of entities this entity "belongs" to in some way (geographical hierarchy)>,
  contains: <array of oid of entities that "belongs" to this entity>,
  shape: <geojson of this entity>
}

3. indicator collection

This holds information about each indicator.

{
  name: <Name of the indicator>,
  (can have further fields to be able to group indicators together)
}
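
Since `belongs_to` and `contains` in the entity collection describe the same hierarchy from two directions, one way to keep them consistent might be to store only `belongs_to` and derive the rest. A rough sketch under that assumption; the ids are placeholders for ObjectIds, and `derive_contains` and `ancestors` are hypothetical helpers, not an existing API:

```python
# Hypothetical entities keyed by placeholder ids (real documents would use ObjectIds).
entities = {
    "india": {"name": "India", "belongs_to": []},
    "kerala": {"name": "Kerala", "belongs_to": ["india"]},
    "wayanad": {"name": "Wayanad", "belongs_to": ["kerala"]},
}


def derive_contains(entities):
    """Build each entity's `contains` list as the inverse of `belongs_to`."""
    contains = {eid: [] for eid in entities}
    for eid, entity in entities.items():
        for parent in entity["belongs_to"]:
            contains[parent].append(eid)
    return contains


def ancestors(entities, eid):
    """Collect every entity reachable by following `belongs_to` upwards."""
    seen = []
    stack = list(entities[eid]["belongs_to"])
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.append(parent)
            stack.extend(entities[parent]["belongs_to"])
    return seen


print(derive_contains(entities))
print(ancestors(entities, "wayanad"))
```

Whether `contains` is stored or derived is a trade-off: storing it makes downward lookups cheap, but every insert then has to update two documents.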

asdofindia commented 4 years ago

After internal discussions, this idea has been dropped in favour of storing datasets as tables that resemble the original spreadsheet files. The additional querying complexity that this introduces will (presumably) be handled by Elasticsearch.

I'll document the new approach as it evolves into a mature schema.
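
To illustrate where the extra querying work comes from, here is a sketch of a table-shaped document and the flattening needed to get back to one value per entity and indicator. The layout is purely an assumption (the actual table schema has not been settled here): the `columns` annotations, the `role` field, and `to_long` are all hypothetical, and the values are made up.

```python
# Assumed layout: one document per dataset, with rows mirroring the spreadsheet.
table = {
    "name": "Sample dataset",
    "columns": [
        {"key": "state", "role": "entity"},
        {"key": "imr", "role": "indicator"},
        {"key": "mmr", "role": "indicator"},
    ],
    "rows": [
        {"state": "State A", "imr": 10.0, "mmr": 46.0},
        {"state": "State B", "imr": 32.5, "mmr": None},
    ],
}


def to_long(table):
    """Flatten a table into (dataset, entity, indicator, value) records, skipping blanks."""
    entity_col = next(c["key"] for c in table["columns"] if c["role"] == "entity")
    indicator_cols = [c["key"] for c in table["columns"] if c["role"] == "indicator"]
    return [
        {
            "dataset": table["name"],
            "entity": row[entity_col],
            "indicator": key,
            "value": row[key],
        }
        for row in table["rows"]
        for key in indicator_cols
        if row.get(key) is not None
    ]


for record in to_long(table):
    print(record)
```

The column annotations carry the information that the earlier schema kept in separate collections, so any cross-dataset query has to read them before it can interpret a row.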

asdofindia commented 4 years ago

Closing this in favour of #13