Metastring / HealthHeatMap


Aggregation Methods #7

Open Varnita-Metastring opened 4 years ago

Varnita-Metastring commented 4 years ago

Build aggregation methods that follow the SDG indicator framework, using the annotations of the data.

asdofindia commented 4 years ago

Here's a possible implementation of a specific, narrow subset of aggregations: composite indicators.

Imagine an indicator that is made up of other indicators. For example, let us define "maternal health index" as 1 / ("maternal mortality rate" + "perinatal mortality rate" + "proportion of women receiving Iron Folic Acid")

This is an indicator that we are defining out of existing indicators, so its values have to be computed from theirs. We could either pre-compute and store the values, or compute them dynamically every time someone requests that particular indicator.

Either way, we will need to store the definition of this composite indicator in some machine readable form. For example, we could store the above like so:

{"divide": [1, {"sum": ["maternal mortality rate", "perinatal mortality rate", "proportion of women receiving Iron Folic Acid"]}]}

In the above, we are representing the formula in a machine-readable form where divide is an operator that takes an array of [numerator, denominator] and sum is another operator that takes an array [...values to be summed]. Here we've simplified the representation of "maternal mortality rate" and the others. These should be unique references to the corresponding indicators in our database rather than strings.

With this representation of the formula, we can compute the values (either for pre-computation or for dynamic computation).

The way to do that would be to build a formula parser. This formula parser should be able to apply operations like divide and sum. It should also be able to replace the references to other indicators with values that we supply.
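To make that concrete, here is a minimal sketch of such a parser as a small recursive evaluator in Python. It assumes the formula has already been loaded from JSON into dicts and lists, that only the divide and sum operators exist, and that indicator references are still plain strings as in the simplified example above. The name `evaluate` and the sample values are made up for illustration.

```python
def evaluate(formula, values):
    """Recursively evaluate a formula tree.

    formula: a number, an indicator reference (a string here, by assumption),
             or a one-key dict such as {"sum": [...]} or {"divide": [n, d]}.
    values:  a mapping from indicator reference to its numeric value.
    """
    if isinstance(formula, bool):
        raise TypeError("booleans are not valid formula nodes")
    if isinstance(formula, (int, float)):
        return float(formula)  # numeric literal
    if isinstance(formula, str):
        return float(values[formula])  # indicator reference
    if isinstance(formula, dict) and len(formula) == 1:
        (op, args), = formula.items()
        operands = [evaluate(arg, values) for arg in args]
        if op == "sum":
            return sum(operands)
        if op == "divide":
            numerator, denominator = operands
            return numerator / denominator
        raise ValueError(f"unknown operator: {op}")
    raise TypeError(f"unsupported formula node: {formula!r}")


formula = {"divide": [1, {"sum": [
    "maternal mortality rate",
    "perinatal mortality rate",
    "proportion of women receiving Iron Folic Acid",
]}]}

# Made-up numbers, only to show the call shape.
sample_values = {
    "maternal mortality rate": 2.0,
    "perinatal mortality rate": 1.5,
    "proportion of women receiving Iron Folic Acid": 0.5,
}

print(evaluate(formula, sample_values))  # 0.25
```

Adding another operator would mean adding one more branch next to sum and divide. A zero denominator raises ZeroDivisionError here; how that should surface to users is left open.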

Now, when calling this formula parser, we need to supply the values for these indicators. This could be done by querying the database for those values (in a single query that finds all the values required).
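A rough sketch of how the single query could be prepared: walk the formula once, collect every indicator reference, and hand the whole set to the database in one go. The `lookup` callable below is hypothetical and stands in for whatever query function we end up with; it is assumed to accept a list of references and return a dict of values.

```python
def collect_references(formula):
    """Return the set of indicator references used anywhere in a formula."""
    if isinstance(formula, str):
        return {formula}
    if isinstance(formula, dict):
        refs = set()
        for args in formula.values():
            for arg in args:
                refs |= collect_references(arg)
        return refs
    return set()  # numeric literal: nothing to look up


def fetch_values(formula, lookup):
    """Fetch all values a formula needs with a single call to `lookup`.

    `lookup` is a hypothetical stand-in for one database query.
    """
    references = collect_references(formula)
    values = lookup(sorted(references))
    missing = references - set(values)
    if missing:
        raise KeyError(f"no values found for: {sorted(missing)}")
    return values


formula = {"divide": [1, {"sum": [
    "maternal mortality rate",
    "perinatal mortality rate",
    "proportion of women receiving Iron Folic Acid",
]}]}

print(sorted(collect_references(formula)))
```

The returned dict can then be passed to the formula parser as the values to substitute for each reference.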

Alternatively, we could implement the calculation within the database. Elasticsearch, for example, supports the Painless scripting language, which can be used inside a bucket to do this calculation. In that case, however, we would have to express the formula in a database-specific language (or convert the formula into a database-specific form).
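As an illustration of the conversion route, the sketch below turns the same formula into a Painless-style expression string plus a mapping from script variable names to indicator references. The function name `to_painless` and the `v0`, `v1`, ... naming scheme are invented for this example, and how the expression and mapping would be wired into an actual Elasticsearch request depends on how our indices and aggregations are laid out, which is not decided here. Numeric literals are emitted as floating point so that the division is not done on integers.

```python
def to_painless(formula, variables=None):
    """Convert a formula tree into (expression, variables).

    variables maps a generated script variable name (v0, v1, ...) to the
    indicator reference it stands for. The naming scheme is an assumption.
    """
    if variables is None:
        variables = {}
    if isinstance(formula, (int, float)) and not isinstance(formula, bool):
        return repr(float(formula)), variables
    if isinstance(formula, str):
        # Reuse the variable if this indicator was already seen.
        for name, ref in variables.items():
            if ref == formula:
                return f"params.{name}", variables
        name = f"v{len(variables)}"
        variables[name] = formula
        return f"params.{name}", variables
    (op, args), = formula.items()
    parts = [to_painless(arg, variables)[0] for arg in args]
    if op == "sum":
        return "(" + " + ".join(parts) + ")", variables
    if op == "divide":
        numerator, denominator = parts
        return f"({numerator} / {denominator})", variables
    raise ValueError(f"unknown operator: {op}")


formula = {"divide": [1, {"sum": [
    "maternal mortality rate",
    "perinatal mortality rate",
    "proportion of women receiving Iron Folic Acid",
]}]}

expression, variables = to_painless(formula)
print(expression)  # (1.0 / (params.v0 + params.v1 + params.v2))
print(variables)
```

The stored JSON definition stays database-agnostic this way; only this conversion step would need to change if we moved to a different database.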

asdofindia commented 4 years ago

We could use something like formulize (demo) to capture such formulas from the frontend (@deepkt may want to build something on his own :D).