GSA / sdg-indicators-usa

U.S. National Reporting Platform for the Sustainable Development Goals
https://sdg.data.gov

Datasets that need more aggregated columns #846

Open brockfanning opened 6 years ago

brockfanning commented 6 years ago

Looking forward to the possibility of showing disaggregations on indicators, I notice some datasets need additional aggregated columns. These datasets contain the disaggregated values (which is a great start!) but may be missing overall aggregate columns such as a sum or an average. Here is the list I came up with:

  1. https://sdg.data.gov/2-2-2/
    • includes wasting and overweight, but needs an "all" column
  2. https://sdg.data.gov/3-7-2/
    • includes ages 10-14 and 15-19, but needs an "all" column
  3. https://sdg.data.gov/4-1-1/
    • needs these columns:
      • an "all" column
      • a "reading_all" column
      • a "math_all" column
  4. https://sdg.data.gov/4-2-1/
    • includes many disaggregations already, but needs an "all" column
  5. https://sdg.data.gov/4-3-1/
    • includes many disaggregations already, but needs an "all" column
  6. https://sdg.data.gov/4-5-1/
    • includes many disaggregations already, but needs an "all" column
  7. https://sdg.data.gov/4-6-1/
    • same as 4-5-1 above
  8. https://sdg.data.gov/4-c-1/
    • includes each level of certification, but needs an "all" column
  9. https://sdg.data.gov/5-4-1/
    • includes many disaggregations already, but needs these columns:
      • an "all" column
      • columns for each age value (7 columns)
      • columns for each gender value (2 columns)
      • columns for each activity value (4 columns)
  10. https://sdg.data.gov/5-b-1/
    • includes male and female, but needs an "all" column
  11. https://sdg.data.gov/8-5-1/
    • includes many gender/age combinations, but needs these columns:
      • an "all" column (1 column)
      • columns for each age value (9 columns)
  12. https://sdg.data.gov/8-5-2/
    • includes all gender/able-bodiedness/age combinations already, but also needs these aggregated columns:
      • "all" column for the whole population (1 column)
      • columns for each age value (10 columns)
      • columns for each gender value (2 columns)
      • columns for each able-bodiedness value (2 columns)
  13. https://sdg.data.gov/8-8-1/
    • includes all fatality/gender combinations, but needs these aggregated columns:
      • "all" column (1 column)
      • columns for each fatality value (2 columns)
      • columns for each gender value (2 columns)
  14. https://sdg.data.gov/8-a-1/
    • includes columns for commitments and disbursements, but needs an "all" column
    • alternatively, if a combined total doesn't make sense, should commitments vs. disbursements become units of measurement?
  15. https://sdg.data.gov/9-1-2/
    • freight vs. passenger will be units of measurement, but this still needs 2 general columns: "freight_vol_all" and "pass_vol_all".
  16. https://sdg.data.gov/16-1-1/
    • has good disaggregation but the disaggregated columns need to use the same units of measurement
  17. https://sdg.data.gov/17-6-2/
    • includes columns for three speeds, but needs an "all" column
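
A check like the one in this list could be scripted. Below is a minimal sketch, assuming each indicator's data is a wide-format CSV with one column per disaggregation. The helper `missing_columns` and the sample header are hypothetical, for illustration only; the required names are the ones listed for 4-1-1 above.

```python
import csv
import io


def missing_columns(csv_text, required):
    """Return the required column names absent from the CSV header, in order."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# Hypothetical header for illustration; the real 4-1-1 columns may differ.
sample = "year,reading_grade4,math_grade4\n"
required = ["all", "reading_all", "math_all"]
print(missing_columns(sample, required))
```

Running this against each indicator's file would give a per-indicator list of gaps that could go straight into separate issues.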

@Kali2017SDG I may be off-base with the above, but I think it's worth checking into. Do you have any thoughts? If it helps, I could create each item as a separate GitHub issue, and we could loop in the data provider for that particular indicator.

brockfanning commented 6 years ago

It may be worth considering: can these aggregate columns be computed, rather than manually entered? This may depend on the indicator, but presumably any sum or average aggregates could be computed by the platform, which would save work for the data providers.

To take an example, the first one: https://sdg.data.gov/2-2-2/ which needs an "all" column. The platform could easily add the 2 types of malnutrition (0.6% wasting and 8.1% overweight) together to get an "all" column of 8.7%.
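
As a rough sketch of what "computed by the platform" might look like, the snippet below adds an "all" column by summing named disaggregation columns. The function `add_sum_column` and the column names `wasting` and `overweight` are assumptions for illustration, not the platform's actual code or schema, and whether a sum is the right aggregate would still depend on the indicator.

```python
from decimal import Decimal


def add_sum_column(rows, parts, total="all"):
    """Return copies of the rows with a `total` column holding the sum of `parts`."""
    result = []
    for row in rows:
        new_row = dict(row)
        # Decimal avoids binary floating-point rounding in the displayed percentage.
        new_row[total] = sum((Decimal(row[name]) for name in parts), Decimal("0"))
        result.append(new_row)
    return result


# Values from the 2-2-2 example above; column names are hypothetical.
rows = [{"wasting": "0.6", "overweight": "8.1"}]
with_total = add_sum_column(rows, ["wasting", "overweight"])
print(with_total[0]["all"])  # 8.7
```

An average or a provider-supplied total would need a different rule, so any real implementation would probably have to be configured per indicator.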

JenPark9 commented 6 years ago

Thank you, Brock. I think for the most part, we will need to ask the data providers for the suitable "total" statistic. Some of the categories are not things that conceptually should be totaled--for example, adding wasting and overweight in 2.2.2. I think this is something that Kali could reach out to individual data providers about. Let me say I really appreciate your looking at this so closely and coming up with next steps to try!

brockfanning commented 6 years ago

@JenPark9 Sounds great, thank you!