devinit / digital-platform

PostgreSQL/analyst → MongoDB → Development Data Hub
http://data.devinit.org:8888/#!/ & http://data.devinit.org/#!/

Warehouse summary, checking the new data #245

Closed xriss closed 7 years ago

xriss commented 8 years ago

I can see that some of the data is available under slightly different table names, e.g.

fact.population_by_age_0_14 -> country-year/population-0-14

@xriss, yes. The tables that have names different from the file names in the GitHub repository are (not accounting for possible typos):

 no | series       | id                       | ddw schema name | ddw table name
----+--------------+--------------------------+-----------------+---------------------------------------------
  1 | country-year | oda                      | fact            | oda_2012
  2 | country-year | oda-donor/oda-[id_from]  | fact            | oda_donor_2012 WHERE from_di_id = 'id_from'
  3 | country-year | population-0-14          | fact            | population_by_age_0_14
  4 | country-year | population-15-64         | fact            | population_by_age_15_64
  5 | country-year | population-65-           | fact            | population_by_age_65_and_above
  6 | country-year | total-revenue-pct-GDP    | data_series     | total_revenue_pct_gdp
  7 | country-year | total-revenue-PPP-capita | data_series     | total_revenue_ppp_capita

In 6 & 7 we've changed uppercase to lowercase only.

(I do a simple replace of _ with - and then a string compare of the filename to try and auto find the matching csv table, so these are not going to be picked up until I add them in explicitly.)

@xriss, OK, got it. The above will need to be added then.
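A minimal sketch of the matching step described above, assuming the warehouse tables arrive as `schema.table` strings and the CSV files as paths without the `.csv` extension. The function names, the `OVERRIDES` dictionary and the decision to lowercase both sides are assumptions for illustration, not the actual import script. The per-donor `oda-donor` files are left out because they need a filtered query rather than a name match.

```python
# Hypothetical sketch; names and structure are assumptions, not the real importer.

# Explicit overrides for tables whose names cannot be derived from the CSV
# file name by character replacement alone (rows 1 and 3-5 of the table above).
OVERRIDES = {
    "fact.oda_2012": "country-year/oda",
    "fact.population_by_age_0_14": "country-year/population-0-14",
    "fact.population_by_age_15_64": "country-year/population-15-64",
    "fact.population_by_age_65_and_above": "country-year/population-65-",
}


def normalise(name):
    """Lowercase a name and replace underscores with hyphens."""
    return name.lower().replace("_", "-")


def match_tables(tables, csv_paths, series="country-year"):
    """Pair warehouse tables with CSV paths and report the leftovers."""
    # Index only the CSV files that sit directly under the series folder.
    by_key = {}
    for path in csv_paths:
        prefix, _, stem = path.rpartition("/")
        if prefix == series:
            by_key[normalise(stem)] = path

    matched = {}
    for table in tables:
        if table in OVERRIDES:
            matched[table] = OVERRIDES[table]
            continue
        _, _, table_name = table.partition(".")
        path = by_key.get(normalise(table_name))
        if path is not None:
            matched[table] = path

    # Anything unpaired ends up in one of the two report lists.
    unused = [t for t in tables if t not in matched]
    used_paths = set(matched.values())
    missing = [p for p in csv_paths if p not in used_paths]
    return matched, unused, missing
```

Because this sketch lowercases both sides before comparing, rows 6 and 7 match without an override. The `unused` and `missing` lists correspond to the UNUSED WAREHOUSE TABLE and MISSING FROM WAREHOUSE lines in the report.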

There is the oda and oda_donor which should be split and we should use the _2012 version (right?)

@xriss, that's right.

however, it still looks like we are missing some data.

Take a look at all the other MISSING FROM WAREHOUSE lines below and please advise where that data should come from.

@xriss, OK, having a look now.

The following DW tables are not going to be used:

UNUSED WAREHOUSE TABLE: dac_country_deflator.2015_10_15
UNUSED WAREHOUSE TABLE: dac_country_deflator.2015_10_15_pivoted
UNUSED WAREHOUSE TABLE: dac_country_deflator.2016_01_20
UNUSED WAREHOUSE TABLE: dac_country_deflator.2016_01_20_pivoted
UNUSED WAREHOUSE TABLE: data.2015_09_17
UNUSED WAREHOUSE TABLE: data.2016_01_20
UNUSED WAREHOUSE TABLE: deflator.2015_oct
UNUSED WAREHOUSE TABLE: deflator.2015_oct_pivoted
UNUSED WAREHOUSE TABLE: dimension.di_id
UNUSED WAREHOUSE TABLE: dimension.di_id_to_iso_3166_1_map
UNUSED WAREHOUSE TABLE: dimension.di_itep_channel
UNUSED WAREHOUSE TABLE: dimension.di_itep_sector
UNUSED WAREHOUSE TABLE: dimension.di_oda_aid_bundle
UNUSED WAREHOUSE TABLE: dimension.imf_weo_country
UNUSED WAREHOUSE TABLE: dimension.imf_weo_country_to_di_id_map
UNUSED WAREHOUSE TABLE: dimension.iso_3166_1
UNUSED WAREHOUSE TABLE: dimension.oecd_crs_channel_code_5_digit
UNUSED WAREHOUSE TABLE: dimension.oecd_crs_channel_code_5_digit_to_di_itep_channel_map
UNUSED WAREHOUSE TABLE: dimension.oecd_crs_donor
UNUSED WAREHOUSE TABLE: dimension.oecd_crs_purpose_code_5_digit
UNUSED WAREHOUSE TABLE: dimension.oecd_crs_purpose_code_5_digit_to_di_itep_sector_map
UNUSED WAREHOUSE TABLE: dimension.oecd_crs_recipient
UNUSED WAREHOUSE TABLE: dimension.oecd_crs_sector_code_3_digit
UNUSED WAREHOUSE TABLE: dimension.oecd_crs_sector_code_3_digit_to_di_itep_sector_map
UNUSED WAREHOUSE TABLE: dimension.oecd_dac_2a_recipient
UNUSED WAREHOUSE TABLE: dimension.oecd_dac_2b_recipient
UNUSED WAREHOUSE TABLE: dimension.oecd_dac_donor
UNUSED WAREHOUSE TABLE: dimension.oecd_deflator_lookup
UNUSED WAREHOUSE TABLE: dimension.oecd_deflator_pivoted_dac_2015_10_15_non_dac_2015_09_17
UNUSED WAREHOUSE TABLE: dimension.oecd_deflator_pivoted_dac_2016_01_20_non_dac_2016_01_20
UNUSED WAREHOUSE TABLE: dimension.oecd_donor
UNUSED WAREHOUSE TABLE: dimension.oecd_donor_to_di_id_map
UNUSED WAREHOUSE TABLE: dimension.oecd_donor_to_iso_3166_1_map
UNUSED WAREHOUSE TABLE: dimension.oecd_donor_type
UNUSED WAREHOUSE TABLE: dimension.oecd_loc_donor
UNUSED WAREHOUSE TABLE: dimension.oecd_loc_recipient
UNUSED WAREHOUSE TABLE: dimension.oecd_recipient
UNUSED WAREHOUSE TABLE: dimension.oecd_recipient_income_group
UNUSED WAREHOUSE TABLE: dimension.oecd_recipient_to_di_id_map
UNUSED WAREHOUSE TABLE: dimension.oecd_recipient_to_iso_3166_1_map
UNUSED WAREHOUSE TABLE: dimension.wb_wdi_country
UNUSED WAREHOUSE TABLE: dimension.wb_wdi_country_to_di_id_map
UNUSED WAREHOUSE TABLE: dimension.wb_wdi_country_to_imf_weo_country_map
UNUSED WAREHOUSE TABLE: fact.gdp_usd_current_2012
UNUSED WAREHOUSE TABLE: fact.gni_pc_usd_current_2012
UNUSED WAREHOUSE TABLE: fact.gni_usd_current_2012
UNUSED WAREHOUSE TABLE: fact.income_share_by_quintile_2nd
UNUSED WAREHOUSE TABLE: fact.income_share_by_quintile_3rd
UNUSED WAREHOUSE TABLE: fact.income_share_by_quintile_4th
UNUSED WAREHOUSE TABLE: fact.income_share_by_quintile_5th

@kriss, that's right, the above DW tables are not needed for your purposes.

The following DW tables are not going to be used:

UNUSED WAREHOUSE TABLE: fact.oda_2012
UNUSED WAREHOUSE TABLE: fact.oda_donor_2012
UNUSED WAREHOUSE TABLE: fact.population_by_age_0_14
UNUSED WAREHOUSE TABLE: fact.population_by_age_15_64
UNUSED WAREHOUSE TABLE: fact.population_by_age_65_and_above

@kriss, the above DW tables are needed, see table at the top with summary of name changes.

The following DW tables are not going to be used:

UNUSED WAREHOUSE TABLE: fact.oda_donor
UNUSED WAREHOUSE TABLE: non_dac_country_deflator.2015_09_17
UNUSED WAREHOUSE TABLE: non_dac_country_deflator.2015_09_17_pivoted
UNUSED WAREHOUSE TABLE: non_dac_country_deflator.2016_01_20
UNUSED WAREHOUSE TABLE: non_dac_country_deflator.2016_01_20_pivoted
UNUSED WAREHOUSE TABLE: public.di_concept_in_ddw
UNUSED WAREHOUSE TABLE: public.di_concept_in_dh
UNUSED WAREHOUSE TABLE: public.individual_wb_wdi_series_in_di_dh
UNUSED WAREHOUSE TABLE: series.2015_09_17
UNUSED WAREHOUSE TABLE: series.2016_01_20

@kriss, that's right, the above DW tables are not needed for your purposes.

The following DP csv files are not going to be replaced with DW data

MISSING FROM WAREHOUSE: country-year/adult-literacy
MISSING FROM WAREHOUSE: country-year/domestic-netlending
MISSING FROM WAREHOUSE: country-year/education-pc-transferred-oda
MISSING FROM WAREHOUSE: country-year/employment-agriculture
MISSING FROM WAREHOUSE: country-year/employment-by-sector
MISSING FROM WAREHOUSE: country-year/employment-industry
MISSING FROM WAREHOUSE: country-year/employment-services
MISSING FROM WAREHOUSE: country-year/gdp-current-ncu-fy
MISSING FROM WAREHOUSE: country-year/gdp-growth
MISSING FROM WAREHOUSE: country-year/gdp-pc-usd-2005
MISSING FROM WAREHOUSE: country-year/gdp-pc-usd-current
MISSING FROM WAREHOUSE: country-year/gdp-usd-2005
MISSING FROM WAREHOUSE: country-year/gdp-usd-2012
MISSING FROM WAREHOUSE: country-year/gni-usd-2005
MISSING FROM WAREHOUSE: country-year/govtspend-USD
MISSING FROM WAREHOUSE: country-year/health-pc-transferred-oda
MISSING FROM WAREHOUSE: country-year/income-share-top-10pc
MISSING FROM WAREHOUSE: country-year/infant-mortality
MISSING FROM WAREHOUSE: country-year/in-oda-and-repayments
MISSING FROM WAREHOUSE: country-year/in-oof-and-repayments
MISSING FROM WAREHOUSE: country-year/in-oof-net
MISSING FROM WAREHOUSE: country-year/intl-flows-donors-wide
MISSING FROM WAREHOUSE: country-year/intl-flows-recipients-wide
MISSING FROM WAREHOUSE: country-year/kenya-electricity-avg
MISSING FROM WAREHOUSE: country-year/kenya-electricity-rank
MISSING FROM WAREHOUSE: country-year/kenya-improved-sani-avg
MISSING FROM WAREHOUSE: country-year/kenya-improved-sani-rank
MISSING FROM WAREHOUSE: country-year/kenya-improved-water-avg
MISSING FROM WAREHOUSE: country-year/kenya-improved-water-rank
MISSING FROM WAREHOUSE: country-year/kenya-paved-roads-avg
MISSING FROM WAREHOUSE: country-year/kenya-paved-roads-rank
MISSING FROM WAREHOUSE: country-year/kenya-pov-avg
MISSING FROM WAREHOUSE: country-year/kenya-pov-rank
MISSING FROM WAREHOUSE: country-year/kenya-urban-avg
MISSING FROM WAREHOUSE: country-year/kenya-urban-rank
MISSING FROM WAREHOUSE: country-year/long-term-debt
MISSING FROM WAREHOUSE: country-year/mean-years-of-schooling

@kriss, that's right. My understanding is that these data series/.csv files are not even used in the DH so do not need to be replaced with DW data. @timstrawson, can you please confirm?

The following DP csv files are not going to be replaced with DW data

MISSING FROM WAREHOUSE: country-year/oda-donor/oda-AE
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-afdb
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-afdf
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-afesd
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-arab-fund-afesd
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-asdb-special-funds
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-AT
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-AU
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-badea
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-BE
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-CA
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-CH
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-CZ
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-DE
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-DK
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-ebrd
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-EE
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-ES
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-EU
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-FI
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-FR
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-gavi
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-GB
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-gef
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-global-fund
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-GR
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-ibrd
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-ida
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-idb-specialfund
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-IE
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-ifad
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-imf
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-imf-concessional-trust-fund
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-IS
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-islamic-dev-bank
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-IT
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-JP
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-KR
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-KW
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-LU
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-NL
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-NO
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-nordic-dev-fund
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-NZ
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-ofid
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-osce
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-PL
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-PT
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-SE
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-SI
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-SK
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-unaids
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-undp
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-unece
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-unfpa
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-unhcr
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-unicef
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-unpbf
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-unrwa
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-US
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-wfp
MISSING FROM WAREHOUSE: country-year/oda-donor/oda-who

@kriss, these do need to be replaced with DW data. The data for these is in fact.oda_donor_2012. To get at the individual donor data, we need to filter on the di_id for the relevant years, so:

country-year/oda-donor/oda-who = SELECT * FROM fact.oda_donor_2012 WHERE from_di_id = 'who' AND year BETWEEN 2006 AND 2014
country-year/oda-donor/oda-US = SELECT * FROM fact.oda_donor_2012 WHERE from_di_id = 'US' AND year BETWEEN 2006 AND 2014
etc.
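The per-donor pattern above can be generated rather than written out for every file. The following is a rough sketch, assuming the donor id is whatever follows `oda-` in the CSV path and that the query goes to a driver using `%s` placeholders (as psycopg2 does). The function name and the default year range are illustrative assumptions taken from the two example queries.

```python
# Hypothetical helper; the name and defaults are assumptions for illustration.
ODA_DONOR_PREFIX = "country-year/oda-donor/oda-"


def donor_query(csv_path, first_year=2006, last_year=2014):
    """Build a parameterised SELECT for one oda-donor CSV path."""
    if not csv_path.startswith(ODA_DONOR_PREFIX):
        raise ValueError("not an oda-donor path: " + csv_path)

    # The donor id is the rest of the path, minus an optional .csv suffix.
    donor_id = csv_path[len(ODA_DONOR_PREFIX):]
    if donor_id.endswith(".csv"):
        donor_id = donor_id[:-4]

    sql = (
        "SELECT * FROM fact.oda_donor_2012 "
        "WHERE from_di_id = %s AND year BETWEEN %s AND %s"
    )
    return sql, (donor_id, first_year, last_year)
```

Passing the donor id as a parameter, rather than pasting it into the SQL string, keeps the mixed-case ids (`US`, `who`) intact and avoids quoting problems.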

@xriss, watch out for:

Delete:

MISSING FROM WAREHOUSE: country-year/oda-donor/oda-afesd

Use:

MISSING FROM WAREHOUSE: country-year/oda-donor/oda-arab-fund-afesd

instead.

@xriss, more to watch out for:

More info about this here: https://github.com/devinit/digital-platform/issues/243

The following DP csv files are not going to be replaced with DW data

MISSING FROM WAREHOUSE: country-year/out-oda-and-repayments
MISSING FROM WAREHOUSE: country-year/out-oda-gross
MISSING FROM WAREHOUSE: country-year/out-oof-and-repayments
MISSING FROM WAREHOUSE: country-year/out-oof-gross

@kriss, that's right. These do not need to be replaced with DW data.

The following DP csv files are not going to be replaced with DW data

MISSING FROM WAREHOUSE: country-year/poorest20pct-percentages

@kriss, this one does need to be replaced with DW data, but is genuinely missing from DW. I will correct this & let you & @notshi know when I have.

The following DP csv files are not going to be replaced with DW data

MISSING FROM WAREHOUSE: country-year/population-0-14
MISSING FROM WAREHOUSE: country-year/population-15-64
MISSING FROM WAREHOUSE: country-year/population-65-

@kriss, these do need to be replaced with DW data, see table at the top with summary of name changes.

MISSING FROM WAREHOUSE: country-year/poverty-gap-125
MISSING FROM WAREHOUSE: country-year/poverty-gap-2
MISSING FROM WAREHOUSE: country-year/primary-school-enrolment
MISSING FROM WAREHOUSE: country-year/taxrev-pctGDP
MISSING FROM WAREHOUSE: country-year/total-employment
MISSING FROM WAREHOUSE: country-year/under-5-mortality
MISSING FROM WAREHOUSE: country-year/university-college-enrolment
MISSING FROM WAREHOUSE: country-year/youth-literacy
MISSING FROM WAREHOUSE: country-year/youth-unemployment

@kriss, that's right. These do not need to be replaced with DW data.

The following DP tables/selects will be exported to these DW csv files

country-year/agricultural-census.csv <- data_series."agricultural_census"
country-year/avg-income-of-extreme-poor.csv <- data_series."avg_income_of_extreme_poor"
country-year/civil-reg-births.csv <- data_series."civil_reg_births"
country-year/civil-reg-deaths.csv <- data_series."civil_reg_deaths"
country-year/climate-vulnerability.csv <- data_series."climate_vulnerability"
country-year/dac-oda-percent-gni.csv <- data_series."dac_oda_percent_gni"
country-year/dac-oda-to-ldcs-pc-gni.csv <- data_series."dac_oda_to_ldcs_pc_gni"
country-year/depth-of-extreme-poverty.csv <- data_series."depth_of_extreme_poverty"
country-year/dev-coop-in-detail.csv <- data_series."dev_coop_in_detail"
country-year/dfis-out.csv <- data_series."dfis_out"
country-year/dfis-out-dev.csv <- data_series."dfis_out_dev"
country-year/domestic.csv <- data_series."domestic"
country-year/domestic-sectors.csv <- data_series."domestic_sectors"
country-year/educ-mis.csv <- data_series."educ_mis"
country-year/evi.csv <- data_series."evi"
country-year/fdi-out.csv <- data_series."fdi_out"
country-year/fdi-pp.csv <- data_series."fdi_pp"
country-year/fragile-states.csv <- data_series."fragile_states"
country-year/general-gov-health-exp.csv <- data_series."general_gov_health_exp"
country-year/gov-revenue-pc-gdp.csv <- data_series."gov_revenue_pc_gdp"
country-year/govtspend-pc.csv <- data_series."govtspend_pc"
country-year/grants-pct-totalrevenue.csv <- data_series."grants_pct_totalrevenue"
country-year/health-mis.csv <- data_series."health_mis"
country-year/human-hazard.csv <- data_series."human_hazard"
country-year/in-ha.csv <- data_series."in_ha"
country-year/in-oda-gross.csv <- data_series."in_oda_gross"
country-year/in-oda-net.csv <- data_series."in_oda_net"
country-year/in-oof-gross.csv <- data_series."in_oof_gross"
country-year/intl-flows-donors.csv <- data_series."intl_flows_donors"
country-year/intl-flows-recipients.csv <- data_series."intl_flows_recipients"
country-year/intlresources-total.csv <- data_series."intlresources_total"
country-year/kenya-births-pc-skilled.csv <- data_series."kenya_births_pc_skilled"
country-year/kenya-electricity.csv <- data_series."kenya_electricity"
country-year/kenya-fertility-rate.csv <- data_series."kenya_fertility_rate"
country-year/kenya-improved-sani.csv <- data_series."kenya_improved_sani"
country-year/kenya-improved-water.csv <- data_series."kenya_improved_water"
country-year/kenya-paved-roads.csv <- data_series."kenya_paved_roads"
country-year/kenya-pc-female-know-hiv.csv <- data_series."kenya_pc_female_know_hiv"
country-year/kenya-pc-female-tested-hiv.csv <- data_series."kenya_pc_female_tested_hiv"
country-year/kenya-pc-house-malaria-nets.csv <- data_series."kenya_pc_house_malaria_nets"
country-year/kenya-pc-male-know-hiv.csv <- data_series."kenya_pc_male_know_hiv"
country-year/kenya-pc-male-tested-hiv.csv <- data_series."kenya_pc_male_tested_hiv"
country-year/kenya-pc-modern-contra.csv <- data_series."kenya_pc_modern_contra"
country-year/kenya-pc-no-contra.csv <- data_series."kenya_pc_no_contra"
country-year/kenya-pc-trad-contra.csv <- data_series."kenya_pc_trad_contra"
country-year/kenya-pop-female.csv <- data_series."kenya_pop_female"
country-year/kenya-pop-male.csv <- data_series."kenya_pop_male"
country-year/kenya-pop-pc-female.csv <- data_series."kenya_pop_pc_female"
country-year/kenya-pop-pc-male.csv <- data_series."kenya_pop_pc_male"
country-year/kenya-pop-total.csv <- data_series."kenya_pop_total"
country-year/kenya-pov-gap.csv <- data_series."kenya_pov_gap"
country-year/kenya-rural-pop.csv <- data_series."kenya_rural_pop"
country-year/kenya-treat-child-diarr.csv <- data_series."kenya_treat_child_diarr"
country-year/kenya-treat-child-respir.csv <- data_series."kenya_treat_child_respir"
country-year/kenya-urban-pop.csv <- data_series."kenya_urban_pop"
country-year/kenya-weight-below-3sd.csv <- data_series."kenya_weight_below_3sd"
country-year/largest-intl-flow.csv <- data_series."largest_intl_flow"
country-year/latest-census.csv <- data_series."latest_census"
country-year/latest-hh-survey.csv <- data_series."latest_hh_survey"
country-year/long-debt-disbursement-in.csv <- data_series."long_debt_disbursement_in"
country-year/long-debt-net-official-in.csv <- data_series."long_debt_net_official_in"
country-year/natural-hazard.csv <- data_series."natural_hazard"
country-year/non-grant-revenue-PPP-capita.csv <- data_series."non_grant_revenue_ppp_capita"
country-year/number-of-surveys.csv <- data_series."number_of_surveys"
country-year/oda-capital-repayments.csv <- data_series."oda_capital_repayments"
country-year/oda-interest-payments.csv <- data_series."oda_interest_payments"
country-year/oda-per-poor-person.csv <- data_series."oda_per_poor_person"
country-year/oof.csv <- data_series."oof"
country-year/out-dac-oda-net.csv <- data_series."out_dac_oda_net"
country-year/out-oof-net.csv <- data_series."out_oof_net"
country-year/out-ssc-net.csv <- data_series."out_ssc_net"
country-year/poorest20pct.csv <- data_series."poorest20pct"
country-year/poor-people.csv <- data_series."poor_people"
country-year/poverty-125.csv <- data_series."poverty_125"
country-year/poverty-200.csv <- data_series."poverty_200"
country-year/profits-pct-fdi.csv <- data_series."profits_pct_fdi"
country-year/remittances.csv <- data_series."remittances"
country-year/rems-pp.csv <- data_series."rems_pp"
country-year/ssc-out.csv <- data_series."ssc_out"
country-year/ssc-percent-gni.csv <- data_series."ssc_percent_gni"
country-year/stat-capacity.csv <- data_series."stat_capacity"
country-year/total-revenue-pct-GDP.csv <- data_series."total_revenue_pct_gdp"
country-year/total-revenue-PPP-capita.csv <- data_series."total_revenue_ppp_capita"
country-year/uganda-agri-percent.csv <- data_series."uganda_agri_percent"
country-year/uganda-anc4-coverage.csv <- data_series."uganda_anc4_coverage"
country-year/uganda-avg-house-size.csv <- data_series."uganda_avg_house_size"
country-year/uganda-central-resources.csv <- data_series."uganda_central_resources"
country-year/uganda-dependency-ratio.csv <- data_series."uganda_dependency_ratio"
country-year/uganda-deprivation-living.csv <- data_series."uganda_deprivation_living"
country-year/uganda-donor-educ-spend.csv <- data_series."uganda_donor_educ_spend"
country-year/uganda-donor-percent.csv <- data_series."uganda_donor_percent"
country-year/uganda-donor-resources.csv <- data_series."uganda_donor_resources"
country-year/uganda-dpt3-coverage.csv <- data_series."uganda_dpt3_coverage"
country-year/uganda-educ-percent.csv <- data_series."uganda_educ_percent"
country-year/uganda-finance.csv <- data_series."uganda_finance"
country-year/uganda-gov-spend-pp.csv <- data_series."uganda_gov_spend_pp"
country-year/uganda-health-funding.csv <- data_series."uganda_health_funding"
country-year/uganda-health-percent.csv <- data_series."uganda_health_percent"
country-year/uganda-health-posts.csv <- data_series."uganda_health_posts"
country-year/uganda-hmis.csv <- data_series."uganda_hmis"
country-year/uganda-household-san-cov.csv <- data_series."uganda_household_san_cov"
country-year/uganda-igf-resources.csv <- data_series."uganda_igf_resources"
country-year/uganda-ipt2-coverage.csv <- data_series."uganda_ipt2_coverage"
country-year/uganda-leaving-exam-perf-rate.csv <- data_series."uganda_leaving_exam_perf_rate"
country-year/uganda-life-expectancy.csv <- data_series."uganda_life_expectancy"
country-year/uganda-local-percent.csv <- data_series."uganda_local_percent"
country-year/uganda-overall-health.csv <- data_series."uganda_overall_health"
country-year/uganda-pop-dens.csv <- data_series."uganda_pop_dens"
country-year/uganda-poverty-headcount.csv <- data_series."uganda_poverty_headcount"
country-year/uganda-primary-educ-funding.csv <- data_series."uganda_primary_educ_funding"
country-year/uganda-primary-enrol.csv <- data_series."uganda_primary_enrol"
country-year/uganda-primary-sit-write.csv <- data_series."uganda_primary_sit_write"
country-year/uganda-primary-sit-write-gov.csv <- data_series."uganda_primary_sit_write_gov"
country-year/uganda-primary-stu-teach-ratio.csv <- data_series."uganda_primary_stu_teach_ratio"
country-year/uganda-primary-stu-teach-ratio-gov.csv <- data_series."uganda_primary_stu_teach_ratio_gov"
country-year/uganda-rural-safe-water.csv <- data_series."uganda_rural_safe_water"
country-year/uganda-rural-water-func.csv <- data_series."uganda_rural_water_func"
country-year/uganda-secondary-enrol.csv <- data_series."uganda_secondary_enrol"
country-year/uganda-secondary-sit-write.csv <- data_series."uganda_secondary_sit_write"
country-year/uganda-secondary-sit-write-gov.csv <- data_series."uganda_secondary_sit_write_gov"
country-year/uganda-secondary-stu-teach-ratio.csv <- data_series."uganda_secondary_stu_teach_ratio"
country-year/uganda-secondary-stu-teach-ratio-gov.csv <- data_series."uganda_secondary_stu_teach_ratio_gov"
country-year/uganda-tb-success.csv <- data_series."uganda_tb_success"
country-year/uganda-total-pop.csv <- data_series."uganda_total_pop"
country-year/uganda-urban-pop.csv <- data_series."uganda_urban_pop"
country-year/uganda-urban-rural-pop.csv <- data_series."uganda_urban_rural_pop"
country-year/uganda-wash-perf-score.csv <- data_series."uganda_wash_perf_score"
country-year/uganda-water-source-comm-func.csv <- data_series."uganda_water_source_comm_func"

@xriss, yes, that's right.

country-year/gdp-usd-current.csv <- fact."gdp_usd_current"
country-year/gni-pc-usd-current.csv <- fact."gni_pc_usd_current"
country-year/gni-usd-current.csv <- fact."gni_usd_current"
country-year/income-share-bottom-20pc.csv <- fact."income_share_bottom_20pc"
country-year/income-share-by-quintile.csv <- fact."income_share_by_quintile"
country-year/life-expectancy-at-birth.csv <- fact."life_expectancy_at_birth"
country-year/maternal-mortality.csv <- fact."maternal_mortality"

@xriss, yes, that's right.

country-year/oda.csv <- fact."oda"

@xriss, no, do (filtering on year BETWEEN 2006 AND 2014):

country-year/oda.csv <- fact."oda_2012"

country-year/population-by-age.csv <- fact."population_by_age"
country-year/population-rural.csv <- fact."population_rural"
country-year/population-rural-urban.csv <- fact."population_rural_urban"
country-year/population-total.csv <- fact."population_total"
country-year/population-urban.csv <- fact."population_urban"

@xriss, yes, that's right.

country-year/oda-donor/oda-[any id_from].csv <- fact."oda_donor_2012", id_from -> from_di_id
country-year/population-0-14.csv <- fact."population_by_age_0_14"
country-year/population-15-64.csv <- fact."population_by_age_15_64"
country-year/population-65-.csv <- fact."population_by_age_65_and_above"

@xriss, we need these too, see table at the top with summary of name changes.
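For checking a plan like the one above mechanically, the `csv <- schema."table"` lines can be parsed into tuples. This is only a sketch, assuming each plan line has exactly that shape. The regular expression and the function name are assumptions, and lines that do not fit (replies, or the `oda-[any id_from]` template line, which contains a space) simply return `None`.

```python
import re

# Hypothetical parser for plan lines of the form:
#   country-year/oda.csv <- fact."oda_2012"
PLAN_LINE = re.compile(
    r'^(?P<csv>\S+\.csv) <- (?P<schema>\w+)\."(?P<table>\w+)"'
)


def parse_plan_line(line):
    """Return (csv_path, schema, table) for a plan line, else None."""
    match = PLAN_LINE.match(line.strip())
    if match is None:
        return None
    return match.group("csv"), match.group("schema"), match.group("table")
```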

dw8547 commented 8 years ago

@xriss, I've got 'https://github.com/devinit/digital-platform/blob/master/country-year/poorest20pct-percentages.csv' as not being used in the DH:

ddw_development=# SELECT
ddw_development-# *
ddw_development-# FROM public.di_concept_in_dh
ddw_development-# WHERE id = 'poorest20pct-percentages';
 concept_id |    series    |            id            | in_dh 
------------+--------------+--------------------------+-------
        218 | country-year | poorest20pct-percentages |     0
(1 row)

@akmiller01 helped me figure out which files were/are actually being used in the DH here: https://github.com/devinit/ddw-data/issues/124. As this one came out as in_dh = 0 I didn't import the table.

@timstrawson, can you help please? Do we leave 'poorest20pct-percentages.csv' out? Do I need to create a table for it?

dw8547 commented 8 years ago

@xriss, you were right about

UNUSED WAREHOUSE TABLE: fact.gdp_usd_current_2012

Don't use this one. I'm going to update the comment above to correct this.

This is right:

country-year/gdp-usd-current.csv <- fact."gdp_usd_current"

dw8547 commented 8 years ago

@xriss, just to confirm, this is right:

country-year/gdp-usd-current.csv <- fact."gdp_usd_current"
country-year/gni-pc-usd-current.csv <- fact."gni_pc_usd_current"
country-year/gni-usd-current.csv <- fact."gni_usd_current"

and this is also right:

UNUSED WAREHOUSE TABLE: fact.gdp_usd_current_2012
UNUSED WAREHOUSE TABLE: fact.gni_pc_usd_current_2012
UNUSED WAREHOUSE TABLE: fact.gni_usd_current_2012

I got it wrong earlier today so disregard me trying to mislead you on these. They are UNUSED. I've corrected the information in the above comment.

timstrawson commented 8 years ago

Hi @dw8547 as discussed just now, I can confirm that we don't need to copy the unused tables across to the data warehouse.

@kriss, @timstrawson has confirmed that the data series/.csv files below are not even used in the DH so do not need to be replaced with DW data.

The following DP csv files are not going to be replaced with DW data

MISSING FROM WAREHOUSE: country-year/adult-literacy
MISSING FROM WAREHOUSE: country-year/domestic-netlending
MISSING FROM WAREHOUSE: country-year/education-pc-transferred-oda
MISSING FROM WAREHOUSE: country-year/employment-agriculture
MISSING FROM WAREHOUSE: country-year/employment-by-sector
MISSING FROM WAREHOUSE: country-year/employment-industry
MISSING FROM WAREHOUSE: country-year/employment-services
MISSING FROM WAREHOUSE: country-year/gdp-current-ncu-fy
MISSING FROM WAREHOUSE: country-year/gdp-growth
MISSING FROM WAREHOUSE: country-year/gdp-pc-usd-2005
MISSING FROM WAREHOUSE: country-year/gdp-pc-usd-current
MISSING FROM WAREHOUSE: country-year/gdp-usd-2005
MISSING FROM WAREHOUSE: country-year/gdp-usd-2012
MISSING FROM WAREHOUSE: country-year/gni-usd-2005
MISSING FROM WAREHOUSE: country-year/govtspend-USD
MISSING FROM WAREHOUSE: country-year/health-pc-transferred-oda
MISSING FROM WAREHOUSE: country-year/income-share-top-10pc
MISSING FROM WAREHOUSE: country-year/infant-mortality
MISSING FROM WAREHOUSE: country-year/in-oda-and-repayments
MISSING FROM WAREHOUSE: country-year/in-oof-and-repayments
MISSING FROM WAREHOUSE: country-year/in-oof-net
MISSING FROM WAREHOUSE: country-year/intl-flows-donors-wide
MISSING FROM WAREHOUSE: country-year/intl-flows-recipients-wide
MISSING FROM WAREHOUSE: country-year/kenya-electricity-avg
MISSING FROM WAREHOUSE: country-year/kenya-electricity-rank
MISSING FROM WAREHOUSE: country-year/kenya-improved-sani-avg
MISSING FROM WAREHOUSE: country-year/kenya-improved-sani-rank
MISSING FROM WAREHOUSE: country-year/kenya-improved-water-avg
MISSING FROM WAREHOUSE: country-year/kenya-improved-water-rank
MISSING FROM WAREHOUSE: country-year/kenya-paved-roads-avg
MISSING FROM WAREHOUSE: country-year/kenya-paved-roads-rank
MISSING FROM WAREHOUSE: country-year/kenya-pov-avg
MISSING FROM WAREHOUSE: country-year/kenya-pov-rank
MISSING FROM WAREHOUSE: country-year/kenya-urban-avg
MISSING FROM WAREHOUSE: country-year/kenya-urban-rank
MISSING FROM WAREHOUSE: country-year/long-term-debt
MISSING FROM WAREHOUSE: country-year/mean-years-of-schooling

dw8547 commented 7 years ago

Dead issue.