USAID-OHA-SI / TeamTracking

Project Management Tracker for the SI Team
0 stars 0 forks source link

South Sudan HTS pivots and DP calculation checks #111

Closed achafetz closed 5 years ago

achafetz commented 5 years ago

@mmdessie1 asked for assistance on two items:

achafetz commented 5 years ago

Ran the three tables it the data pack provided (v15)

#dependencies
library(tidyverse)
library(tameDP)

#import
  df_ssd <- tame_dp("../Downloads/V15_DataPack_South Sudan_06_03-2019_10pm.xlsx")

#HTS dataset
  df_hts <- df_ssd %>% 
    filter(indicator == "HTS_TST",
           disaggregate == "Age/Sex/Result") 

#function
tab_hts <- function(df, ...){

  group_vars <- quos(...)

  df %>% 
    mutate(resultstatus = case_when(resultstatus == "Negative" ~ "HTS_TST_NEG",
                                    resultstatus == "Positive" ~ "HTS_TST_POS")) %>% 
    group_by_at(vars(!!!group_vars, resultstatus)) %>% 
    summarise(fy2020_targets = sum(fy2020_targets)) %>% 
    spread(resultstatus, fy2020_targets, fill = 0) %>% 
    mutate(HTS_TST = HTS_TST_NEG + HTS_TST_POS,
            Positivity = HTS_TST_POS / HTS_TST)%>% 
    select_at(vars(!!!group_vars, HTS_TST, HTS_TST_POS, Positivity)) %>% 
    arrange(desc(HTS_TST))
}

#tables
tab_hts(df_hts, modality)
tab_hts(df_hts, mechanismid, primepartner, implementingmechanismname)
tab_hts(df_hts, psnu)
achafetz commented 5 years ago

Went through a thorough formula check comparing the original OGAC data pack to the current version. No major issues. I copied formulas down (in all non-assumption columns) and compared the totals.

On the HTS tab:

The only formula that is different that the original Data Pack sent by OGAC is HTS_INDEX: FACILITY - Contacts Tested, New Negative (X). The difference in the formula is just the correction you all made given the formula error in the original.

I guess the other issue you have on this tab is under military. I didn't intentially check military since a lot is copy and pasted, but I noticed that HTS: Total Tests to perform (BT) for the military lines is actually looking up an old version (v11). It's not advisable to have look ups to other Excel files and I would suggest fixing it.

One thing to be wary of it you are getting an error in cell BP1445. The issue is that you have 1 positive and 0 negatives and Excel isn't liking calulating the a yield of 0% as the denom. Not sure if this is a major issue for import, but wanted to flag it.

For TX_PVLS on the TX tab:

The only formula difference between this version and the original is that the formula for Lapa/Lafton County zeros out the targets for TX_PVLS (D). This removed 233 targets. I assume this is intentional, but wanted to note it.

achafetz commented 5 years ago

Alex Bolo requested to rerun the HTS tables with the update of the data pack.

achafetz commented 5 years ago

@mmdessie1 and CDC colleague asked to re-run again

achafetz commented 5 years ago

Same file name as the last check. Everything matched up. Checked with Melaku and Alex who said "Thanks, yes it is actually the same approved data pack."

achafetz commented 5 years ago

Reopening again due to revised planning letter numbers. Will re-run for Melaku now.

achafetz commented 5 years ago

Updated the spreadsheet via the code above and sent back to Melaku.

achafetz commented 5 years ago

Reopening. Melaku wants HTS_POS, TX_NEW and TX_CURR targets by IM and PSNU. Will send these tables and send him the tameDP output

achafetz commented 5 years ago

Done.