Drexel-UHC / download-scripts

NDI #2

Open reynegad opened 1 year ago

reynegad commented 1 year ago

I ran the NDI code and the result gave me 6 identical files. Each file contains 6 rows for each function we need. Would it be possible to separate them in the R code, or do I have to clean the data manually?

ran-codes commented 1 year ago

@reynegad thanks for bringing this up 😄

I will take a look on Thursday and get back to you!

ran-codes commented 1 year ago

@reynegad can you clarify your issue a bit? I responded below, but I don't completely understand your question. Could you share a screenshot of the issue?

I ran the NDI code and the result gave me 6 identical files.

When I run it I see six different files (see file sizes). Can you clarify whether this is what you mean?

Each file contains 6 rows for each function we need.

In the results I see many rows in each file, apparently one row for every census tract. I may have misunderstood your question. Can you let me know if this is what you meant?

[image]

reynegad commented 1 year ago

My result is different from yours. I probably need to rerun the code. If I have the same issue, I will get back to you.

reynegad commented 1 year ago

Hi Ran, I ran the code but got the same result as before. I will attach the result, and here is the code I am using. Do you see any problem in it?

# 0. Setup ----------------------------------------------------------------

{

  ## Dependencies
  library(tidyverse)
  library(rstudioapi)
  library(datasets)
  library(ndi)
  library(bit)

  ## Directory management
  setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
}

# 1. Pull metrics --------------------------------------------

## 1.1 template ------------------------------------------------------------

{
  metrics = c('gini', 'messer', 'krieger', 'powell_wiley', 'anthopolos', 'bravo')

  template = expand_grid(metric = metrics, state = state.abb, year = 2019) %>%
    mutate(uhc_id = paste(state, year, metric, sep = '-'),
           row = row_number())
}

## 1.2. Error robust pull function --------------------------------------------

politely_get_ndi_basic = function(template_row){

message_tmp = paste("pull uhc_id:", template_row$uhc_id)

tryCatch(

expr = {                      
  message(paste("Starting",message_tmp))

  if (template_row$metric%in%c('bravo')){
    ndi_output = get(template_row$metric)(
      state = template_row$state, 
      year = template_row$year,
      subgroup = c("LtHS", "HSGiE", "SCoAD"))
  } else if (template_row$metric%in%c('anthopolos')){
    ndi_output = get(template_row$metric)(
      state = template_row$state, 
      year = template_row$year,
      subgroup = "NHoLB")
  } else {
    ndi_output = get(template_row$metric)(
      state = template_row$state, 
      year = template_row$year)
  }

  output = ndi_output[[1]] %>% mutate(year = template_row$year)

  # output %>% write_csv(paste0("download/",template_row$uhc_id,".csv"))

  message(paste("Sucessful",message_tmp))
},

error = function(e){
  output = template_row %>% mutate(status = "error")
  message(e)
  message("\n !!!!!!!!!!!! Error")
},

warning = function(w){       
  output = template_row %>% mutate(status = "warning")
  message(w)
  message("\n ####There was a warning")
},

finally = {}

)

return(output)
}

## 1.3 Iterate basic pulls ---------------------

#' Note we may need to repeat some of the timed out pull id's... see next section

pull_ndi_metric = function(metric){
template %>% filter(metric == metric) %>%

sample_n(2) %>%

group_by(row) %>% 
group_modify(~politely_get_ndi_basic(.x)) %>% 
ungroup()

}

results = metrics %>% map(~pull_ndi_metric(.x)) %>% set_names(metrics)

## 1.4 Write datasets ------------------------------------

map2(results, metrics, function(result, metric){
  file_name = paste0("download/",metric,".csv")
  result %>% write_csv(file_name)
})

[image]
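
A possible explanation for the identical files, offered as a hedged sketch rather than a confirmed diagnosis: in `pull_ndi_metric`, the function argument `metric` has the same name as the `metric` column in `template`. Inside `dplyr::filter()`, column names take precedence over variables in the calling environment, so `filter(metric == metric)` compares the column with itself and keeps every row. Each of the six results would then hold the rows for all metrics. The toy example below reproduces that behavior on a small made-up template. The names `filter_masked`, `filter_by_metric`, and `target_metric` are hypothetical, and the two-state template is for illustration only.

```r
library(dplyr)
library(tidyr)

# Small illustrative template (assumption: two metrics, two states)
metrics <- c("gini", "messer")
template <- expand_grid(metric = metrics, state = c("PA", "NJ"), year = 2019)

# The argument shares its name with the column. Inside filter(), `metric`
# resolves to the column on both sides, so the condition is TRUE for every row.
filter_masked <- function(metric) {
  template %>% filter(metric == metric)
}

# Hypothetical fix: give the argument a name that is not a column name.
filter_by_metric <- function(target_metric) {
  template %>% filter(metric == target_metric)
}

n_masked <- nrow(filter_masked("gini"))     # keeps all rows of the template
n_fixed  <- nrow(filter_by_metric("gini"))  # keeps only the "gini" rows
```

If this is the cause, renaming the argument in `pull_ndi_metric` in the same way should yield one distinct data set per metric.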