ropensci / rdhs

API Client and Data Munging for the Demographic and Health Survey Data
https://docs.ropensci.org/rdhs
Other

Error in names(filedatatypelist_DHS) <- paste0("filedatatypelist_", qdapRegex::rm_between(filedatatypelist_DHS_line, : 'names' attribute [16] must be the same length as the vector [1] #131

Closed chilochibi closed 2 years ago

chilochibi commented 2 years ago

Hello, I have seen that this issue has been addressed, but I still cannot download the datasets. I have access to two projects on the DHS site, yet I am getting the error below. I followed the step-by-step example at the link below. https://www.rdocumentation.org/packages/rdhs/versions/0.7.3

Error in names(filedatatypelist_DHS) <- paste0("filedatatypelist_", qdapRegex::rm_between(filedatatypelist_DHS_line,  : 
  'names' attribute [16] must be the same length as the vector [1]

jeffeaton commented 2 years ago

Thanks for reporting this. I cannot tell from the above what you are trying to do or what commands are giving rise to the error.

Can you please post the code that you are trying to run that gives rise to the error and the full console output?

Thanks, Jeff

chilochibi commented 2 years ago

Thanks @jeffeaton for the response. Below is the console output. I am trying to download full DHS datasets using the get_datasets function.

> library(rdhs)
> sc <- dhs_survey_characteristics()
> sc[grepl("Malaria", sc$SurveyCharacteristicName), ]
   SurveyCharacteristicID        SurveyCharacteristicName
71                     96                     Malaria DBS
72                     90              Malaria microscopy
73                    124              Malaria microscopy
74                    119 Malaria microscopy - thin smear
75                     57               Malaria questions
76                     89                     Malaria RDT
> ids <- dhs_countries(returnFields=c("CountryName", "DHS_CountryCode"))
> survs <- dhs_surveys(surveyCharacteristicIds = 89, countryIds = c("CD","TZ"), surveyYearStart = 2013)
> datasets <- dhs_datasets(surveyIds = survs$SurveyId, fileFormat = "FL", fileType = "PR")
> str(datasets)
'data.frame': 3 obs. of 13 variables:
 $ FileFormat          : chr "Flat ASCII data (.dat)" "Flat ASCII data (.dat)" "Flat ASCII data (.dat)"
 $ FileSize            : int 6595349 6491292 2171918
 $ DatasetType         : chr "Survey Datasets" "Survey Datasets" "Survey Datasets"
 $ SurveyNum           : int 421 485 529
 $ SurveyId            : chr "CD2013DHS" "TZ2015DHS" "TZ2017MIS"
 $ FileType            : chr "Household Member Recode" "Household Member Recode" "Household Member Recode"
 $ FileDateLastModified: chr "September, 19 2016 09:58:23" "September, 28 2019 17:58:28" "June, 11 2019 15:38:22"
 $ SurveyType          : chr "DHS" "DHS" "MIS"
 $ SurveyYearLabel     : chr "2013-14" "2015-16" "2017"
 $ SurveyYear          : chr "2013" "2015" "2017"
 $ DHS_CountryCode     : chr "CD" "TZ" "TZ"
 $ FileName            : chr "CDPR61FL.ZIP" "TZPR7BFL.ZIP" "TZPR7IFL.ZIP"
 $ CountryName         : chr "Congo Democratic Republic" "Tanzania" "Tanzania"
> set_rdhs_config(email = "myemaill@gmail.com", project = "Net ownership by individual", config_path = "rdhs.json", cache_path = "project_one", password_prompt = TRUE, global = FALSE)
Writing your configuration to:
   -> rdhs.json
> microbenchmark::microbenchmark(dhs_surveys(surveyYear = 2015),times = 1)
Unit: milliseconds
                           expr    min     lq   mean median     uq    max neval
 dhs_surveys(surveyYear = 2015) 3.0491 3.0491 3.0491 3.0491 3.0491 3.0491     1
> microbenchmark::microbenchmark(dhs_surveys(surveyYear = 2015), times = 1)
Unit: milliseconds
                           expr    min     lq   mean median     uq    max neval
 dhs_surveys(surveyYear = 2015) 3.1685 3.1685 3.1685 3.1685 3.1685 3.1685     1
> downloads <- get_datasets(datasets$FileName)
Logging into DHS website...
Error in names(filedatatypelist_DHS) <- paste0("filedatatypelist_", qdapRegex::rm_between(filedatatypelist_DHS_line,  : 
  'names' attribute [16] must be the same length as the vector [1]

jeffeaton commented 2 years ago

Thanks very much. I am not familiar with that error unfortunately.

From reviewing the console output, it looks like you are trying to download three MIS datasets, correct? Could you try the following code to see if that gives you the same error?

library(rdhs)
datasets <- c("CDPR61FL.ZIP", "TZPR7BFL.ZIP", "TZPR7IFL.ZIP")
downloads <- get_datasets(datasets)

chilochibi commented 2 years ago

Yes, that's correct. I ran the code above but am still getting the same error. See the output below.

> library(rdhs)
> datasets <- c("CDPR61FL.ZIP", "TZPR7BFL.ZIP", "TZPR7IFL.ZIP")
> downloads <- get_datasets(datasets)
Logging into DHS website...
Error in names(filedatatypelist_DHS) <- paste0("filedatatypelist_", qdapRegex::rm_between(filedatatypelist_DHS_line,  : 
  'names' attribute [16] must be the same length as the vector [1]

jeffeaton commented 2 years ago

Thanks for checking. It looks like the function is failing while creating the list of available datasets that have been approved for your project: https://github.com/ropensci/rdhs/blob/c368bdbb1cbb62f227d43e1fc83e8b8250a5a0a5/R/authentication.R#L101-L110
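For readers unfamiliar with this message: base R raises it whenever `names<-` is given more names than the object has elements. The following is a minimal sketch that uses made-up stand-in values, not the actual rdhs internals, to show a length-1 list being given 16 names:

```r
# Stand-in for the parsed list of dataset types; here it has collapsed to a
# single element (hypothetical values, not what rdhs builds internally).
file_types <- list("all types in one element")

# Stand-in for the 16 names extracted from the DHS page.
new_names <- paste0("filedatatypelist_", seq_len(16))

# Assigning 16 names to a length-1 list fails; capture the message instead
# of stopping the script.
msg <- tryCatch(
  {
    names(file_types) <- new_names
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

Here `msg` should hold the same "'names' attribute [16] must be the same length as the vector [1]" text as the report, which suggests that one side of the assignment in `authentication.R` is coming back with an unexpected length.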

Are you able to download these data sets when you login via the DHS webpage? E.g. from here: https://dhsprogram.com/data/dataset/Tanzania_MIS_2017.cfm?flag=0

To make sure the authentication for your account and project is working, can you try the following and check that it returns a valid proj_id?

library(rdhs)
my_config <- get_rdhs_config()
rdhs:::authenticate_dhs(my_config)

This is what I get back:

> rdhs:::authenticate_dhs(my_config)
Logging into DHS website...
$user_name
[1] "jeffrey.eaton@imperial.ac.uk"

$user_pass
[1] "<REDACTED>"

$proj_id
[1] "75312"

(Make sure not to copy/paste the full output -- it contains your account login password)
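One way to avoid pasting the password by accident is to mask it before printing. This sketch assumes the returned value is a plain list with the `user_name`, `user_pass`, and `proj_id` elements shown above. `redact_auth()` is a hypothetical helper, not part of rdhs.

```r
# Hypothetical helper (not part of rdhs): mask the password in an
# authentication result before printing or sharing it.
redact_auth <- function(auth) {
  if (!is.null(auth$user_pass)) {
    auth$user_pass <- "<REDACTED>"
  }
  auth
}

# Example with a made-up list shaped like the output above.
fake_auth <- list(
  user_name = "someone@example.org",
  user_pass = "not-a-real-password",
  proj_id   = "00000"
)
safe_auth <- redact_auth(fake_auth)
str(safe_auth)

# With a real login this could be used as follows (requires DHS credentials):
# safe_auth <- redact_auth(rdhs:::authenticate_dhs(get_rdhs_config()))
```

Only `user_pass` is replaced, so the `proj_id` needed for debugging stays visible.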

chilochibi commented 2 years ago

Yes, I am able to download all DHS/MIS datasets manually. Below is what I get after checking authentication.

> my_config <- get_rdhs_config()
> rdhs:::authenticate_dhs(my_config)
Logging into DHS website...
$user_name
[1] "cchiziba@gmail.com"

$user_pass
[1] "REDACTED"

$proj_id
[1] "159623"

horaciochacon commented 2 years ago

I am having the exact same problem when trying to download datasets for which I have access:

> dhs_datasets <- get_datasets(dataset_filenames = "PKPR71FL.ZIP")
Logging into DHS website...
Error in names(filedatatypelist_DHS) <- paste0("filedatatypelist_", qdapRegex::rm_between(filedatatypelist_DHS_line,  : 
  'names' attribute [16] must be the same length as the vector [1]

jeffeaton commented 2 years ago

Thanks very much. I'm a bit stumped on this and not able to reproduce.

@OJWatson -- any ideas?

OJWatson commented 2 years ago

Hey @jeffeaton, @horaciochacon, @chilochibi,

Thanks all for the helpful debugging. I think this was due to a new version of qdapRegex on CRAN (which won't get picked up by the CRAN checks, as available_datasets requires a DHS login).

I have just merged a fix for this. Please try reinstalling rdhs v0.7.4 and let me know if this fixes it.

OJ

horaciochacon commented 2 years ago

Hi @OJWatson, this definitely solved the issue. Thanks!

OJWatson commented 2 years ago

Great to hear. Will close this then now.

mohankhanal19 commented 1 year ago

How do I download rdhs v0.7.4?
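A possible approach, sketched under the assumption that the fix is available from the ropensci/rdhs GitHub repository and that the remotes package is installed, is to reinstall from GitHub when the installed version is older than 0.7.4. `needs_update()` is a hypothetical helper written for this example.

```r
# Hypothetical helper: TRUE when `installed` is older than `required`.
needs_update <- function(installed, required = "0.7.4") {
  package_version(installed) < package_version(required)
}

needs_update("0.7.3")  # the version linked in the original report
needs_update("0.7.4")  # the version mentioned as containing the fix

# In an interactive session (needs network access and the remotes package):
# if (needs_update(as.character(utils::packageVersion("rdhs")))) {
#   remotes::install_github("ropensci/rdhs")
# }
```

After reinstalling, restart R and confirm the result with `packageVersion("rdhs")` before calling `get_datasets()` again.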