Closed timcdlucas closed 5 years ago
Fork is here. https://github.com/OJWatson/malariaAtlas/tree/master
Hi @OJWatson,
I'm trying to get this to work and failing.
The line `geo <- get_datasets(dats)` gives me
Logging into DHS website...
Error in names(filedatatypelist_DHS) <- paste0("filedatatypelist_", qdapRegex::rm_between(filedatatypelist_DHS_line, :
'names' attribute [1] must be the same length as the vector [0]
The only thing I could think was that perhaps it should have been `geo <- get_datasets(dats$FileName)`, but that gave me the same error.
I logged in using my own email and project name. It seemed to work.
I started digging to work out what's wrong but it quickly got deep into stuff I had no idea about. Any ideas what's wrong?
Thanks in advance.
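Hmm okay, so it seems to be erroring at the stage where `rdhs` goes to the Download Manager tab. A couple of things to try:
1. With the login account that you have, could you try logging in to the DHS website and then click on the Download Manager tab. This should take you to a page that looks something like this. Do you get this page?: ![image](https://user-images.githubusercontent.com/15249565/52851968-4990af00-310f-11e9-9edc-768780e92a25.png)
2. If yes, then you may need to give me a bit more information. Before running `get_datasets(dats)`, could you debug the following: `debug(rdhs:::available_datasets)`. Then as you step through you'll reach the following lines:
# Grab the content from that and start creation for last post request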
writeBin(z$content, tf)
# load the text
y <- readLines(tf, warn = FALSE)
Could you dump and upload what y looks like here. This should be the Download Manager web page, from which I grab all the selectable download options before making another POST request to create the url with all the download links available for your account. In grabbing the selectable options the error is thrown due to not finding any selectable options. So if you can see them in step 1, then this should let me know what's going on.
Thanks again for trying it out and trying to get this to work,
All the best,
OJ
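A note on the error message itself: below is a minimal base R sketch of how a `names<-` call can produce this exact length mismatch when nothing is matched. The variable names are illustrative and are not taken from the `rdhs` source; the only assumption is that the parsing step ended up with zero selectable options.

```r
# Stand-in for "no selectable download options were found on the page"
matched_lines <- character(0)
parsed_options <- as.list(matched_lines)   # a list of length 0

# paste0() treats a zero-length argument as "", so the result has length 1
new_names <- paste0("filedatatypelist_", matched_lines)
length(new_names)

# Assigning one name to a length-0 list fails with a length mismatch error
err <- tryCatch(
  {
    names(parsed_options) <- new_names
    NULL
  },
  error = function(e) conditionMessage(e)
)
print(err)
```

So the message says little about the real cause; it only signals that the list of options was empty.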
This is going to turn into one of those things where it's me being a complete idiot... sorry if that's the case.
I can't see a download manager tab and ctrl + f isn't finding me anything similar.
I get to this page:
and then choosing a region gets me to here:
Which is all at https://dhsprogram.com/data/dataset_admin/index.cfm.
ps I don't think I'm accidentally posting screen shots of private information other than my email address. If you notice something can you let me know and I'll delete it...
okay this makes more sense. (and i don't think you're posting anything private).
So to access the DHS datasets, you first have to make the account with a project name and then request dataset access. So in that second screenshot, if you select all the datasets available, then in a day or so, once the DHS has approved your request, you should have a Download Manager available.
Oh great thanks. I won't even count that as me being totally stupid.
I'll set that up and get back to you. I'll also make sure to document this carefully in this package. Given the sideways way I've started using rdhs I've never even read the docs so no idea if it's in there. But this perhaps highlights a useful place to put an informative error message.
Thanks again!
Hey, yeah agree there should be a message to flag this up. Will make an issue for this. Thanks and let me know how it goes once you have datasets access.
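As a rough idea of what such a message could look like, here is a minimal sketch of a guard that fails early when no selectable options are found. `check_download_options()` is a hypothetical helper written for illustration, not an existing `rdhs` function, and the wording of the message is only a suggestion.

```r
# Hypothetical helper for illustration; not part of rdhs
check_download_options <- function(option_lines) {
  if (length(option_lines) == 0) {
    stop(
      "No selectable datasets were found on the Download Manager page. ",
      "Have you requested dataset access for this project, ",
      "and has the DHS approved the request?",
      call. = FALSE
    )
  }
  invisible(option_lines)
}

# With an empty result the user gets the hint instead of a 'names' error
msg <- tryCatch(
  check_download_options(character(0)),
  error = function(e) conditionMessage(e)
)
print(msg)
```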
Hi @OJWatson. Just to say sorry I'm being so slow with this. I didn't get it to work and didn't find time to work out why.
Can't remember if I said that I was on paternity leave for the last 6 months. I'm now back at work so maybe I'll find time soonish.
No worries at all and congrats. There is no rush at all from my end, and I'm nearing the end of my PhD, so I will also be slow responding on side projects (like `rdhs`).
Cheers and good luck with finishing the phd!
Hi,
Starting finally to look at this.
I did sign up for all datasets. So my Download Manager page now looks like yours.
> geo <- get_datasets(dats)
These requested datasets are not available from your DHS login credentials:
---
AOGE52FL.zip, AOGE61FL.ZIP, AOGE71FL.zip, BJGE61FL.ZIP, BFGE61FL.zip, BFGE71FL.zip, BUGE71FL.ZIP, CMGE61FL.zip, CDGE52FL.zip, CDGE61FL.zip, CIGE61FL.ZIP, GHGE71FL.zip, GHGE7AFL.zip, GNGE61FL.ZIP, KEGE7AFL.zip, LBGE5CFL.ZIP, LBGE61FL.ZIP, LBGE71FL.ZIP, MDGE61FL.ZIP, MDGE6AFL.zip, MDGE71FL.zip, MWGE71FL.zip, MWGE7IFL.ZIP, MLGE63FL.zip, MLGE71FL.zip, MZGE61FL.ZIP, NGGE61FL.ZIP, NGGE71FL.zip, RWGE5BFL.zip, RWGE61FL.ZIP, SNGE5AFL.zip, SNGE61FL.ZIP, SNGE6IFLSR.zip, SNGE6AFL.zip, SNGE71FLSR.ZIP, SNGE71FL.ZIP, SNGE7AFL.ZIP, SNGE7AFLSR.ZIP, SNGE7IFLSR.ZIP, SNGE7IFL.ZIP, SLGE71FL.ZIP, TZGE52FL.zip, TZGE6AFL.ZIP, TZGE7AFL.zip, TZGE7IFL.ZIP, TGGE62FL.zip, TGGE71FL.ZIP, UGGE5AFL.zip, UGGE71FL.zip, UGGE7AFL.ZIP
---
Please request permission for these datasets from the DHS website to be able to download them
So `get_datasets` now runs without errors but I just don't get any data back.
I've again tried to work out what is and isn't working, but I really don't even know how to approach it as so much of the stuff is internal.
debug(rdhs:::available_datasets)
geo <- get_datasets(dats)
This doesn't step through the function line by line or anything like that, which I guess it should do. I've never used `debug`.
So I tried doing stuff like this:
client <- rdhs:::.rdhs$client
private <- client$.__enclos_env__$private
But I still get totally stuck. I got to the point where I was trying to run `private$check_available_datasets(dataset_filenames)` line by line, but I don't really understand where that is defined, and it uses a bunch of other stuff like `self` that again I don't understand where it is or where it comes from.
So I'm afraid I'm stuck. Again, any help much appreciated!
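For anyone else who has not used `debug()` before, here is a minimal base R sketch on a toy function (`toy_fetch` is made up for illustration). `debug()` only flags a function; the browser opens the next time that function is actually called. So if `get_datasets()` never reaches `available_datasets` on a given run (an assumption, not something confirmed in this thread), no stepping would happen.

```r
# Toy function standing in for an internal function you want to inspect
toy_fetch <- function(x) {
  y <- toupper(x)
  y
}

debug(toy_fetch)               # flag the function for debugging
flagged <- isdebugged(toy_fetch)
print(flagged)

# In an interactive session, calling toy_fetch("a") would now open the
# browser: type n to step to the next line, a variable name (e.g. y) to
# print it, and Q to quit.

undebug(toy_fetch)             # remove the flag when done
print(isdebugged(toy_fetch))
```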
OOok. @Danpfeffer and @shk313 got this to work no problem and it turned out to be me being an idiot. I never requested the GPS data specifically. Works for me now.
So, I'll follow up on those funny study codes. Possibly just a copy error on our side or something. Then we'll pretty much just do some careful documentation, maybe add some errors reminding people (i.e. me) to request the GPS data and add it into the package. I'll leave the issue open until the functionality is fully merged into master.
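A minimal sketch of what such a reminder might look like. The `available` vector is made up, and the check assumes that characters 3 and 4 of a DHS file name give the file type, with "GE" meaning geographic data; treat both as assumptions rather than how the package does it.

```r
# A few of the requested files from the listing above, plus a made-up
# list of files the account can download (hypothetical values)
requested <- c("AOGE52FL.zip", "AOGE61FL.ZIP", "BJGE61FL.ZIP")
available <- c("AOIR51FL.ZIP", "BJIR61FL.ZIP")

# Compare case-insensitively, since the listing mixes .zip and .ZIP
missing <- requested[!toupper(requested) %in% toupper(available)]

# Assumption: "GE" in positions 3-4 marks a geographic (GPS) dataset
missing_gps <- missing[toupper(substr(missing, 3, 4)) == "GE"]

if (length(missing_gps) > 0) {
  message(
    "GPS datasets requested but not available: ",
    paste(missing_gps, collapse = ", "),
    ". Have you requested the GPS data specifically on the DHS website?"
  )
}
```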
@OJWatson I guess "author" is appropriate so we'll add you as that. If for some reason you'd rather just be a "contributor" feel free to say. Thanks again!
This is all added and documented. Heading to CRAN.
I couldn't work out how to get testing to work but I'll open a separate issue for that and probably won't get around to fixing it for a while.
Hi, although this issue is closed, I would like to come back to it. I got the same error as @timcdlucas, although apparently not for the same reasons. When I call:
get_datasets("EGIR4ASV.rds")
I get the following error:
Logging into DHS website...
Error in names(filedatatypelist_DHS) <- paste0("filedatatypelist_", qdapRegex::rm_between(filedatatypelist_DHS_line, :
'names' attribute [1] must be the same length as the vector [0]
I have read @OJWatson's answer, which is shown below this message. In my case, I do have access to the file I am requesting, and I can see the Download Manager on the DHS website. I have tried to debug and reached the `y`. In my case `y` is a very long character vector whose first lines look like:
[1] "<!DOCTYPE html> <html lang=\"en\"> <!-- Content Copyright Macro International -->"
[2] "<!-- Page generated 2021-04-21 16:19:52 on server 1 by CommonSpot Build 10.6.0.30 (2019-10-04 12:35:29) -->"
[3] "<!-- JavaScript & DHTML Code Copyright © 1998-2019, PaperThin, Inc. All Rights Reserved. --> <head>"
[4] "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\" />"
[5] "<meta name=\"Description\" id=\"Description\" content=\"Download Datasets\" />"
[6] "<meta name=\"Generator\" id=\"Generator\" content=\"CommonSpot Build 10.6.0.30\" />"
[7] "<title>The DHS Program - Download Datasets</title> <style id=\"cs_antiClickjack\">body{display:none !important;position:absolute !important;top:-5000px !important;}</style><script type=\"text/javascript\">(function(){var chk=0;try{if(self!==top){var ts=top.document.location.href.split('/');var ws=window.document.location.href.split('/');if(ts.length<3||ws.length<3)chk=1;else if(ts[2]!==ws[2])chk=2;else if(ts[0]!==ws[0])chk=3;}}catch(e){chk=4;}if(chk===0){var stb=document.getElementById(\"cs_antiClickjack\");stb.parentNode.removeChild(stb);}else{top.location = self.location}})();</script> <script>"
[8] "var jsDlgLoader = '/data/dataset_admin/loader.cfm';"
But I am a bit clueless about what to do now. @OJWatson, does that help you understand what is going on? Do you need the whole string?
Many thanks
Answer from @OJWatson on Feb 15, 2019:
Hmm okay, so it seems to be erroring at the stage where `rdhs` goes to the Download Manager tab. A couple of things to try:
1. With the login account that you have could you try logging in to the DHS website and then click on the Download Manager tab. This should take you to a page that looks something like this. Do you get this page?: ![image](https://user-images.githubusercontent.com/15249565/52851968-4990af00-310f-11e9-9edc-768780e92a25.png)
2. If yes then you may need to give me a bit more information. Before running `get_datasets(dats)` could you debug the following `debug(rdhs:::available_datasets)`. Then as you step through you'll reach the following lines:
# Grab the content from that and start creation for last post request
writeBin(z$content, tf)
# load the text
y <- readLines(tf, warn = FALSE)
Could you dump and upload what y looks like here. This should be the Download Manager web page, from which I grab all the selectable download options before making another POST request to create the url with all the download links available for your account. In grabbing the selectable options the error is thrown due to not finding any selectable options. So if you can see them in step 1, then this should let me know what's going on.
Thanks again for trying it out and trying to get this to work,
All the best,
OJ
Hey,
I was sending someone who wanted to get all the data used for the malaria maps to this package and noticed the DHS coordinates were missing and then saw this issue :)
The following gets you very close to what you may want. I've started it in a fork, but there were a couple of `dhs_ids` I could not match correctly within the DHS surveys, which are commented in the code below. Most of the function documentation is the same as that for `rdhs::set_rdhs_config`, which does the auth bits for you. Let me know what you think/any ideas on the odd `dhs_ids`.
Ta, OJ
Originally posted by @OJWatson in https://github.com/malaria-atlas-project/malariaAtlas/issues/5#issuecomment-449117069