ajdamico / lodown

locally download and prepare publicly-available microdata
GNU General Public License v3.0
97 stars 47 forks source link

Fix NHANES download #179

Closed LukasWallrich closed 3 years ago

LukasWallrich commented 3 years ago

This PR fixes ajdamico/asdfree#357 and a couple of related errors/warnings.

Specifically, inside lodown_nhanes(), there were a couple of download errors because catalog is a tibble, which retains type when subset with single []. This led to the URL and then also the filename not being processed correctly, so that I replaced it by [[]]

In lodown(), I changed the default value of case_count from NA to NAinteger. This is because a column of all NAs in a tibble gets the type logical, so that catalog[ i , 'case_count' ] <- nrow( x ) in lodown_nhanes() failed with a type conversion error prior to that change. Similarly, tibbles issue warnings when you try to access columns that don't exist, so that I added to if clauses to the code that creates the unique_directories.

I hope that all makes sense - please let me know if anything is unclear.

ajdamico commented 3 years ago

thank you Lukas!!

your changes to lodown.R look quite good!

rather than using [[]] could you instead just coerce the catalog back to a data.frame? the behavior of tibbles changed here since i wrote this, i'd rather keep catalog objects in base R

if you just revise that part, i'll accept the pull request !

i hugely appreciate your time

LukasWallrich commented 3 years ago

Coercing it back to a data.frame is indeed simpler - I just thought you might want to move towards tibble compatibility, given that quite a few catalogs use rvest::html_table which returns tibbles ... but I have simplified the PR as requested.

ajdamico commented 3 years ago

yeah, that's a behavior change in tidyverse that snuck up recently and i ought to undo :-/

i really appreciate it!