NIEHS / beethoven

BEETHOVEN is: Building an Extensible, rEproducible, Test-driven, Harmonized, Open-source, Versioned, ENsemble model for air quality
https://niehs.github.io/beethoven/

Write a wrapper function for downloading data #215

Closed sigmafelix closed 7 months ago

sigmafelix commented 8 months ago

Following today's discussion, we agreed to write a wrapper function to simplify the data download procedure. A basic form of the wrapper would look like:

download_data <-
  function(
    dataset_name = c("aqs", "ecoregion", "geos", "gmted", "koppen",
                     "koppen-geiger", "koppengeiger", "merra2", "merra",
                     "narr", "nlcd", "noaa", "sedac"),
    date_start = "2023-09-01",
    date_end = "2023-09-01",
    parameter_code,
    resolution_temporal,
    variables = NULL,
    collection = NULL,
    data_resolution,
    product = c("MOD09GA", "MOD11A1", "MOD06_L2",
                "MCD19A2", "MOD13A2", "VNP46A2"),
    version = "61",
    horizontal_tiles = c(7, 13),
    vertical_tiles = c(3, 6),
    nasa_earth_data_token = NULL,
    directory_to_save = "./input/modis/raw/",
    data_download_acknowledgement = FALSE,
    write_command_only = FALSE
  ) {

    dataset_name <- match.arg(dataset_name)

    # common elements ...
    # directory presence, sanity check in path strings
    # ** DATA DEPENDENT: identify base URL **
    # ** DATA DEPENDENT: subdataset conditionals **
    # flush wget commands
    #   - save commands in txt file
    #   - run commands
    # unzip
    # remove zips
  }
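
The commented steps in the skeleton above could be filled in roughly as follows. This is a minimal sketch under stated assumptions, not the implementation in the repository: `download_sketch()`, the `base_urls` lookup table, the `example.org` URLs, the one-zip-per-day file naming, and the choice to stop when `data_download_acknowledgement` is `FALSE` are all hypothetical and only illustrate the flow (directory check, data-dependent base URL, saving the wget commands to a text file, then optionally running them, unzipping, and removing the zips).

```r
# Hypothetical lookup table: placeholder URLs, not the real data endpoints.
base_urls <- c(
  aqs = "https://example.org/aqs/",
  narr = "https://example.org/narr/"
)

download_sketch <- function(
  dataset_name = c("aqs", "narr"),
  date_start = "2023-09-01",
  date_end = "2023-09-01",
  directory_to_save = tempdir(),
  data_download_acknowledgement = FALSE,
  write_command_only = TRUE
) {
  dataset_name <- match.arg(dataset_name)

  # Assumption: the wrapper refuses to proceed without acknowledgement.
  if (!isTRUE(data_download_acknowledgement)) {
    stop("Set data_download_acknowledgement = TRUE to proceed.")
  }

  # Common element: make sure the target directory exists.
  if (!dir.exists(directory_to_save)) {
    dir.create(directory_to_save, recursive = TRUE)
  }

  # Data dependent: identify the base URL.
  base_url <- base_urls[[dataset_name]]

  # Assumption: one zip file per day, named <dataset>_<YYYYMMDD>.zip.
  dates <- seq(as.Date(date_start), as.Date(date_end), by = "day")
  file_names <- paste0(dataset_name, "_", format(dates, "%Y%m%d"), ".zip")
  zip_paths <- file.path(directory_to_save, file_names)

  # Build the wget commands and save them in a text file.
  commands <- paste0("wget ", base_url, file_names, " -O ", zip_paths)
  commands_file <- file.path(
    directory_to_save,
    paste0(dataset_name, "_wget_commands.txt")
  )
  writeLines(commands, commands_file)

  if (!write_command_only) {
    # Run the commands, unzip the results, then remove the zips.
    for (cmd in commands) system(cmd)
    for (zip_path in zip_paths) {
      utils::unzip(zip_path, exdir = directory_to_save)
    }
    file.remove(zip_paths)
  }

  invisible(commands_file)
}

# Usage: only write the commands, without downloading anything.
commands_file <- download_sketch(
  dataset_name = "narr",
  date_start = "2023-09-01",
  date_end = "2023-09-03",
  directory_to_save = file.path(tempdir(), "wrapper_sketch"),
  data_download_acknowledgement = TRUE,
  write_command_only = TRUE
)
readLines(commands_file)
```

Keeping the data-dependent parts in a lookup table (or a `switch()`) means the common steps are written once, which is the point of the wrapper.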

There are several discussion points:

sigmafelix commented 8 months ago

Related to NIEHS/beethoven#174, NIEHS/beethoven#189, and NIEHS/amadeus#49. I will discuss with @mitchellmanware to minimize task conflicts and streamline our workflow.

sigmafelix commented 8 months ago

After a discussion, we decided to merge our changes and then replace the repeated lines in each download function with the reusable functions I wrote.
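
As an illustration of what such reusable functions might look like, here is a minimal sketch. The helper names (`add_trailing_slash()`, `ensure_directory()`, `require_acknowledgement()`) and their behavior are hypothetical; the actual support functions are not shown in this thread and may differ.

```r
# Hypothetical helper: make sure a directory path ends with a slash,
# so that file names can be pasted onto it safely.
add_trailing_slash <- function(path) {
  if (endsWith(path, "/")) path else paste0(path, "/")
}

# Hypothetical helper: create the directory if it does not exist yet.
ensure_directory <- function(path) {
  if (!dir.exists(path)) {
    dir.create(path, recursive = TRUE)
  }
  invisible(path)
}

# Hypothetical helper: stop early unless the user has acknowledged
# the download (assumed meaning of data_download_acknowledgement).
require_acknowledgement <- function(data_download_acknowledgement) {
  if (!isTRUE(data_download_acknowledgement)) {
    stop("Set data_download_acknowledgement = TRUE to proceed.", call. = FALSE)
  }
  invisible(TRUE)
}

# Usage: each download function would start with the same three calls.
directory_to_save <- add_trailing_slash(file.path(tempdir(), "support_sketch"))
ensure_directory(directory_to_save)
require_acknowledgement(TRUE)
```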

I think that writing a wrapper function is not urgent. It can be done when we refactor the code around the time we wrap up the covariate calculation milestone.

TODO before the covariate calculation milestone is due

Mitchell Manware, December 20: The wrapper function has been incorporated into the URL unit tests for each dataset. The tests are on branch mm_download_unit_tests_1214 in tests/testthat/test-download_functions.R.
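
A URL check of this kind could be sketched as below. This is an assumption-laden illustration in base R, not the contents of test-download_functions.R: `extract_urls()` and `check_url_format()` are hypothetical helpers, the `example.org` URLs are placeholders, and the sketch only checks the shape of the URLs in the generated wget commands without touching the network. In a testthat file, the same checks could sit inside `test_that()` blocks.

```r
# Hypothetical helper: pull the first http(s) URL out of each command.
# Commands without a URL are dropped, so a length mismatch signals a problem.
extract_urls <- function(commands) {
  regmatches(commands, regexpr("https?://[^ ]+", commands))
}

# Hypothetical helper: every URL starts with the expected base URL and
# contains no doubled slash after the scheme (a common path-joining error).
check_url_format <- function(urls, expected_prefix) {
  without_scheme <- sub("^https?://", "", urls)
  all(startsWith(urls, expected_prefix)) &&
    !any(grepl("//", without_scheme, fixed = TRUE))
}

# Usage with placeholder commands, as they might appear in a saved txt file.
commands <- c(
  "wget https://example.org/narr/narr_20230901.zip -O ./narr_20230901.zip",
  "wget https://example.org/narr/narr_20230902.zip -O ./narr_20230902.zip"
)
urls <- extract_urls(commands)
stopifnot(length(urls) == length(commands))
stopifnot(check_url_format(urls, "https://example.org/narr/"))
```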

sigmafelix commented 8 months ago

@mitchellmanware

I updated my branch with a wrapper function, download_data(), and replaced the repeated lines with support functions (in ./input/Rinput/download_functions/download_support.R). Could you merge my branch into yours and then add tests for the remaining download functions? For local testing, download.R and download_support.R will need to be copied into ./R.

Thank you.

cf. Wrapper function: https://github.com/Spatiotemporal-Exposures-and-Toxicology/NRTAPmodel/blob/ead3433f4beb1f9b20d0b22d85a96e15e365b81f/input/Rinput/download_functions/download.R#L1-L80

sigmafelix commented 7 months ago

TRI/AADT (from NEI) data download functions were added to my development branch. I will close this issue.