NIEHS / beethoven

BEETHOVEN is: Building an Extensible, rEproducible, Test-driven, Harmonized, Open-source, Versioned, ENsemble model for air quality
https://niehs.github.io/beethoven/

`crew` and `apptainer` based refactoring #361

Open kyle-messier opened 2 days ago

kyle-messier commented 2 days ago

Refactoring Pipeline for crew and apptainer

Notes and Checklist for Updating, Started by @kyle-messier

Design targets for optimal parallelization and updating

Download

Result: raw data downloaded

Updating: Skip based on pattern

  • filename config
  • branching by file
  • pattern: dataset, variable, year, {location?}
  • set_args_download is a good start for a config file
  • [ ] set_args_download output should be same length/size
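As a rough illustration of the "skip based on pattern" idea in the list above, the sketch below builds one file name per branch from the pattern dataset, variable, year, {location?} and reports which branches still need downloading. It is written in Python only as a language-neutral sketch; the pipeline itself is R. The helper names (`branch_filename`, `pending_downloads`), the `_` separator, and the `.dat` suffix are all hypothetical and are not taken from `set_args_download`.

```python
from pathlib import Path


def branch_filename(dataset, variable, year, location=None, suffix=".dat"):
    """Build a file name from the pattern dataset, variable, year, {location?}.

    The "_" separator and the ".dat" suffix are assumptions for illustration.
    """
    parts = [dataset, variable, str(year)]
    if location is not None:
        parts.append(location)
    return "_".join(parts) + suffix


def pending_downloads(config, raw_dir):
    """Return the file names in `config` that are not yet present in `raw_dir`.

    `config` is a list of dicts with keys dataset, variable, year and,
    optionally, location: one dict per download branch.
    """
    raw_dir = Path(raw_dir)
    pending = []
    for entry in config:
        name = branch_filename(**entry)
        # Skip branches whose file already exists on disk.
        if not (raw_dir / name).exists():
            pending.append(name)
    return pending
```

Because every branch maps to exactly one predictable file name, re-running the download step only touches branches whose files are missing.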

Process

Result: sf and terra objects for aqs and covariates

  • Process branching can mirror the download branches
  • Merge branches by dataset

kyle-messier commented 2 days ago

@mitchellmanware @sigmafelix

kyle-messier commented 1 day ago

@sigmafelix Can you provide some context on how mod06_links_2018_2022.csv was generated?

sigmafelix commented 1 day ago

@kyle-messier

Thank you for the suggestion. I understand this direction will be necessary, since the software environment has become too difficult to work out just to get everything running. As far as I recall, this is the third time we are revamping a significant portion of the pipeline. Given that the primary objective at this stage of development is to present a proof of concept, I'm a bit unsure how long it will take to resolve everything during the refactoring, which will add to the time needed to advance the project.

Concerns aside, I would like to comment on the checklist:

sigmafelix commented 1 day ago

@kyle-messier

See the download_modis documentation: https://github.com/NIEHS/amadeus/blob/541bd6898f9e9aa8890a39b95ea1268e25977615/R/download.R#L2160-L2161

https://ladsweb.modaps.eosdis.nasa.gov/search/order/4/MOD06_L2--61/[date1]..[date2]/DNB/-130,52,-60,20

We ask users to query MOD06_L2 products on the page linked above, using a date range and a spatial extent, and then download the resulting CSV file of file links.
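For reference, filling in the URL template quoted above could look like the minimal sketch below (Python, for illustration only). The function name `mod06_query_url` is hypothetical, the ISO `YYYY-MM-DD` date format is an assumption about what the page expects for [date1] and [date2], and the default extent values are copied in the same order the template gives them.

```python
from datetime import date

# Template copied from the URL above; only the placeholders are parameterized.
LADSWEB_TEMPLATE = (
    "https://ladsweb.modaps.eosdis.nasa.gov/search/order/4/"
    "MOD06_L2--61/{date1}..{date2}/DNB/{extent}"
)


def mod06_query_url(date1, date2, extent=(-130, 52, -60, 20)):
    """Fill the LADSWEB search URL with a date range and a spatial extent.

    Assumption: dates are written as YYYY-MM-DD. The extent values are kept
    in the order in which they appear in the original template.
    """
    if date2 < date1:
        raise ValueError("date2 must not be earlier than date1")
    extent_str = ",".join(str(value) for value in extent)
    return LADSWEB_TEMPLATE.format(
        date1=date1.isoformat(),
        date2=date2.isoformat(),
        extent=extent_str,
    )


# Illustrative date range only; the exact days used for the CSV are not stated.
print(mod06_query_url(date(2018, 1, 1), date(2022, 12, 31)))
```

The CSV of file links still has to be downloaded manually from the resulting page; the sketch only reproduces the query address.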

sigmafelix commented 5 hours ago

Refactoring (recoding, actually) idea: crew-based

Download            Calculate           Model
Feature1-Period1    Feature1-Period1
Feature1-Period2    Feature1-Period2
Feature2-Period1    Feature2-Period1
Feature2-Period2    Feature2-Period2
...                 ...
FeatureP-Period2    FeatureP-Period2
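
The grid above can be read as one independent branch per feature and period, each running Download and then Calculate. The sketch below is a language-neutral illustration in Python, not the crew API: a thread pool stands in for crew workers, and `download`, `calculate`, and `run_pipeline` are hypothetical placeholders that only pass strings around.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def branch_names(features, periods):
    """Enumerate every feature-period combination as one branch name."""
    return [f"{feature}-{period}" for feature, period in product(features, periods)]


def download(branch):
    """Placeholder download step: returns a label instead of fetching data."""
    return f"raw:{branch}"


def calculate(raw):
    """Placeholder calculate step: consumes the output of its own download."""
    return raw.replace("raw:", "calc:", 1)


def run_branch(branch):
    """Run Download, then Calculate, for a single branch."""
    return calculate(download(branch))


def run_pipeline(features, periods, workers=2):
    """Dispatch each branch to a worker pool and collect results by name."""
    branches = branch_names(features, periods)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with branches.
        results = list(pool.map(run_branch, branches))
    return dict(zip(branches, results))


print(run_pipeline(["Feature1", "Feature2"], ["Period1", "Period2"]))
```

Since no branch depends on another until the Model step, the branches can be scheduled in any order across workers.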

Question and future work