NIEHS / beethoven

BEETHOVEN is: Building an Extensible, rEproducible, Test-driven, Harmonized, Open-source, Versioned, ENsemble model for air quality
https://niehs.github.io/beethoven/

Spin off packages #255

Closed kyle-messier closed 6 months ago

kyle-messier commented 7 months ago

Steps of the analysis could be useful as standalone packages; the AP model package could then be simplified. Potential options:

sigmafelix commented 7 months ago

@Spatiotemporal-Exposures-and-Toxicology Data process has the longest list of dependencies:

- Data download: rvest, httr, stringr, testthat
- Data process: sf, terra, exactextractr, stars (lwgeom, FNN), data.table, dplyr, foreach, doParallel, parallelly, rlang
- Model fit: BART, ranger, xgboost, torch, data.table

If we split the package, separate packages for data download, data processing, stdt + starray, and cross-validation would be reasonable and reusable for other purposes. The main package would then only run the pipeline, pulling in the spin-off packages through the Remotes: field in DESCRIPTION.
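
As a rough sketch of that layout, the main package's DESCRIPTION might contain something like the fragment below. The Imports/Remotes pairing follows the convention used by the remotes and devtools packages: Imports declares the dependency, and Remotes only says where to find it when it is not on CRAN. The repository path NIEHS/amadeus is an assumption (this thread names only the amadeus package, not its location), and any other spin-off package would need its own entry in both fields.

```
Package: beethoven
Imports:
    amadeus
Remotes:
    NIEHS/amadeus
```

Under those assumptions, installing the main package with remotes::install_github("NIEHS/beethoven") should resolve amadeus from GitHub instead of looking for it on CRAN.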

kyle-messier commented 7 months ago

We can discuss at the group meeting.

Additionally, for fitting models, we may want to consider the tidymodels ecosystem, which may already cover most of the models we need.

kyle-messier commented 7 months ago

@sigmafelix On second thought, while we can use tidymodels for some or all of the models, I don't think we need to force ourselves to do so. Using native packages will provide more control.

sigmafelix commented 7 months ago

@Spatiotemporal-Exposures-and-Toxicology Most of our base learner functions are tested, so I will move them back to ./R in the next push.

kyle-messier commented 7 months ago

sigmafelix commented 6 months ago

The functions that were in the main branch have been transferred to amadeus. @mitchellmanware and I will work on merging and refactoring the code in the amadeus repository. Thanks!