kyle-messier closed 9 months ago
@Spatiotemporal-Exposures-and-Toxicology Data process has the longest list of dependencies:

- Data download: rvest, httr, stringr, testthat
- Data process: sf, terra, exactextractr, stars (lwgeom, FNN), data.table, dplyr, foreach, doParallel, parallelly, rlang
- Model fit: BART, ranger, xgboost, torch, data.table
If we split the package, separate packages for data download / process / stdt+starray / cross-validation would be reasonable and reusable for other purposes. The main package would then only run the pipeline, pulling in the split packages through the `Remotes:` field in DESCRIPTION.
We can discuss at the group meeting.
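As a rough sketch of what the `Remotes:` setup could look like in the main package's DESCRIPTION: the package and organization names below (`mainpipeline`, `downloadpkg`, `processpkg`, `example-org`) are placeholders, not actual repositories. Each split package is listed under `Imports:`, and `Remotes:` tells remotes/devtools/pak where to install it from when it is not on CRAN.

```
Package: mainpipeline
Imports:
    downloadpkg,
    processpkg
Remotes:
    github::example-org/downloadpkg,
    github::example-org/processpkg
```

With this layout the heavy spatial dependencies (sf, terra, etc.) would sit in the processing package's own `Imports:` rather than in the main package.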
Additionally, for fitting models, we may want to consider the tidymodels ecosystem - it could have the majority of models we need.
@sigmafelix On second thought, while we can use tidymodels for some or all of the models, I don't think we need to force ourselves to do so. Using native packages will provide more control.
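To make the trade-off concrete, here is a minimal sketch that fits the same random forest once through the native ranger interface and once through a tidymodels (parsnip) specification. The toy data frame and hyperparameter values are made up for illustration and are not from our pipeline.

```r
library(ranger)
library(parsnip)

set.seed(1)
# Toy data standing in for a real covariate table (hypothetical).
df <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
df$y <- 2 * df$x1 - df$x2 + rnorm(100, sd = 0.1)

# Native interface: every ranger argument is directly available.
fit_native <- ranger::ranger(y ~ ., data = df, num.trees = 200, mtry = 1)

# tidymodels interface: the same engine behind a unified model specification.
spec <- parsnip::rand_forest(mode = "regression", trees = 200, mtry = 1) |>
  parsnip::set_engine("ranger")
fit_tidy <- parsnip::fit(spec, y ~ ., data = df)

# The two interfaces return predictions in different shapes.
pred_native <- predict(fit_native, data = df)$predictions  # numeric vector
pred_tidy <- predict(fit_tidy, new_data = df)$.pred        # column of a tibble
```

The parsnip version makes it easy to swap engines, while the native call exposes engine-specific arguments without going through `set_engine()`.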
@Spatiotemporal-Exposures-and-Toxicology Most of our base learner functions are tested, so I will move them back to `./R` in the next push.
Functions that were in the `main` branch have been transferred to `amadeus`. @mitchellmanware and I will work on merging and refactoring the code in the `amadeus` repository. Thanks!
Steps of the analysis could be useful as standalone packages; the AP model package could then be simplified. Potential options: