drivendataorg / water-supply-forecast-rodeo-runtime

Data and runtime repository for the Water Supply Forecast Rodeo competition on DrivenData
https://watersupply.drivendata.org
MIT License

Please add the following R packages to the runtime #13

Closed eli-asarian closed 9 months ago

eli-asarian commented 9 months ago

Per Jay Qi's instructions (https://community.drivendata.org/t/provisions-for-r-users/9476/6), to enable me to use the R packages that I want for this contest, I am requesting that the following packages be added to the runtime from conda-forge: r-plyr, r-tidyverse, r-terra, r-sf, r-zoo, r-doparallel, r-foreach, r-iterators, r-nlme, r-mgcv, r-qgam, r-visreg, r-modelr, r-yardstick, r-missmda, r-mvtnorm, r-mice, r-factominer, r-lattice, r-matrix, r-shiny, r-rcpp, r-dataretrieval-feedstock

jayqi commented 9 months ago

Hi @eli-asarian,

I've added the following requested packages in this commit.

Regarding the remaining requested packages:

eli-asarian commented 9 months ago

Thanks @jayqi! Perhaps I was overly inclusive in my list of packages. On the other hand, I am concerned that many packages have dependencies on other packages and it is difficult for me to ascertain which of these dependencies are actually required to do what I want to do and which are not necessary (i.e., may be used for parts of the package that I will not be using in my solution). Responses to your specific questions:

jayqi commented 9 months ago

Hi @eli-asarian,

You should only specify direct dependencies of your solution. You do not need to specify dependencies of those dependencies. As with any functional package manager (like R's install.packages), conda will by default figure out the necessary recursive dependencies and install them.

So for example, if you are not directly using shiny, then you should not list it. The package manager will figure out if shiny is necessary and install a proper version. (You can see that it is in fact already included in the environment because it is a subdependency.) Overspecifying unnecessary packages makes the requirements more unwieldy and, because of possible overconstraints, harder to solve correctly.
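To illustrate the point about direct versus recursive dependencies, here is a minimal Python sketch of how a package manager can expand a list of direct requirements into the full set of packages to install. The package names and the `TOY_INDEX` graph are hypothetical and made up for illustration; they are not real conda-forge metadata, and the sketch ignores version constraints, which a real solver such as conda's also has to satisfy.

```python
from collections import deque

# Hypothetical toy index: package -> its direct dependencies.
# Names and edges are invented for illustration only.
TOY_INDEX = {
    "modeling-pkg": ["plotting-pkg", "matrix-pkg"],
    "plotting-pkg": ["web-ui-pkg"],
    "matrix-pkg": [],
    "web-ui-pkg": [],
    "unused-pkg": [],
}


def resolve(direct, index):
    """Return every package needed, given only the direct requirements."""
    needed = set()
    queue = deque(direct)
    while queue:
        name = queue.popleft()
        if name in needed:
            continue  # already handled; also guards against cycles
        if name not in index:
            raise KeyError(f"unknown package: {name}")
        needed.add(name)
        queue.extend(index[name])  # follow this package's own dependencies
    return needed


if __name__ == "__main__":
    # Only the direct dependency is listed; the rest is discovered.
    print(sorted(resolve(["modeling-pkg"], TOY_INDEX)))
    # prints ['matrix-pkg', 'modeling-pkg', 'plotting-pkg', 'web-ui-pkg']
```

In this toy graph, `web-ui-pkg` is pulled in automatically because something that was requested depends on it, while `unused-pkg` is left out because nothing requested needs it. That is the same reason a subdependency does not need to be listed by hand.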

eli-asarian commented 9 months ago

Thanks for clarifying, @jayqi. So then the only packages that still need to be added are those listed in my second comment in this thread: r-dplyr, r-tidyr, r-lubridate, r-readr, r-stringr.

jayqi commented 9 months ago

@eli-asarian You should be all set as of this commit.

I am closing this issue to reflect that everything has been addressed. If you run into any issues, feel free to comment here and we can reopen the issue.

iamo-lsg commented 9 months ago

It turned out that 'caret' in the code execution environment does not have some dependencies we rely on. Could you please add the 'pls' and 'kernlab' packages?

jayqi commented 8 months ago

@iamo-lsg You should be all set: https://github.com/drivendataorg/water-supply-forecast-rodeo-runtime/commit/6ffa5db7af9fa6192e65ae119ebe4ed6f9453fc8

iamo-lsg commented 8 months ago

Hi @jayqi ,

Could you please add the 'snotelr' package?

jayqi commented 8 months ago

Hi @iamo-lsg,

Can you please clarify why you need snotelr? Are you planning to use SNOTEL stations that we aren't pre-downloading?

iamo-lsg commented 8 months ago

Please disregard our request to add snotelr. We use inputs from snotelr, which turns out to provide the data in a different format from the pre-downloaded data. We wanted to avoid additional data preprocessing, but now realize that using the updates from the competition folders might be a better solution, since there is always a possibility that APIs fail on some issue dates.