CredibilityLab / groundhog

Reproducible R Scripts Via Date Controlled Installing & Loading of CRAN & Git Packages
https://groundhogr.com/
GNU General Public License v3.0

Error when package conflict in 'depends' and package loaded by groundhog #61

Closed (wildingm closed this issue 3 years ago)

wildingm commented 3 years ago

I have a personal package with a few functions I use internally (and that I hope others will begin to use soon). It is installed from my organisation's GitLab repo and depends on dplyr to function.

I have a script I'm converting to use groundhog. If I call groundhog.library(tidyr, groundhogDay) and then try to load my package, I get an error: tidyr has already loaded dplyr in the background, so my package can't load the version of dplyr it wants.

Will you be considering how to use groundhog with non-CRAN packages to allow this sort of thing to work? Or is there a workaround for this situation that points a library(<package>) call to the groundhog-loaded version of a depended-upon package?

urisohn commented 3 years ago

If groundhog gets widely adopted, one possibility is to provide archival hosting of non-CRAN packages so that groundhog can load those too, but that is probably at least 6 months away, if it happens.

I can think of two solutions for you right now (I would recommend the first):

1) Use as the groundhog.day a date that leads to the version of dplyr your own package is using. With groundhog::toc('dplyr') you can see when each version was released and identify the date you want. Then, when you use groundhog first, it will not create a conflict.

2) Load your package first, and then tell groundhog to tolerate the dependency mismatch with the optional argument ignore.deps=c("dplyr"). This will load everything as current on the groundhog.day you chose, but will skip dplyr and keep the version you already have loaded.
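The two options could look roughly like the sketch below. It is only an illustration: `mypackage` is a hypothetical stand-in for the GitLab-hosted package, and the date is a placeholder to be replaced with one read off the toc() output. Run one option or the other, each in a fresh R session.

```r
library(groundhog)

# ---- Option 1: match the groundhog day to the dplyr version you need ----
toc("dplyr")                     # lists dplyr versions and when each was published
groundhog.day <- "2021-03-01"    # placeholder; pick a date from the toc() output
groundhog.library("tidyr", groundhog.day)
library(mypackage)               # hypothetical package name; loads after groundhog

# ---- Option 2: load your package first, then let groundhog skip dplyr ----
# (start a new R session before trying this instead of Option 1)
library(mypackage)               # hypothetical package name; brings in its own dplyr
groundhog.library("tidyr", groundhog.day, ignore.deps = c("dplyr"))
```

With option 2 the script depends on whichever dplyr is installed on the machine running it, which is why option 1 is the more reproducible choice.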

I think (1) is better because it is more reproducible. Note, though, that your own package is presumably loading whatever version of dplyr you have available in your library, so if you give your code to someone else, they will be loading a different version of dplyr.
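To tie option 1 to the dplyr version you currently have installed, a small helper along these lines might be useful. It is a sketch, not part of groundhog: `day_for_version` is a made-up name, and it assumes the table returned by toc() has `Version` and `Published` columns. The example table uses invented versions and dates so that it runs without any packages installed.

```r
# Look up the publication date of one version in a toc()-style table.
# Assumption: the table has columns `Version` and `Published`.
day_for_version <- function(toc_table, version) {
  version <- as.character(version)
  hit <- toc_table[as.character(toc_table$Version) == version, , drop = FALSE]
  if (nrow(hit) == 0) {
    stop("version ", version, " not found in table")
  }
  format(as.Date(hit$Published[1]), "%Y-%m-%d")
}

# Invented table for illustration only (not real dplyr releases)
example_toc <- data.frame(
  Version   = c("9.8.0", "9.9.0", "9.9.1"),
  Published = as.Date(c("2001-01-10", "2001-06-20", "2001-11-05")),
  stringsAsFactors = FALSE
)

day_for_version(example_toc, "9.9.0")

# Possible real use (needs groundhog and dplyr installed):
# day_for_version(groundhog::toc("dplyr"), packageVersion("dplyr"))
```

Treat the returned date as a starting point: any date from that version's publication up to the next release should select the same dplyr, and a date a few days after publication is probably the safer pick.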

wildingm commented 3 years ago

Thanks, that's helpful. I'll have a think about which approach will work best for us. I'd tend to agree that 1 is easier to manage, but 2 might be more stable across multiple systems.

I've shared your blog internally as I think this approach will be easier to implement than other ways of managing packages. I hope you gain some traction.