MichaelChirico / r-bugs

A ⚠️read-only⚠️mirror of https://bugs.r-project.org/

[BUGZILLA #16921] Rscript does not load the methods package by default #6239

Closed MichaelChirico closed 4 years ago

MichaelChirico commented 4 years ago

Created attachment 2093: Patch to include the methods package in Rscript defaults

This is a proposal to make Rscript load the methods package by default. The current behavior is a frequent source of trouble: code that runs in an interactive session, or when calling R directly, fails under Rscript, and it is not obvious to users why.

In addition, the performance impact is now largely attenuated due to lazy loading of package functions in recent R versions.

for i in {1..5}; do time R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats' Rscript -e 'invisible()';done
R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats' Rscript -e   0.08s user 0.05s system 95% cpu 0.137 total
R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats' Rscript -e   0.08s user 0.05s system 95% cpu 0.132 total
R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats' Rscript -e   0.08s user 0.05s system 95% cpu 0.132 total
R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats' Rscript -e   0.08s user 0.04s system 95% cpu 0.128 total
R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats' Rscript -e   0.09s user 0.05s system 94% cpu 0.139 total
for i in {1..5}; do time R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats,methods' Rscript -e 'invisible()';done
R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats,methods' Rscript   0.14s user 0.05s system 96% cpu 0.195 total
R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats,methods' Rscript   0.14s user 0.05s system 95% cpu 0.193 total
R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats,methods' Rscript   0.13s user 0.05s system 96% cpu 0.187 total
R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats,methods' Rscript   0.13s user 0.05s system 96% cpu 0.191 total
R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats,methods' Rscript   0.13s user 0.05s system 96% cpu 0.191 total
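
To put a single number on the two runs above, here is a minimal sketch that averages the "total" column of each. The values are copied from the timings above; the helper name mean_ms is only illustrative.

```shell
# "total" wall-clock seconds copied from the two benchmark runs above
without_methods="0.137 0.132 0.132 0.128 0.139"
with_methods="0.195 0.193 0.187 0.191 0.191"

# mean_ms (illustrative helper): average a space-separated list of
# seconds and print the result in milliseconds
mean_ms() {
  echo "$1" | tr ' ' '\n' | awk '{ s += $1 } END { printf "%.1f\n", s * 1000 / NR }'
}

base=$(mean_ms "$without_methods")
meth=$(mean_ms "$with_methods")
echo "without methods: ${base} ms"
echo "with methods:    ${meth} ms"

# difference between the two means
awk -v a="$base" -v b="$meth" 'BEGIN { printf "difference: %.1f ms\n", b - a }'
```

On these ten samples the means come out to 133.6 ms and 191.4 ms, so loading methods costs roughly 58 ms per invocation on this machine.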

The attached patch changes the default to 'datasets,utils,grDevices,graphics,stats,methods' and updates the documentation in ?Rscript and in R-admin. If changes are needed in other places, I am happy to provide updated patches.



MichaelChirico commented 4 years ago

That is (was?) a conscious design decision, at least when Rscript was released.

[ Disclaimer: I am not unbiased here, as littler came out a little earlier, always loaded the methods package, and was (and still is) quicker to start. ]



MichaelChirico commented 4 years ago

Thanks for raising this issue. Rscript was designed for running short snippets of R code in the context of programs like R CMD check. It sounds like you've encountered use cases where more complex programs are invoked via the command line. Would you please provide some examples of such cases? Thanks in advance.



MichaelChirico commented 4 years ago

Easy: Any job you may want to run from, say, cron that involves S4.

Works interactively in R (as methods is loaded), bombs immediately via Rscript. I never understood how that is supposed to make sense, and I don't even program with S4.
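
To make the failure concrete, here is a minimal sketch, assuming an R installation with Rscript on the PATH. The class name P and the helper run_s4_snippet are hypothetical; the package list without methods is the one used in the benchmark earlier in this thread. It sets R_DEFAULT_PACKAGES explicitly so that the list without methods is used regardless of what a given R version does by default.

```shell
# Package list without methods, as used in the benchmark earlier in the thread
old_defaults='datasets,utils,grDevices,graphics,stats'

# run_s4_snippet (hypothetical helper): define a trivial S4 class under the
# given default package list; $1 is a comma-separated list of packages
run_s4_snippet() {
  R_DEFAULT_PACKAGES="$1" Rscript -e 'setClass("P", representation(x = "numeric")); cat("ok\n")'
}

if command -v Rscript >/dev/null 2>&1; then
  # setClass() lives in methods, so this call is expected to fail
  run_s4_snippet "$old_defaults" || echo "failed without methods"
  # with methods appended, the same snippet should succeed
  run_s4_snippet "$old_defaults,methods"
else
  echo "Rscript not found; skipping"
fi
```

In a script that has to run under either configuration, an explicit library(methods) at the top sidesteps the question of which defaults are in effect.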



MichaelChirico commented 4 years ago

Yes, that is pretty obvious, but a lot of times scripts use S4 through packages that they are already attaching. Only things like as() might cause a problem. Personally, I think consistency should win here, but others have a different opinion.



MichaelChirico commented 4 years ago

I have run into this a number of times personally. Two recent cases were a user seeing different coverage results locally than on Travis (https://github.com/jimhester/covr/issues/180) and coverage of the testthat package being lower than expected (https://github.com/hadley/testthat/pull/475).

A simple GitHub search turns up more examples of people running into similar issues (https://github.com/search?utf8=%E2%9C%93&q=Rscript+methods+is%3Aissue).

I am not familiar enough with S4 dispatch to know exactly when this poses a problem, but anecdotally it seems to cause trouble any time you call a package using S4 from Rscript, as Dirk mentioned.

My feeling is more time has been wasted by users trying to debug the inconsistency than has been gained by the <60ms difference in startup time (from my benchmarks). This may have been a more reasonable trade-off when package loading was slower, but the slowdown with modern R seems slight to me.



MichaelChirico commented 4 years ago

I have also explained this maybe half a dozen times on StackOverflow.

The unintuitive nature of this is, as Jim argued, indeed less than helpful. I would welcome a reversal that brings Rscript closer to standard R.



MichaelChirico commented 4 years ago

Thanks for these examples. While some of them have helpfully exposed bugs in the methods package or other packages, many of them do argue for consistency.



MichaelChirico commented 4 years ago

I agree that interactive and non-interactive modes ought to use the same defaultPackages.

I suppose removing methods from the list of default packages in interactive mode would be too disruptive, so that implies the lesser evil is to add it for non-interactive mode.

On the other hand, the fact that some bugs would have gone undetected, had this already been the default, is something to consider.



MichaelChirico commented 4 years ago

This can now be closed; as of revision 74084, "Rscript now defaults to using the same default packages as R."



MichaelChirico commented 4 years ago

Yes, many thanks to Luke for making that change.



MichaelChirico commented 4 years ago

Kurt's suggestion to add the R_SCRIPT_DEFAULT_PACKAGES environment variable made this work.
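
For anyone who wants the old, lighter package list back for a particular script, a minimal sketch follows. It assumes that Rscript reads R_SCRIPT_DEFAULT_PACKAGES as a comma-separated list in the same format as R_DEFAULT_PACKAGES (the thread does not spell out the format), and that Rscript is on the PATH. The variable name legacy_defaults is only illustrative.

```shell
# Legacy Rscript package list (no methods), copied from the benchmark
# earlier in the thread
legacy_defaults='datasets,utils,grDevices,graphics,stats'

if command -v Rscript >/dev/null 2>&1; then
  # search() lists the attached packages; under the assumption above,
  # methods should be absent from the output
  R_SCRIPT_DEFAULT_PACKAGES="$legacy_defaults" Rscript -e 'print(search())'
else
  echo "Rscript not found; skipping"
fi
```

Setting the variable per invocation, as here, keeps the change local to one script rather than altering the defaults for every Rscript call in the environment.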

