datacarpentry / R-ecology-lesson

Data Analysis and Visualization in R for Ecologists
https://datacarpentry.org/R-ecology-lesson/

Suggestion: Add exception to the "Set Preferences to ‘Never’ save workspace in RStudio." recommendation #720

Closed cwatt closed 3 years ago

cwatt commented 3 years ago

Hi maintainers,

In the "Before We Start", "Getting Set Up" section, I agree that we should recommend that learners turn off the automatic save-workspace preference. But should we also give a brief disclaimer explaining that there are some cases where saving the workspace or critical intermediate files makes sense, e.g., having to pause work in the middle of analyzing a dataset with long computing times and/or many intermediate steps? In my line of work, at least, this comes up fairly frequently, and it would be very frustrating to have to start over from the beginning each time I work on the analysis!

Thanks for the great lessons and all your hard work! Cassi

Teebusch commented 3 years ago

Hi @cwatt, thank you for your suggestion!

I think we should be teaching best practices as much as possible. In my opinion, saving the workspace -- as convenient as it may be -- isn't the best solution to the problem you are describing and, as the text points out, can cause some hard-to-debug errors. There are better (i.e., more explicit, less error-prone, more reproducible) ways to store the results of long-running, expensive computations. One option is to store and retrieve intermediate results yourself using, for example, write_rds() and read_rds(). Another is to use a package like {targets} to keep track of intermediate results for you.
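To make the first option concrete, here is a minimal sketch of caching one expensive result by hand. It uses base R's saveRDS() and readRDS(), which readr's write_rds() and read_rds() wrap, so it runs without extra packages. The helper `cached()`, the stand-in computation `slow_fit()`, and the file location are all hypothetical names chosen for illustration, not something from the lesson.

```r
# Hypothetical helper: return the cached result if the .rds file exists,
# otherwise run the expensive step once and store its result.
cached <- function(path, compute) {
  if (file.exists(path)) {
    return(readRDS(path))  # reuse the stored result
  }
  result <- compute()      # run the expensive step
  saveRDS(result, path)    # store it explicitly for next time
  result
}

# Stand-in for a long-running computation (assumption for this example).
slow_fit <- function() {
  Sys.sleep(0.1)
  lm(mpg ~ wt, data = mtcars)
}

# A temporary path keeps the example self-contained; in a real project
# this would be a file inside the project directory.
cache_file <- tempfile(fileext = ".rds")

fit  <- cached(cache_file, slow_fit)  # first call: computes and saves
fit2 <- cached(cache_file, slow_fit)  # second call: loads from disk
```

Unlike a saved workspace, only the objects you chose to write end up on disk, and the script itself shows where each one comes from. Deleting the .rds file forces the step to be recomputed.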

Anyways, perhaps we can meet halfway, and say "it's usually a good idea to turn off saving of the workspace"?

Current:

By default, all of these objects will be saved, and automatically loaded, when you reopen your project. Saving a workspace to .RData can be cumbersome, especially if you are working with larger datasets, and it can lead to hard to debug errors by having objects in memory you forgot you had. To turn that off, go to Tools –> ‘Global Options’ and select the ‘Never’ option for ‘Save workspace to .RData’ on exit.’

Suggested update:

By default, all of these objects will be saved, and automatically loaded, when you reopen your project. Saving a workspace to .RData can be cumbersome, especially if you are working with larger datasets, and it can lead to hard to debug errors by having objects in memory you forgot you had. Therefore, it's often a good idea to turn this off. To do so, go to Tools –> ‘Global Options’ and select the ‘Never’ option for ‘Save workspace to .RData’ on exit.’