STAT545-UBC / Discussion

Public discussion

A question on working directories #317

Open sagbayani opened 8 years ago

sagbayani commented 8 years ago

Hi, I have a question on working directories...

If we are placing all our [multiple] R projects in one folder (e.g. D:\STAT545), assuming each R project is in a folder of its own...

Can/should we change the working directory for R to D:\STAT545, or should it be D:\STAT545\projectfolder? That is, should we change the wd with every project we work on, or should it remain the same throughout?

I notice that the default wd is C:\Users\username\documents (Windows). What if I've been working under that default directory, but saving all my projects under a different directory? Is that a good/bad scenario? What happens if I change the wd halfway through? Will changing it "break" anything?

Thanks in advance, GitHub!

oganm commented 8 years ago

By default the working directory is the project directory. It is only natural to keep it that way. The only rationale to do otherwise is if you need files from other projects, but then are you sure they should be independent projects? It is simply more cumbersome to keep a constant working directory, as you have to either set it manually for every project or have a .Rprofile file do it for you.

Also remember that these will be different GitHub repositories. When we are grading your work, or when anyone clones your repos for any reason, we/they won't have your parent folder. So it is best to stay in the project directory for reproducibility purposes. If you need to grab a file from another repo, add it to the current one instead of accessing a different project.

sagbayani commented 8 years ago

Thanks, @oganm
So to clarify, if the wd is not my current project directory, then I should setwd() to the current project directory every time?

samhinshaw commented 8 years ago

Yes, though the default behavior when opening a project is to switch to the directory that the .Rproj file lives in (aka the "project root directory"). If you are NOT working in a project, then you can change the default directory that RStudio uses via Tools -> Global Options.
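If you want to confirm where R currently is, a quick check along these lines may help. This is only a sketch using base R; `has_rproj` is a hypothetical helper name, not something RStudio provides.

```r
# Returns TRUE if `dir` directly contains at least one .Rproj file.
# `has_rproj` is a hypothetical helper, not part of base R or RStudio.
has_rproj <- function(dir = getwd()) {
  length(list.files(dir, pattern = "\\.Rproj$")) > 0
}

getwd()      # prints the current working directory
has_rproj()  # TRUE when the working directory is a project root
```

If `has_rproj()` comes back FALSE while you think you are in a project, you have probably opened a script on its own rather than launching the project.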

The main idea is to not have multiple .Rproj files in the same folder, or nested .Rproj files. The same goes for repositories! However, it is fine to have subfolders within one repository/project.

For example, you could have something like...
D:\STAT545\stat545.Rproj
D:\STAT545\homework01\
D:\STAT545\homework02\
D:\STAT545\.git\

But NOT:
D:\STAT545\stat545.Rproj
D:\STAT545\.git\
D:\STAT545\homework01\
D:\STAT545\homework01\.git\
D:\STAT545\homework01\homework01.Rproj
D:\STAT545\homework02\
D:\STAT545\homework02\.git\
D:\STAT545\homework02\homework02.Rproj
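One rough way to spot the second layout is to search a project root for .Rproj files. The sketch below uses only base R; `find_rproj` and `has_multiple_projects` are hypothetical helper names, and the check is meant to be run on a single project's root directory.

```r
# List every .Rproj file at or below `root`, as paths relative to `root`.
# Both helpers are hypothetical names used for illustration.
find_rproj <- function(root = ".") {
  list.files(root, pattern = "\\.Rproj$", recursive = TRUE)
}

# More than one hit means multiple or nested projects live under `root`.
has_multiple_projects <- function(root = ".") {
  length(find_rproj(root)) > 1
}
```

Calling `find_rproj()` from the project root should list exactly one file in the first layout above, and three in the second.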

jennybc commented 8 years ago

I highly recommend always working in an RStudio Project.

End of story. When you work on project A, launch RStudio Project A. The working directory will be set to the correct directory. And don't mess with it.

You should almost never use setwd(). It makes your work unportable, because no one else will have C:\Users\johndoe\documents\eggplant\banana or whatever. Seeing setwd() at the top of a script always fills me with a bit of dread.
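On the earlier question of whether changing the wd halfway through can "break" anything: relative paths are resolved against whatever the working directory is at that moment, so the same path can stop working. Here is a small sketch in a temporary directory; every name in it (project_a, wd_demo.csv) is made up for the demonstration.

```r
# Set up a throwaway "project" containing one data file.
# All names here (project_a, wd_demo.csv) are hypothetical.
proj <- file.path(tempdir(), "project_a")
dir.create(proj, showWarnings = FALSE)
writeLines(c("x,y", "1,2"), file.path(proj, "wd_demo.csv"))

old_wd <- setwd(proj)                  # setwd() returns the previous directory
found_in_project <- file.exists("wd_demo.csv")

setwd(tempdir())                       # change the wd "halfway through"
found_after_change <- file.exists("wd_demo.csv")

setwd(old_wd)                          # restore the original working directory
```

The file never moved, but the relative path "wd_demo.csv" only resolves while the working directory is the project folder.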

Build paths to the files you need relative to the top-level directory of the project. I recommend just letting a bunch of files live there together for a while, before complicating matters.
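As a minimal sketch of what that can look like with nothing but base R: the file and folder names below ("gapminder.tsv", "data") are hypothetical, and file.path() is just one simple way to assemble a path.

```r
# While everything lives in the top-level directory, the file name alone
# is already a path relative to the project root.
# "gapminder.tsv" and "data" are hypothetical names for illustration.
flat_path <- "gapminder.tsv"

# If the file later moves into a sub-directory, build the path from its
# pieces rather than pasting strings together by hand.
nested_path <- file.path("data", "gapminder.tsv")

# Only read the file if it exists relative to the current working directory.
if (file.exists(nested_path)) {
  gap <- read.delim(nested_path)
}
```

Neither path mentions a drive letter or a user name, so the script keeps working when someone else clones the repo and opens the project.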

Yes, eventually, projects will need sub-directory structure. I can talk about path-building strategies in class sometime. In the meantime, I highly recommend these two packages for building paths in a robust way:

sagbayani commented 8 years ago

Sounds good - thanks for clarifying!