r-lib / here

A simpler way to find your files
https://here.r-lib.org/

Non-R-dominant folder setup #85

Open sda030 opened 2 years ago

sda030 commented 2 years ago

We R users like to think that our RStudio Project file is in the top folder, and everything else is neatly organized in subfolders. In many research organizations and teams, not everyone uses R (nor is everything in a project about quantitative analysis). Hence, my Rproj files are often pushed into a subfolder called "R-analysis" or something. Data folders might be located further up the directory structure. I therefore do not see how here can be of use for me and my colleagues (although I want it to be). Am I missing obvious solutions, or is this a feature that should be added?

krlmlr commented 2 years ago

In a structure where your R project sits parallel to a "data" directory, can you use here("../data") ? Do we need to mention this in our documentation?
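
For illustration, a minimal base-R sketch of that layout. file.path() stands in for here() so the snippet runs without the package, and the directory names (team-project, R-analysis) are made up for the example:

```r
# Hypothetical layout: <top>/R-analysis holds the .Rproj file and
# <top>/data sits next to it.
top <- file.path(tempdir(), "team-project")
project_root <- file.path(top, "R-analysis")
dir.create(project_root, recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(top, "data"), showWarnings = FALSE)

# Stand-in for here("..", "data"): join the project root with a
# relative path that steps one level up.
data_dir <- file.path(project_root, "..", "data")

dir.exists(data_dir)                      # the ".." is resolved by the OS
normalizePath(data_dir, winslash = "/")   # same directory, without the ".."
```

The path keeps the ".." component until it is normalized, which is harmless for reading and writing files but can look odd in logs.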

D3SL commented 9 months ago

The problem isn't documentation, it's #19, #21, #27, and #28 again.

In cases where the true root is higher or lower than the .Rproj file, you're basically back to the old paradigm of inputting full directories. In cases where the directory of the .Rproj file won't exist, such as Quarto/Markdown files, Jupyter notebooks, Shiny apps, and dockerized projects, here can't be used at all because the calls to here() will be different in development and release.

Ironically, the package has come full circle and is now even worse than the original problem. From the documentation:

"In contrast to using setwd(), which is fragile and dependent on the way you organize your files, here uses the top-level directory of a project to easily build paths to files."

setwd() is dependent on the way you organize your files, but it works seamlessly with any project structure. sda030 can have the .rproj file in their subdirectory and set their working directory to the true root one or more levels higher. Anyone dockerizing a shiny app can have the .rproj file in the docker container's build directory and set their working directory to the root folder of the shiny app itself, one level down. setwd() will break any time the directory structure changes, but it can be repaired to work with any new structure.

here() and rprojroot are also dependent on the way you organize your files, but unlike setwd() they will only work with one single project structure.

In fact, your own example project (penguins.R) proves how the use of here is now strictly worse. Opening demo-project.Rproj will set your working directory to that file's location, meaning the call to readr::write_csv(palmerpenguins::penguins, here("data/penguins.csv")) could be written as readr::write_csv(palmerpenguins::penguins, "data/penguins.csv") without any difference in outcome.

So for the original user, here doesn't provide any benefit, and if that report is deployed and someone downloads it to a slightly different directory structure, then here() will break permanently:

 /home/r_reports
 ├── r_reports.Rproj
 └── penguin_report
     ├── analysis
     │   └── report.Rmd
     ├── data
     │   └── penguins.csv
     └── prepare
         └── penguins.R

In this case, here will always latch on to the r_reports.Rproj file no matter what the new user does. The only way to fix this is to rewrite every call to here() or change their entire directory structure. On the other hand, if the .here file were authoritative, as people have been requesting for 7 years, all anyone would need to do is put a .here file in the root directory of penguin_report and it would seamlessly work everywhere.

krlmlr commented 9 months ago

Thank you for your perspective. It's difficult for me to see what you're requesting, and also the relationship to the original post.

What do you mean by "if the .here file were authoritative"? The exact criteria for a directory being a project root are listed in https://here.r-lib.org/reference/here.html#project-root , and placing a .here file into the penguin_report directory in your last example should also work. What am I missing?

D3SL commented 9 months ago

"It's difficult for me to see what you're requesting"

A .here file should override everything else. A .here file must be manually created by the user, so whenever one exists it means that location was deliberately chosen as the code's root folder.
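
To make the request concrete, here is a rough base-R sketch of what ".here overrides everything else" could mean when the desired root is above the .Rproj file. This is not how here or rprojroot currently behave; find_root_here_first() is a hypothetical helper written only for this illustration, and it does not cover the case where the .here file sits in a subdirectory below the starting point:

```r
# Hypothetical helper, not part of here or rprojroot.
find_root_here_first <- function(start = getwd()) {
  # Collect `start` and all of its ancestors, nearest first.
  dirs <- normalizePath(start, winslash = "/", mustWork = TRUE)
  repeat {
    parent <- dirname(dirs[length(dirs)])
    if (parent == dirs[length(dirs)]) break
    dirs <- c(dirs, parent)
  }

  # Pass 1: a manually placed .here file wins, however far up it is.
  has_here <- file.exists(file.path(dirs, ".here"))
  if (any(has_here)) return(dirs[which(has_here)[1]])

  # Pass 2: otherwise fall back to the nearest directory with an .Rproj file.
  has_rproj <- vapply(
    dirs,
    function(d) length(list.files(d, pattern = "[.]Rproj$", ignore.case = TRUE)) > 0,
    logical(1)
  )
  if (any(has_rproj)) return(dirs[which(has_rproj)[1]])

  stop("No .here or .Rproj file found at or above ", start)
}

# Made-up layout: .here at the true root, .Rproj one level down.
foo <- file.path(tempdir(), "foo")
code_dir <- file.path(foo, "penguin_r_code")
dir.create(code_dir, recursive = TRUE, showWarnings = FALSE)
file.create(file.path(foo, ".here"))
file.create(file.path(code_dir, "penguins.Rproj"))

root <- find_root_here_first(code_dir)
basename(root)
```

Started from penguin_r_code, the sketch resolves to foo because the .here pass runs before any .Rproj file is considered.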

"the relationship to the original post."

As sda030 said: "...my Rproj files are often pushed into a subfolder called "R-analysis" or something. data folders might be located further up the directory structure."

Currently, here will lock on to the .Rproj file and force that to be the root for here() no matter what. They cannot place a .here file at the true project root and have that override the .Rproj file.

"placing a .here file into the penguin_report directory in your last example should also work. What am I missing?"

That's the issue: if this is supposed to work, it doesn't. At least not reliably. The presence of r_reports.Rproj will lead to here() choosing /home/r_reports as root regardless of whether a user creates a .here file anywhere else. That's why since 2017 there have been so many repeated requests for .here files to work as the documentation would suggest, but as it stands, .here files get ignored in favor of the other heuristics.

sda030 and I are two sides of the same issue. In their case the true project root is one or more levels higher than the .Rproj file; in mine it's lower.

 /foo/
 ├── .here                  // here() will ignore this
 ├── data/
 │   └── penguins.csv
 └── penguin_r_code/
     ├── analysis/
     │   └── report.Rmd
     ├── penguins.rproj     // here() will pick this
     └── prepare/
         └── penguins.R

 /foo/bar_project
 ├── container_project.Rproj   // here() will pick this
 ├── dockerfile
 ├── .git
 └── actual_shiny_app
     ├── data
     │   └── penguins.csv
     ├── global.r
     ├── server.r
     ├── ui.r
     └── .here                 // here() will ignore this

The second example also presents a showstopping issue: when the container is built, actual_shiny_app/ gets copied into the container, but not /foo/bar_project/. So in development the code must be written as here("actual_shiny_app","data","penguins.csv"), but in deployment all calls to here() need to be changed to here("data","penguins.csv").

Any time code is shared between people or platforms with different .Rproj locations, every call to here() will be completely broken, and can't be repaired any way other than rewriting the code or restructuring the entire project on disk.

Based on your response that placing a .here file should work, I'm wondering if this is all an extremely long-standing glitch.

krlmlr commented 9 months ago

Thanks. I'm seeing:

fs::dir_create("proj")
brio::write_file("", "proj/test.Rproj")

fs::dir_create("proj/here")
brio::write_file("", "proj/here/.here")

fs::dir_tree("proj", all = TRUE)
#> proj
#> ├── here
#> │   └── .here
#> └── test.Rproj

setwd("proj/here")
fs::dir_tree(all = TRUE)
#> .
#> └── .here
here::here()
#> [1] "/private/var/folders/dj/yhk9rkx97wn_ykqtnmk18xvc0000gn/T/RtmpGQICa8/reprex-e5cf6becf7a7-fishy-coral/proj/here"
here::dr_here()
#> here() starts at /private/var/folders/dj/yhk9rkx97wn_ykqtnmk18xvc0000gn/T/RtmpGQICa8/reprex-e5cf6becf7a7-fishy-coral/proj/here.
#> - This directory contains a file ".here"
#> - Initial working directory: /private/var/folders/dj/yhk9rkx97wn_ykqtnmk18xvc0000gn/T/RtmpGQICa8/reprex-e5cf6becf7a7-fishy-coral/proj/here
#> - Current working directory: /private/var/folders/dj/yhk9rkx97wn_ykqtnmk18xvc0000gn/T/RtmpGQICa8/reprex-e5cf6becf7a7-fishy-coral/proj/here

setwd("..")
fs::dir_tree(all = TRUE)
#> .
#> ├── here
#> │   └── .here
#> └── test.Rproj
here::here()
#> [1] "/private/var/folders/dj/yhk9rkx97wn_ykqtnmk18xvc0000gn/T/RtmpGQICa8/reprex-e5cf6becf7a7-fishy-coral/proj/here"
here::dr_here()
#> here() starts at /private/var/folders/dj/yhk9rkx97wn_ykqtnmk18xvc0000gn/T/RtmpGQICa8/reprex-e5cf6becf7a7-fishy-coral/proj/here.
#> - This directory contains a file ".here"
#> - Initial working directory: /private/var/folders/dj/yhk9rkx97wn_ykqtnmk18xvc0000gn/T/RtmpGQICa8/reprex-e5cf6becf7a7-fishy-coral/proj/here
#> - Current working directory: /private/var/folders/dj/yhk9rkx97wn_ykqtnmk18xvc0000gn/T/RtmpGQICa8/reprex-e5cf6becf7a7-fishy-coral/proj

Created on 2024-01-17 with reprex v2.0.2

One thing to keep in mind is that {here} won't change its opinion if the working directory is changed after the package is loaded. This is intended, and (I believe) has been discussed before.

Can you please construct a similar example for your use case?

D3SL commented 8 months ago

This is a standard dockerized Shiny app. Loading the R Project, in order to have access to version control that includes the Dockerfile, sets the working directory to parent_dir.

parent_dir       //this is the working directory in development
├── .git
├── .Rproj.user
├── Dockerfile
├── shiny_app.Rproj
├── Rprofile.site
└── shiny_app     //only this directory is copied into the app on deployment
    ├── data
    ├── functions
    ├── global.r
    ├── modules
    ├── server.R
    └── ui.R

And this is the diagnostic report:

> dr_here()
here() starts at C:/foo/bar/baz/parent_dir.
- This directory contains a file matching "[.]Rproj$" with contents matching "^Version: " in the first line
- Initial working directory: C:/foo/bar/baz/parent_dir
- Current working directory: C:/foo/bar/baz/parent_dir

As you demonstrated, the only way to make here work when the desired root is not the directory containing the .Rproj file is to use setwd(). The only time it works without relying on setwd() is when the desired root is already the highest-level folder and contains the .Rproj file.

In the former case, here is moot because you're already relying on setwd(), and in the latter it's redundant because loading the .Rproj file will change your working directory automatically. In either case there's no reason to use here. It will either do nothing, or be strictly worse by requiring every call to here() be changed rather than a single call to setwd().

However, if the package were changed so the .here file overrides everything else, it would continue to work seamlessly for anyone relying on .Rproj files while also actually behaving as described: code could be kept identical between development and deployment, across platforms, in mixed-language teams, without anyone needing to rely on fragile conditional calls to setwd().

The difference is one of design philosophy: whether a package is intended to be pro-user and enable open collaboration, or anti-user and lock people in to proprietary practices. Like how Microsoft Office breaks standards and locks people in to their proprietary formats while LibreOffice doesn't.

krlmlr commented 8 months ago

The here package doesn't descend into subdirectories; I don't see how your use case, as I understand it, might work with here().

However, it might work with rprojroot, the package that powers the here package:

fs::dir_create("parent_dir")
brio::write_file("", "parent_dir/test.Rproj")

fs::dir_create("parent_dir/shiny_app")
brio::write_file("", "parent_dir/shiny_app/server.R")
brio::write_file("", "parent_dir/shiny_app/ui.R")

fs::dir_tree("parent_dir", all = TRUE)
#> parent_dir
#> ├── shiny_app
#> │   ├── server.R
#> │   └── ui.R
#> └── test.Rproj

crit <- rprojroot::has_basename("shiny_app", subdir = "shiny_app")
here <- crit$find_file

setwd("parent_dir")
fs::dir_tree(all = TRUE)
#> .
#> ├── shiny_app
#> │   ├── server.R
#> │   └── ui.R
#> └── test.Rproj
here("ui.R")
#> [1] "/private/var/folders/dj/yhk9rkx97wn_ykqtnmk18xvc0000gn/T/Rtmp82ViI8/reprex-106563780a467-smoky-moose/parent_dir/shiny_app/ui.R"

setwd("shiny_app")
fs::dir_tree(all = TRUE)
#> .
#> ├── server.R
#> └── ui.R
here("ui.R")
#> [1] "/private/var/folders/dj/yhk9rkx97wn_ykqtnmk18xvc0000gn/T/Rtmp82ViI8/reprex-106563780a467-smoky-moose/parent_dir/shiny_app/ui.R"

Created on 2024-01-22 with reprex v2.0.2

D3SL commented 8 months ago

"However, it might work with rprojroot, the package that powers the here package:"

My understanding from the documentation was that rprojroot operates the same way, but with less syntactic sugar for the end user. If there is a qualitative difference in functionality, that could be the solution for a lot of us who've been looking for something else from here.

Going by your example: if rprojroot is called and set in a global.r inside /parent_dir/shiny_app/, will it work both in development, where parent_dir and test.Rproj exist, and in deployment, where the parent directory is dropped and /shiny_app is at root?

krlmlr commented 8 months ago

Have you seen https://here.r-lib.org/articles/here.html#under-the-hood-rprojroot?

I believe that the reprex I shared is setting up a project structure similar to the one that you describe. At this stage, I'd say it's worth trying out?
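
One way to reason about the development versus deployment question without installing anything is a base-R imitation of the idea in the reprex: treat the nearest directory named shiny_app as the root, and start the search inside that subdirectory when it exists. find_app_root() below is a hypothetical helper for illustration only, not rprojroot's implementation, and the deployed location is a made-up stand-in for a container:

```r
# Hypothetical helper that imitates the "directory name" criterion.
find_app_root <- function(start = getwd(), name = "shiny_app") {
  dir <- normalizePath(start, winslash = "/", mustWork = TRUE)
  # Development case: step into the app subdirectory when it is present.
  if (dir.exists(file.path(dir, name))) dir <- file.path(dir, name)
  # Walk up until a directory with the expected name is found.
  repeat {
    if (basename(dir) == name) return(dir)
    parent <- dirname(dir)
    if (parent == dir) stop("No directory named ", name, " found from ", start)
    dir <- parent
  }
}

# Development layout: parent_dir/shiny_app/data
dev <- file.path(tempdir(), "parent_dir")
dir.create(file.path(dev, "shiny_app", "data"), recursive = TRUE, showWarnings = FALSE)

# Stand-in for deployment: shiny_app exists without parent_dir.
deployed <- file.path(tempdir(), "deployed", "shiny_app")
dir.create(file.path(deployed, "data"), recursive = TRUE, showWarnings = FALSE)

dev_root <- find_app_root(dev)                                   # from parent_dir
sub_root <- find_app_root(file.path(dev, "shiny_app", "data"))   # from below the app
deployed_root <- find_app_root(deployed)                         # parent_dir absent

# The same relative path is used in both environments.
file.path(dev_root, "data", "penguins.csv")
file.path(deployed_root, "data", "penguins.csv")
```

In this sketch the calling code never mentions parent_dir, so the path construction stays identical in both layouts. Whether rprojroot behaves the same way in a real container is still worth checking directly.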