bnosac / taskscheduleR

Schedule R scripts/processes with the Windows task scheduler.
331 stars 72 forks source link

Support path expansion for rscript argument #81

Closed yogat3ch closed 1 year ago

jwijffels commented 3 years ago

On which windows machine did you install R at folder //~

yogat3ch commented 1 year ago

@jwijffels Those slashes are escape characters because it's a regex. That regex establishes if the file path starts with a leading tilda (R's way of abbreviating the R root directory), and if so expands it because the terminal command requires the full path.

Sorry for the massive delay on this, just now going through all old open PRs

jwijffels commented 1 year ago

I'm getting, what are you getting?

> file.path(Sys.getenv("R_HOME"), "bin", "Rscript.exe")
[1] "C:/PROGRA~1/R/R-41~1.2/bin/Rscript.exe"
> path.expand(file.path(Sys.getenv("R_HOME"), "bin", "Rscript.exe"))
[1] "C:/PROGRA~1/R/R-41~1.2/bin/Rscript.exe"

I never saw on Windows a path with ~

yogat3ch commented 1 year ago

Hi @jwijffels, If you had a script in your RHOME directory and you wanted to refer to it in R, you can specify a path like this "~/myfolder". It's common practice for people who use R to use this tilda abbreviated form of a file path. If this rscript argument character string "~/myfolder" is passed along verbatim to the system command, it will return strange errors because it doesn't like the syntax of the file path.

To accommodate this typical formatting of file paths in R, we use path.expand to fully expand the path and avoid this issue.

You can use this in place of the warning asking the user to provide a full file path, as the path.expand argument will expand it automatically.

However, this doesn't account for the user supplying a relative path to the rscript. If you wanted to ensure that the path is always an absolute path from the root directory, you could use fs::path_abs(path.expand(rscript))

For a reprex to understand what I'm talking about simply run the following: path.expand("~/my_file.R")

Or open R in a project directory and run: fs::path_abs("my_file.R")

This provides better handling of file paths that most R users are used to having done under the hood in the tidyverse. Rarely is a user asked to provide a full absolute path, as that can, and typically should be handled inside the function.

Does that help to clarify?

jwijffels commented 1 year ago

The windows taskscheduler requires a full path, not a relative path, I think it is better to warn the user upfront on that instead of hiding the details of paths otherwise they are surprised it doesn't work if they change their paths / directories or they might expect R package taskscheduleR to do path/folder changes behind the scenes (which it doesn't). I'm very reluctant to incorporate this for that reason.

yogat3ch commented 1 year ago

@jwijffels This is the only time I've ever seen a package that did not properly handle tilda expansion or relative paths properly. There's also syntactical nuances for file paths with system commands depending on the particular machine that make it easy to incur human error. The way it currently is seems like an exception and requires R users have knowledge about their particular system commands file path formatting syntax. I think this isn't common knowledge amongst most users working with R and creates a unnecessary barrier to entry for people to use this package.

What about adding to the docs for that argument:

path is expanded with \link[base]{path.expand} and \link[fs]{path_abs}
jwijffels commented 1 year ago

There's also syntactical nuances for file paths with system commands depending on the particular machine that make it easy to incur human error.

What exactly are you referring to? Can you give examples?

R users have knowledge about their particular system commands file path formatting syntax

Yes that is exactly the idea, that they understand they are working on a fixed filesystem instead of trying to hide that.

yogat3ch commented 1 year ago

@jwijffels Check out .Platform$file.sep - it's a variable set by R on installation that indicates what the separator is in file paths for that machine. By letting R handle the file paths, the file separator is accurate.

If you have people working on a virtual develop environment, they may be working on a machine with an OS they've never used before. Forcing them to include the full path to the file creates a barrier to entry whereby they'd have to know about .Platform$file.sep and know how to figure out where their project is located on the host machine (if you're not going to expand relative paths).

I'm not sure why it's so difficult to understand that not everyone has the same familiarity with their machine as you seem to? It's currently pedantic and not very user-friendly.

jwijffels commented 1 year ago

Check out .Platform$file.sep - it's a variable set by R on installation that indicates what the separator is in file paths for that machine. By letting R handle the file paths, the file separator is accurate.

I really don't understand how this has anything to do with this pull requests which involves path.expand. path.expand as in https://github.com/wch/r-source/blob/35cc16f2d7a0f499776a7c051ef8a82b59168a1f/src/gnuwin32/sys-win32.c#L68-L129 does not do anything if you don't use the .Platform$file.sep in path.expand .

> path.expand("~/subfolder")
[1] "C:/Users/jwijf/OneDrive/Documents/subfolder"
> path.expand("~subfolder")
[1] "~subfolder

So that means you know the separator should be / and not \ unless you would type path.expand(sprintf("~%ssubfolder", .Platform$file.sep)).

with an OS they've never used before

This is a Windows-only package.

I'm not sure why it's so difficult to understand that not everyone has the same familiarity with their machine as you seem to? It's currently pedantic and not very user-friendly.

I understand not everyone has the same familiarity with the file systems, nor do I if I encounter a new virtual machine. The point is that the package owner wants the user to understand that he/she is working on a file system and does not want to maintain (behind the scenes) his/her file system or drives or directories nor the full paths to it if someone changes these either on a VM with a shared drives or not or if he/she decides to put the file on a path with non-ascii UTF-8 characters or a too long path for the system of when they encounter the file system is different if they run the script with a different user. As said the main point is:

The windows taskscheduler requires a full path, not a relative path, I think it is better to warn the user upfront on that instead of hiding the details of paths otherwise they are surprised it doesn't work if they change their paths / directories or they might expect R package taskscheduleR to do path/folder changes behind the scenes (which it doesn't). The package owner does not want to hide these details from the user

Nothing stops you from using file.path, system.file, path.expand, fs::path_abs before using taskscheduler_create. If you want to create a function which works with all possible paths you encounter in the wild based on the directory they are working in and which works the way you describe, feel free to create one, call it taskscheduler_rscript similar ascronR::cron_rscript such that it can be passed on to taskschedule_create and make a pull request. I'd be happy to include that.