Open ChrisDenton opened 3 months ago
For the status quo you wrote
the current application's parent directory
but then for the proposal this becomes
the application's directory
That sounds like it's not the same thing? However, you're not discussing changing that part of the behavior either, so maybe this is meant to refer to the same thing in both cases? I am confused.
GitHub actions has (WSL) bash.exe in C:\Windows\System32 (the system directory). However, some people want to use (git) bash.exe in C:\Program Files\Git\bin. Currently there is no way to override it (unless you know the env hack, which isn't great).
Oh, that may explain some extremely painful and confusing behavior I experienced on GHA recently. Having the system dirs overwrite PATH is certainly surprising and seems to be inconsistent both with Unix behavior and with the normal Windows behavior, so this is definitely something that should be changed.
There are competing demands in fixing this to be more predictable. Some people want this to be as close to the "normal" Windows behaviour as possible, others strongly favour cross-platform consistency.
It would be good to discuss where on the spectrum of "work like Unix" vs "work like normal Windows programs" this proposal lies.
That sounds like it's not the same thing? However, you're not discussing changing that part of the behavior either, so maybe this is meant to refer to the same thing in both cases? I am confused.
I've reworded to hopefully avoid confusion.
It would be good to discuss where on the spectrum of "work like Unix" vs "work like normal Windows programs" this proposal lies.
I've done a small edit to clarify that comment a bit and I'll attempt to expand upon the point later when I have time to write something more succinct. But here's the long version:
PATH
(and only PATH
) for the executable to load. PATH
to find executables. This is effectively the same as Linux, except that cmd by default will also run applications in the current directory (though this is configurable and powershell does not do this).CreateProcess
has its own way of finding a executable if it's not given explicitly (scroll down that page to "1. The directory from..."), Most languages use this because they're relatively thin wrappers around this API.I'll repost the CreateProcess
behaviour here. To be clear the current directory and environment variables are that of the parent process:
The Unix behaviour is:
If Command::env
is not used then Rust currently uses a modified version of the CreateProcess
behaviour. It removes 2.
for security reasons and it also remove the 16-bit system directory. So it's:
If Command::env
is used then it tries to mimic the Unix behaviour then falls back to the above if the application is not found:
The suggested change simplifies this to always doing this:
Due to unifying the two behaviours, this leads to being slightly less Unix-like in the case where Command::env
is used but slightly more Unix-like otherwise. To be honest, I would also prefer to pick only one out of 2.
and 3.
but I suspect that'd break someone.
Thanks for the clarifications!
The directory from which the application loaded.
Is that what you called "the parent process' directory" above? I thought "the parent process' directory" referred to the current working directory, but "from which the application loaded" sounds more like the directory containing current_exe()
?
The directories that are listed in the PATH environment variable.
Oh wow, so this is last normally on Windows? Seems like Windows users could be quite surprised then about Rust giving PATH a lot higher priority. Or maybe it's less "higher priority" and more "skipping the system directories".
directory from which the application loaded
This is the directory in which the parent process executable resides, correct? Not the current working directory.
I think the simplification you recommend sounds good. I presume we're allowed to deduplicate (2) and (3) when applicable?
Thanks for the clarifications!
The directory from which the application loaded.
Is that what you called "the parent process' directory" above? I thought "the parent process' directory" referred to the current working directory, but "from which the application loaded" sounds more like the directory containing
current_exe()
?
Yes, that's right. By "current directory" I meant the current working directory and by "application's directory" I meant the directory containing the application as given by current_exe()
,
The directories that are listed in the PATH environment variable.
Oh wow, so this is last normally on Windows? Seems like Windows users could be quite surprised then about Rust giving PATH a lot higher priority. Or maybe it's less "higher priority" and more "skipping the system directories".
The system directories are in the PATH
and usually before the user's paths. So from a practical perspective it makes little difference unless the user has modified their PATH to remove them.
And as noted, shells don't do that so there's already inconsistency.
directory from which the application loaded
This is the directory in which the parent process executable resides, correct? Not the current working directory.
I think the simplification you recommend sounds good. I presume we're allowed to deduplicate (2) and (3) when applicable?
Sure but that's a libs optimization, rather than a libs-api question.
And as noted, shells don't do that so there's already inconsistency.
Yeah seems to be quite the mess.
From a Unix perspective, the most surprising part to me is "The directory from which the application loaded". I guess that's just a sufficiently common pattern on Windows that we have to support it?
Yes, I believe so. Windows applications are generally packaged in their own directory rather than being in a soup of other applications. So, at least traditionally, giving high priority and trust to the application's load directory is expected.
Proposal
Problem statement
On Windows, the
Command
program search uses slightly different rules depending on if the environment is inherited or not. To cut a long store short, this is for historical reasons. Way back in pre-1.0 times the intent was to follow Linux search rules. However, after several refactorings this became super buggy and only really worked ifCommand::env
was used. In other cases it (unintentionally) fell back to the standard CreateProcess rules.Awhile back it was decided to remove the current directory from the Windows
Command
search path. Which I did. At the time I was a bit worried it would affect people. But as it turned out that didn't appear to have much of an impact on anyone. Or at least I've not heard of anyone having issues.I did however preserve some of the buggy
env
behaviour because, I was worried about making too many changes at once.. However, I do think it needs fixing somehowMotivating examples or use cases
Assuming that there is an app called
hello.exe
alongside the current application and also a differenthello.exe
inPATH
:Solution sketch
To be clear, currently an exe is searched for in:
PATH
but only if the child environment is not inherited.PATH
Obviously this leads to some inconsistencies depending on whether
Command::env
is used or not.There are competing demands in fixing this to be more predictable. Some people want this to be as close to the "normal"
CreateProcess
behaviour as possible (as is common in other Windows languages), others strongly favour cross-platform consistency. We may also need to be aware of people unintentionally relying on the existing behaviour (even if nobody actually likes it).Trying to keep everyone happy is difficult, if not impossible but I think we can do better than we currently are. With that in mind, I would like to simplify the search order to:
PATH
, if set.PATH
I've moved the parent process' directory first to be more consistent. This would be the behaviour generally expected on Windows and it's essentially a dynamic path so can't be in
PATH
. I've also removed the system paths as these will be early in thePATH
by default and if not then that is the user's choice. We don't generally guard against even an insecurePATH
(e.g. adding the current directory toPATH
is totally fine). Also note that Powershell does not search the system paths beforePATH
.An example of when hardcoding the system path is frustrating is
bash.exe
. GitHub actions has (WSL)bash.exe
inC:\Windows\System32
(the system directory). However, some people want to use (git)bash.exe
inC:\Program Files\Git\bin
. Currently there is no way to override it (unless you know theenv
hack, which isn't great).I've left in potentially searching
PATH
twice, albeit modified so that only if child'sPATH
is explicitly set will it be searched. While I'd love to pick just onePATH
or t'other, I am worried about compatibility.Alternatives
PATH
is only searched if actually set (not just any random env var)PATH
(the child's or the parent's... or both)Links and related work
What happens now?
This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.
Possible responses
The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):
Second, if there's a concrete solution: