Open jonsequitur opened 2 years ago
Copied from @ptr727's comment on #1781:
Problem with the string[] args is that it breaks when using e.g.
"D:\"
, where the " was treated as an embedded quote, not a terminator quote.Also, the vanilla args[] does not (did not) support unicode or UTF8 characters (critical when piping between processes), thus one was forced to use the W/unicode API to get the original commandline, then parse by hand.
With the introduction of .NET Core, I asked, and the team did not want to change the behavior from how WIN32 in C++ works, even if broken.
The McMaster, gsscoder, and can't remember the other CLI library I used before did not want to include the "correct" parser as I referenced on stackoverflow, thus myself and others just created the manual parsing code, and ignored the broken args[] behavior. Found one: https://github.com/gsscoder/commandline/issues/473
Now, things may have changed, maybe .NET Core now correctly parses in args[], but that may be something you could verify, I've given up, or more I can't be bothered to try to get a team to fix something I think is/was broken.
And yep, https://github.com/dotnet/command-line-api/issues/1758 sorta describes the issues with not correctly parsing. See: https://github.com/ptr727/Utilities/blob/main/UtilitiesTests/CommandLineTests.cs Also the stackoverflow article has grown to be very complete.
I hope great @mklement0 review https://github.com/PowerShell/PowerShell/issues/15143 and his module helps to resolve the issue.
This won't work without breaking supported scenarios.
@treenewlyn is asking for a way to escape the quotes in his command, and the existing way of doing that is with backslash, which makes sense.
But we currently support tokens that are quoted and have an ending backslash, one example is a windows path that ends in separator. if we introduce the escaping logic that @treenewlyn is asking for, those cases would consider the "ending separator + quote" as a quote and fail. https://github.com/dotnet/command-line-api/blob/e965ae227624f1b1e53a732a7835641452a68c69/src/System.CommandLine.Tests/Parsing/CommandLineStringSplitterTests.cs#L29-L37
You can't have both, and since we already have tests showing that windows paths with ending separator are used, I don't know what's the right approach to this. Either do nothing, or take the breaking change.
I already tried a couple of times to reconcile both scenarios, but failed.
For reference, this is what CMD does with dotnet publish \"xxx.csproj\" -c Release -o \"./bin/latest\" -r linux-x64 --self-contained false
:
You can see it works as OP is asking for.
C:\my-app>dotnet run "dotnet publish \"xxx.csproj\" -c Release -o \"./bin/latest\" -r linux-x64 --self-contained false"
foreach (string arg in args): dotnet publish "xxx.csproj" -c Release -o "./bin/latest" -r linux-x64 --self-contained false
Environment.CommandLine: C:\my-app\bin\Debug\net7.0\consoleapp.dll "dotnet publish \"xxx.csproj\" -c Release -o \"./bin/latest\" -r linux-x64 --self-contained false"
Environment.GetCommandLineArgs(2): [C:\my-app\bin\Debug\net7.0\consoleapp.dll,dotnet publish "xxx.csproj" -c Release -o "./bin/latest" -r linux-x64 --self-contained false]
And this with rm -r "c:\temp files\"
, if you don't escape the quotes:
It concats "rm -r "c:\temp
and then splits files\"
.
C:\my-app>dotnet run "rm -r "c:\temp files\""
foreach (string arg in args): rm -r c:\temp,files"
Environment.CommandLine: C:\my-app\bin\Debug\net7.0\consoleapp.dll "rm -r c:\temp" "files\""
Environment.GetCommandLineArgs(3): [C:\my-app\bin\Debug\net7.0\consoleapp.dll,rm -r c:\temp,files"]
And if I escape the quotes rm -r \"c:\temp files\\"
:
Is surprisingly deleting the ending \
, not even using it as escape character for "
.
C:\my-app>dotnet run "rm -r \"c:\temp files\""
foreach (string arg in args): rm -r "c:\temp files"
Environment.CommandLine: C:\my-app\bin\Debug\net7.0\consoleapp.dll "rm -r \"c:\temp files\""
Environment.GetCommandLineArgs(2): [C:\my-app\bin\Debug\net7.0\consoleapp.dll,rm -r "c:\temp files"]
Since -r is meaningful for dotnet cli, I also tried the following dotnet run "c:\temp files\"
:
In this one you can see that the quote is preserved due to the ending \
acting as a escape character, which is inconsistent with the previous example.
C:\my-app>dotnet run "c:\temp files\"
foreach (string arg in args): c:\temp files"
Environment.CommandLine: C:\my-app\bin\Debug\net7.0\consoleapp.dll "c:\temp files\""
Environment.GetCommandLineArgs(2): [C:\my-app\bin\Debug\net7.0\consoleapp.dll,c:\temp files"]
If CMD uses CommandLineToArgvW
under the covers, Then we are already doing something different to that function, so I wonder if we should keep striving to behave like it or not?
my-app
's code looks as the following:
Console.Write("foreach (string arg in args): ");
for (int i = 0; i < args.Length; i++)
{
Console.Write(args[i]);
if (i < args.Length - 1)
Console.Write(",");
}
Console.WriteLine();
Console.WriteLine();
Console.Write("Environment.CommandLine: ");
Console.Write(Environment.CommandLine);
Console.WriteLine();
Console.WriteLine();
var cmdLineArgs = Environment.GetCommandLineArgs();
Console.Write($"Environment.GetCommandLineArgs({cmdLineArgs.Length}): ");
Console.Write('[' + string.Join(',', cmdLineArgs) + "]");
Console.WriteLine();
If CMD uses CommandLineToArgvW under the covers
.NET Runtime can call CommandLineToArgvW here https://github.com/dotnet/runtime/blob/aaf9c8a7702920c4999e0535939e981038bbf42c/src/libraries/System.Private.CoreLib/src/System/Environment.Windows.cs#L220
CreateProcess in Windows takes a command-line string and the new process decides whether and how to split it to arguments. IIRC, some C library passes a list of argument strings via the cbReserved2 and lpReserved2, but I don't know whether .NET Runtime reads those at all.
I feel the completion function in the shell should parse quotation marks etc. as needed, and pass to dotnet-suggest a list of strings, the position of the string that contains the insertion point, and the position of the insertion point within that string. It should then read the resulting completions and add any shell-dependent backslashes or quotation marks. That way, dotnet-suggest and the completion sources would not need to support the syntaxes of individual shells.
OTOH, if the shell scripting makes this too slow, then it can be reimplemented in dotnet-suggest -- still not in the completion sources. And this should be an optional optimisation; the argv array mechanism should still be available for shells that dotnet-suggest does not explicitly support.
Pushing these responsibilities to the shell is ideal, but it has a longer timeline. Implementing per-shell solutions should be more performant (and shouldn't even require dotnet-suggest
or any other intermediate .NET-based process), but it's a lot more work and opens up a large test matrix. A design where System.CommandLine can handle this parsing will remain a useful fallback as support for specific shells comes in.
Other scenarios this method also addresses:
These are a few examples of places where a command line parser is useful outside of the Main
method in a CLI app.
The
CommandLineStringSplitter
provides a way to split astring
into astring[]
and is intended to conform to the way that strings are split on the command line.It has two primary use cases:
Parse(string[] args)
orInvoke(string[] args)
can be error prone because it requires people to make assumptions about how a command line will actually be split when your app is started, and the actual rules are not always intuitive.CommandLineStringSplitter
is critical for completions, since the only way to capture the text on the command line from within the launched .NET application is to pass it in as a complete string from the shim shell script and then split it into the equivalentstring[] args
array thatMain
receives.Various shells will exhibit different behaviors here due to different quote escaping rules, so the user's experience of this splitting logic might not match up with a single implementation of
CommandLineStringSplitter
. Because of this, and to eventually provide higher fidelity for the completions experience on different shells in the presence of escapes on the command line, it might be useful to provide multiple implementations over time and we should consider anticipating that need in the API design now.Resources
Related
For the purposes of illustration in examples below, I'll be using the following program to confirm behaviors that differ among various shells: